Inclusion of historical training data improved the genomics-based prediction of maize hybrid performance, with the extent of improvement depending on the phenotypic trait and its genotype-by-year interaction.



Predicting hybrid performance from existing phenotypic data on previous hybrids, combined with molecular data collected on the parent lines, makes it possible to identify the most promising candidates among a huge number of possible hybrids at an early stage. Phenotypic data on yield and dry matter of 1970 grain maize hybrids from 19 years of a public breeding program were aggregated, taking into account the underlying structure of factorial sets of hybrids. Pedigree records and 50K SNP data were collected on their 170 Dent and 127 Flint parent lines. The performance of untested hybrids was predicted by best linear unbiased predictors (BLUP) on the basis of pedigree or genomic data. To compose training sets (TRN) and test sets (TST), three schemes for collecting factorials from specific years were employed, which resulted in 490 scenarios. For each scenario, the predictive ability and the genomic relationship between TRN and TST hybrids were determined. For extended TRNs, where earlier years were successively added to the TRN, the maximum relationship increased and the predictive ability improved, with the extent of the latter depending on the phenotypic trait and its genotype-by-year interaction. Genomic BLUP outperformed pedigree BLUP and made better use of the early years’ data, especially for predicting hybrids from factorials in a more distant future. This study on hybrid prediction in grain maize illustrates that including historical phenotypic data for training, even though they come from less related genotypes, can improve genomic prediction and enables optimization of hybrid variety development.
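The idea of an extended TRN, where earlier years are successively added to the training set, can be illustrated with a minimal Python sketch. It is not the scheme used in the study: the function name `extended_training_sets`, the example years, and the rule of growing the TRN backwards one year at a time are assumptions made only for illustration.

```python
from typing import List, Tuple


def extended_training_sets(years: List[int], test_year: int) -> List[Tuple[List[int], int]]:
    """Return (training_years, test_year) pairs with a growing training set.

    Hypothetical helper: the first scenario trains on the single year
    immediately preceding the test year, and each further scenario adds
    one more earlier year.
    """
    earlier = sorted(y for y in years if y < test_year)
    scenarios = []
    for k in range(1, len(earlier) + 1):
        # Take the k most recent years before the test year.
        scenarios.append((earlier[-k:], test_year))
    return scenarios


# Example years are placeholders, not the years of the breeding program.
scenarios = extended_training_sets(list(range(2001, 2006)), test_year=2005)
for trn_years, tst_year in scenarios:
    print(trn_years, "->", tst_year)
```

Each resulting pair would define one scenario in which a model is trained on hybrids from the TRN years and evaluated on hybrids from the TST year.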