Graph-based data selection for the construction of genomic prediction models
- PMID: 20479144
- PMCID: PMC2927770
- DOI: 10.1534/genetics.110.116426
Abstract
Efficient genomic selection in animals or crops requires the accurate prediction of the agronomic performance of individuals from their high-density molecular marker profiles. Using a training data set that contains the genotypic and phenotypic information of a large number of individuals, each marker or marker allele is associated with an estimated effect on the trait under study. These estimated marker effects are subsequently used for making predictions on individuals for which no phenotypic records are available. As most plant and animal breeding programs are currently still phenotype driven, the continuously expanding collection of phenotypic records can only be used to construct a genomic prediction model if a dense molecular marker fingerprint is available for each phenotyped individual. However, as the genotyping budget is generally limited, the genomic prediction model can only be constructed using a subset of the tested individuals and possibly a genome-covering subset of the molecular markers. In this article, we demonstrate how an optimal selection of individuals can be made with respect to the quality of their available phenotypic data. We also demonstrate how the total number of molecular markers can be reduced while a maximum genome coverage is ensured. The third selection problem we tackle is specific to the construction of a genomic prediction model for a hybrid breeding program where only molecular marker fingerprints of the homozygous parents are available. We show how to identify the set of parental inbred lines of a predefined size that has produced the highest number of progeny. These three selection approaches are put into practice in a simulation study where we demonstrate how the trade-off between sample size and sample quality affects the prediction accuracy of genomic prediction models for hybrid maize.
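The third selection problem can be pictured as a graph in which each parental inbred line is a vertex and each tested hybrid is an edge between its two parents. Choosing a fixed number of parents that together cover as many hybrids as possible then amounts to finding a dense subgraph of a given size. The sketch below illustrates this with a simple greedy peeling heuristic, which repeatedly drops the parent with the fewest remaining hybrids. It is an illustration under stated assumptions, not necessarily the procedure used in the article: the function name `greedy_dense_parent_subset`, the input format, and the toy data are hypothetical, and the heuristic is not guaranteed to find the optimal set.

```python
from collections import defaultdict


def greedy_dense_parent_subset(hybrids, k):
    """Greedy peeling heuristic (illustrative, not guaranteed optimal).

    hybrids: iterable of (parent_a, parent_b) pairs, one per tested hybrid.
    k: number of parental inbred lines to keep.
    Returns (selected_parents, number_of_hybrids_among_them).
    """
    # Build an undirected graph: parents are vertices, hybrids are edges.
    neighbours = defaultdict(set)
    for a, b in hybrids:
        if a != b:  # ignore selfed entries, which are not hybrids
            neighbours[a].add(b)
            neighbours[b].add(a)

    remaining = set(neighbours)
    while len(remaining) > k:
        # Drop the parent with the fewest hybrids among the remaining parents;
        # ties are broken by name so the result is deterministic.
        worst = min(remaining, key=lambda v: (len(neighbours[v]), str(v)))
        for u in neighbours[worst]:
            neighbours[u].discard(worst)
        del neighbours[worst]
        remaining.remove(worst)

    # Each surviving hybrid is counted once from each of its two parents.
    covered = sum(len(neighbours[v]) for v in remaining) // 2
    return remaining, covered


# Toy data (hypothetical): five inbred lines and five tested hybrids.
toy_hybrids = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]
selected, covered = greedy_dense_parent_subset(toy_hybrids, k=3)
print(sorted(selected), covered)
```

With a genotyping budget of three parents, the heuristic first drops E and then D, leaving the three lines A, B and C, whose three pairwise hybrids all remain usable for model training.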
Similar articles
- Phenotypic Data from Inbred Parents Can Improve Genomic Prediction in Pearl Millet Hybrids. G3 (Bethesda). 2018 Jul 2;8(7):2513-2522. doi: 10.1534/g3.118.200242. PMID: 29794163. Free PMC article.
- Genome properties and prospects of genomic prediction of hybrid performance in a breeding program of maize. Genetics. 2014 Aug;197(4):1343-55. doi: 10.1534/genetics.114.165860. Epub 2014 May 21. PMID: 24850820. Free PMC article.
- Resource allocation for maximizing prediction accuracy and genetic gain of genomic selection in plant breeding: a simulation experiment. G3 (Bethesda). 2013 Mar;3(3):481-91. doi: 10.1534/g3.112.004911. Epub 2013 Mar 1. PMID: 23450123. Free PMC article.
- Genomic Selection in Plant Breeding: Methods, Models, and Perspectives. Trends Plant Sci. 2017 Nov;22(11):961-975. doi: 10.1016/j.tplants.2017.08.011. Epub 2017 Sep 28. PMID: 28965742. Review.
- Whole-genome regression and prediction methods applied to plant and animal breeding. Genetics. 2013 Feb;193(2):327-45. doi: 10.1534/genetics.112.143313. Epub 2012 Jun 28. PMID: 22745228. Free PMC article. Review.
Cited by
- Large-scale sequestration of atmospheric carbon via plant roots in natural and agricultural ecosystems: why and how. Philos Trans R Soc Lond B Biol Sci. 2012 Jun 5;367(1595):1589-97. doi: 10.1098/rstb.2011.0244. PMID: 22527402. Free PMC article. Review.
- Across-years prediction of hybrid performance in maize using genomics. Theor Appl Genet. 2019 Apr;132(4):933-946. doi: 10.1007/s00122-018-3249-5. Epub 2018 Nov 29. PMID: 30498894.
- Training set optimization under population structure in genomic selection. Theor Appl Genet. 2015 Jan;128(1):145-58. doi: 10.1007/s00122-014-2418-4. Epub 2014 Nov 1. PMID: 25367380. Free PMC article.
- Maximizing the reliability of genomic selection by optimizing the calibration set of reference individuals: comparison of methods in two diverse groups of maize inbreds (Zea mays L.). Genetics. 2012 Oct;192(2):715-28. doi: 10.1534/genetics.112.141473. Epub 2012 Aug 3. PMID: 22865733. Free PMC article.
- Beyond Genomic Prediction: Combining Different Types of omics Data Can Improve Prediction of Hybrid Performance in Maize. Genetics. 2018 Apr;208(4):1373-1385. doi: 10.1534/genetics.117.300374. Epub 2018 Jan 23. PMID: 29363551. Free PMC article.