PLoS One. 2017 Jan 12;12(1):e0169606.
doi: 10.1371/journal.pone.0169606. eCollection 2017.

Optimizing Training Population Size and Genotyping Strategy for Genomic Prediction Using Association Study Results and Pedigree Information. A Case of Study in Advanced Wheat Breeding Lines

Fabio Cericola et al. PLoS One. 2017.

Abstract

Wheat breeding programs generate a large amount of variation that cannot be fully explored because phenotyping throughput is limited. Genomic prediction (GP) has been proposed as a tool that estimates breeding values without phenotyping all the material produced; only a subset, called the training population (TP), is phenotyped. However, all the accessions under analysis must still be genotyped, so optimizing TP size and genotyping strategy is pivotal to implementing GP in commercial breeding schemes. Here, we explored the optimum TP size and integrated pedigree records and genome-wide association study (GWAS) results to optimize the genotyping strategy. A total of 988 advanced wheat breeding lines were genotyped with the Illumina 15K SNP wheat chip and phenotyped across several years and locations for yield, lodging, and starch content. Cross-validation using the largest possible TP size and all the SNPs available after editing (~11K) yielded predictive abilities (rGP) ranging from 0.5 to 0.6. To explore TP size, rGP was computed using progressively smaller TPs. These analyses showed that a TP of around 700 lines was enough to yield the highest observed rGP. rGP was also calculated after randomly reducing the number of SNPs, which showed that around 1K markers were enough to reach the highest observed rGP. GWAS was used to identify markers associated with the traits analyzed. GWAS-based selection of SNPs increased rGP compared with random selection, and a few hundred SNPs were sufficient to obtain the highest observed rGP. For each of these scenarios, adding pedigree information was shown to be advantageous. Our results indicate that moderate TP sizes were enough to yield high rGP and that pedigree information and GWAS results can be used to greatly optimize the genotyping strategy.
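The TP-size experiment described in the abstract can be illustrated with a small sketch. Everything here is an assumption made for illustration: the abstract does not say how the genomic relationship matrix (GRM) was built or how the models were fitted, so the sketch uses a VanRaden-style GRM and a kernel-ridge form of GBLUP on simulated genotypes and phenotypes. The function names, the shrinkage parameter `lam`, and the simulated data are all hypothetical and do not reproduce the study's data or results.

```python
import numpy as np


def vanraden_grm(M):
    """GRM from a (lines x SNPs) genotype matrix coded 0/1/2.

    Assumption: VanRaden-style scaling; the abstract does not name a method.
    """
    p = M.mean(axis=0) / 2.0             # allele frequency per SNP
    Z = M - 2.0 * p                      # center each SNP column
    denom = 2.0 * np.sum(p * (1.0 - p))  # scales the average diagonal to about 1
    return Z @ Z.T / denom


def gblup_predict(G, y, train, test, lam=1.0):
    """Predict test lines from training lines using the GRM.

    lam plays the role of a residual-to-genetic variance ratio; it is fixed
    here for simplicity rather than estimated (an assumption of this sketch).
    """
    mu = y[train].mean()
    G_tt = G[np.ix_(train, train)]
    alpha = np.linalg.solve(G_tt + lam * np.eye(len(train)), y[train] - mu)
    return mu + G[np.ix_(test, train)] @ alpha


def predictive_ability(G, y, train, test, lam=1.0):
    """Correlation between predicted and observed phenotypes of the test lines."""
    pred = gblup_predict(G, y, train, test, lam)
    return np.corrcoef(pred, y[test])[0, 1]


# Simulated data (hypothetical, not the wheat panel from the study).
rng = np.random.default_rng(0)
n_lines, n_snps = 300, 500
freqs = rng.uniform(0.1, 0.9, n_snps)
M = rng.binomial(2, freqs, size=(n_lines, n_snps)).astype(float)
effects = rng.normal(0.0, 1.0, n_snps)
g = (M - 2.0 * freqs) @ effects
y = g + rng.normal(0.0, g.std(), n_lines)  # noise variance close to genetic variance

G = vanraden_grm(M)

# Hold out a fixed test set, then shrink the training population.
perm = rng.permutation(n_lines)
test, pool = perm[:60], perm[60:]
for tp_size in (60, 120, 240):
    train = pool[:tp_size]
    print(tp_size, round(predictive_ability(G, y, train, test), 3))
```

The same loop could be repeated with a GRM computed from a random or GWAS-ranked subset of columns of `M` to mimic the marker-reduction scenarios. In practice the variance ratio would be estimated from the data and the split repeated many times rather than fixed once.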

Conflict of interest statement

The authors have declared that no competing interests exist. This research was partly funded by the commercial partner Nordic seed A/S. This does not alter our adherence to PLOS ONE policies on sharing data and materials.

Figures

Fig 1. Manhattan plots.
GWAS results for the three traits under analysis are displayed: a) Yield; b) Lodging; c) Starch content.
Fig 2. Prediction accuracy as a function of the training set size.
Results are displayed for the three traits under analysis: a) Yield; b) Lodging; c) Starch content. Three models were considered: A+I (blue), G+I (green), and A+G+I (red). The color of each dot shows whether that rGP was significantly lower than the highest rGP observed with the same model.
Fig 3. Accuracy of rGP as a function of the number of randomly selected SNPs used to compute the GRM.
Results for the three traits under analysis are reported: a) Yield; b) Lodging; c) Starch content. Accuracies obtained with the A+I, G+I, and A+G+I models are represented in blue, green, and red, respectively. The color of each dot shows whether that rGP was significantly lower than the highest rGP observed with the same model.
Fig 4. Accuracy of rGP as a function of the number of GWAS-based selected SNPs used to compute the GRM.
Results for the three traits under analysis are reported: a) Yield; b) Lodging; c) Starch content. Accuracies obtained with the A+I, G+I, and A+G+I models are represented in blue, green, and red, respectively. The color of each dot shows whether that rGP was significantly lower than the highest rGP observed with the same model.
Fig 5. Gain in rGP obtained by using GWAS-based marker selection instead of random selection.
Results for the three traits under analysis are reported: a) Yield; b) Lodging; c) Starch content. Stars indicate the significance of the improvement.
