Genetics. 2013 Jul;194(3):597-607.
doi: 10.1534/genetics.113.152207. Epub 2013 May 2.

Genomic BLUP decoded: a look into the black box of genomic prediction

David Habier et al. Genetics. 2013 Jul.

Abstract

Genomic best linear unbiased prediction (BLUP) is a statistical method that uses relationships between individuals calculated from single-nucleotide polymorphisms (SNPs) to capture relationships at quantitative trait loci (QTL). We show that genomic BLUP exploits not only linkage disequilibrium (LD) and additive-genetic relationships, but also cosegregation to capture relationships at QTL. Simulations were used to study the contributions of those types of information to accuracy of genomic estimated breeding values (GEBVs), their persistence over generations without retraining, and their effect on the correlation of GEBVs within families. We show that accuracy of GEBVs based on additive-genetic relationships can decline with increasing training data size and speculate that modeling polygenic effects via pedigree relationships jointly with genomic breeding values using Bayesian methods may prevent that decline. Cosegregation information from half sibs contributes little to accuracy of GEBVs in current dairy cattle breeding schemes but from full sibs it contributes considerably to accuracy within family in corn breeding. Cosegregation information also declines with increasing training data size, and its persistence over generations is lower than that of LD, suggesting the need to model LD and cosegregation explicitly. The correlation between GEBVs within families depends largely on additive-genetic relationship information, which is determined by the effective number of SNPs and training data size. As genomic BLUP cannot capture short-range LD information well, we recommend Bayesian methods with t-distributed priors.
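As an illustration of the first sentence of the abstract, the following minimal sketch shows how relationships computed from SNP genotypes can be turned into GEBVs. It is not the authors' implementation. It assumes a genomic relationship matrix of the commonly used centered-and-scaled form, a known heritability `h2`, and a simple training-set mean as the only fixed effect. The function names `genomic_relationship_matrix` and `gblup` are hypothetical.

```python
import numpy as np


def genomic_relationship_matrix(genotypes):
    """Build a genomic relationship matrix from SNP allele counts (0/1/2).

    Assumption: centered cross-products scaled by 2 * sum(p * (1 - p)),
    with allele frequencies p estimated from the sample itself.
    """
    M = np.asarray(genotypes, dtype=float)   # shape: (individuals, SNPs)
    p = M.mean(axis=0) / 2.0                 # sample allele frequencies
    Z = M - 2.0 * p                          # center each SNP column
    scale = 2.0 * np.sum(p * (1.0 - p))
    return Z @ Z.T / scale


def gblup(y, G, h2):
    """Predict GEBVs for every individual in G.

    y holds phenotypes; NaN marks individuals without records (validation).
    Assumption: the overall mean is estimated by the plain training average
    rather than by generalized least squares.
    """
    y = np.asarray(y, dtype=float)
    train = ~np.isnan(y)
    lam = (1.0 - h2) / h2                    # residual-to-genetic variance ratio
    G_tt = G[np.ix_(train, train)]
    mu = y[train].mean()
    rhs = y[train] - mu
    alpha = np.linalg.solve(G_tt + lam * np.eye(int(train.sum())), rhs)
    # Relationships between all individuals and the training set carry
    # the information from phenotyped to un-phenotyped individuals.
    return G[:, train] @ alpha


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    snps = rng.integers(0, 3, size=(8, 200))   # simulated genotypes, illustration only
    G = genomic_relationship_matrix(snps)
    phenotypes = np.array([1.2, 0.4, 0.9, 1.8, 0.1, 1.1, np.nan, np.nan])
    print(gblup(phenotypes, G, h2=0.5))
```

In this sketch the last two individuals have no phenotype, so their GEBVs come entirely from their SNP-based relationships to the training individuals, which is the mechanism the abstract decomposes into LD, cosegregation, and additive-genetic relationship information.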

Keywords: GenPred; Shared data resources; additive-genetic relationships; cosegregation; genomic best linear unbiased prediction (BLUP); genomic selection; linkage disequilibrium (LD).

Figures

Figure 1
Top-cross design showing one of the families used in cross-validations with training hybrids descending from the first generation of doubled haploids and validation hybrids descending from the next generation of doubled haploids. Both training and validation hybrids come from the same inbred tester, ITester.
Figure 2
Linkage disequilibrium between QTL and SNPs, measured as r², against map distance in centimorgans for the two scenarios, long-range and short-range LD, using formulas by Ohta and Kimura (1969) with an effective population size of 1500.
Figure 3
Accuracy of GEBVs obtained by genomic BLUP, long-range LD, and 47,831 SNPs for the four information designs and accuracy of pedigree index using 1001 training individuals structured into 143 half-sib families and a heritability of 0.5. Each validation individual had 7 half sibs in training in all designs but LD only. The number of replicates was 300.
Figure 4
(A and B) Accuracy of GEBVs and standard errors obtained by genomic BLUP and 47,831 SNPs for the four information designs according to training data size and extent of LD and accuracy of pedigree index using a heritability of 0.5. Training data were structured into half-sib families of size seven, and each validation individual had seven half sibs in training in all designs but LD only. The numbers of replicates for training data sizes 98, 1001, and 1995 were 750, 300, and 75, respectively.
Figure 5
Intraclass correlations and standard errors for GEBVs within half-sib families obtained by genomic BLUP, long-range LD, and 47,831 SNPs according to training data size and information design. Training data were structured into half-sib families of size seven, and each validation individual had seven half sibs in training in all designs but LD only. The numbers of replicates for training data sizes 98, 1001, and 1995 were 750, 300, and 75, respectively.
Figure 6
(A and B) Accuracy of GEBVs and standard errors within and across families obtained by genomic BLUP, long-range LD, and 39,991 SNPs for the four information designs using 450 and 1800 training hybrids structured into families of size 30. Each validation hybrid had 30 related hybrids in training in all designs but LD only. The numbers of replicates for training data sizes 450 and 1800 were 1000 and 200, respectively.
Figure 7
Accuracy of GEBVs and standard errors within families obtained by genomic BLUP, long-range LD, and 39,991 SNPs for validation hybrids of the same (current) and following (next) generation and for the designs LD only, RS + CS, and RS + CS + LD, using 450 training hybrids. Each validation hybrid had 30 related hybrids in training in all designs but LD only. The number of replicates was 1000.
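The figure captions report two summary statistics: accuracy of GEBVs and the intraclass correlation of GEBVs within families. The sketch below shows one conventional way to compute each. It assumes that accuracy means the Pearson correlation between GEBVs and true (simulated) breeding values, and that the intraclass correlation is estimated by one-way analysis of variance with equal family sizes. The captions do not state which estimators were used, so both choices and the function names are assumptions.

```python
import numpy as np


def accuracy(gebv, true_bv):
    """Pearson correlation between predicted and true breeding values."""
    return float(np.corrcoef(gebv, true_bv)[0, 1])


def intraclass_correlation(values, family):
    """One-way ANOVA intraclass correlation for equally sized families.

    Assumption: every family has the same number of members (for example,
    seven half sibs); unbalanced data would need a different estimator.
    """
    values = np.asarray(values, dtype=float)
    family = np.asarray(family)
    labels = np.unique(family)
    k = len(labels)
    sizes = np.array([np.sum(family == f) for f in labels])
    n = int(sizes[0])
    if not np.all(sizes == n):
        raise ValueError("families must all have the same size")
    means = np.array([values[family == f].mean() for f in labels])
    grand_mean = values.mean()
    # Mean squares between and within families
    ms_between = n * np.sum((means - grand_mean) ** 2) / (k - 1)
    ss_within = sum(
        np.sum((values[family == f] - m) ** 2) for f, m in zip(labels, means)
    )
    ms_within = ss_within / (k * (n - 1))
    return float((ms_between - ms_within) / (ms_between + (n - 1) * ms_within))
```

A value near 1 from `intraclass_correlation` means GEBVs of family members are nearly identical, so the predictions separate families but not individuals within a family.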

