G3 (Bethesda). 2017 Nov 6;7(11):3571-3586.
doi: 10.1534/g3.117.300076.

Genomic Prediction Within and Across Biparental Families: Means and Variances of Prediction Accuracy and Usefulness of Deterministic Equations


Pascal Schopp et al. G3 (Bethesda).

Abstract

A major application of genomic prediction (GP) in plant breeding is the identification of superior inbred lines within families derived from biparental crosses. When models for various traits were trained within related or unrelated biparental families (BPFs), experimental studies found substantial variation in prediction accuracy (PA), but little is known about the underlying factors. We used SNP marker genotypes of inbred lines from either elite germplasm or landraces of maize (Zea mays L.) as parents to generate 300 BPFs of doubled-haploid lines in silico. We analyzed PA within each BPF for 50 simulated polygenic traits, using genomic best linear unbiased prediction (GBLUP) models trained with individuals from either full-sib (FSF), half-sib (HSF), or unrelated families (URF) for various sizes (Ntrain) of the training set and different heritabilities (h2). In addition, we modified two deterministic equations for forecasting PA to account for inbreeding and genetic variance unexplained by the training set. Averaged across traits, PA was high within FSF (0.41-0.97) with large variation only for [Formula: see text] and [Formula: see text] [Formula: see text]. For HSF and URF, PA was on average about 40-60% lower and varied substantially among different combinations of BPFs used for model training and prediction as well as among different traits. As exemplified by HSF results, PA of across-family GP can be very low if causal variants not segregating in the training set account for a sizeable proportion of the genetic variance among predicted individuals. Deterministic equations accurately forecast the PA expected over many traits, yet cannot capture trait-specific deviations. We conclude that model training within BPFs generally yields stable PA, whereas a high level of uncertainty is encountered in across-family GP. Our study shows the minimum extent of variation in PA that must be reckoned with in practice and offers a starting point for the design of training sets composed of multiple BPFs.
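
As a minimal sketch of the within-family (FSF) scenario described above, the following Python code simulates one family of doubled-haploid lines, fits GBLUP, and computes PA as the correlation between true and predicted breeding values in the prediction set. It is not the authors' pipeline: the function names, the unlinked loci, the use of every marker as a QTL, and the treatment of the heritability as known are simplifying assumptions made here for illustration.

```python
import numpy as np


def genomic_relationship(markers):
    """Genomic relationship matrix G from a (lines x loci) genotype matrix.

    Assumption: G is scaled to a mean diagonal of 1, so that the shrinkage
    parameter (1 - h2) / h2 matches a line-level heritability.
    """
    z = markers - markers.mean(axis=0)        # centre each locus
    g = z @ z.T
    return g / np.mean(np.diag(g))


def gblup(g, y_train, train_idx, pred_idx, h2):
    """Predict breeding values (GEBVs) of the prediction lines.

    Assumption: h2 is known rather than estimated (e.g. by REML).
    """
    lam = (1.0 - h2) / h2                     # residual-to-genetic variance ratio
    g_tt = g[np.ix_(train_idx, train_idx)]    # training x training block
    g_pt = g[np.ix_(pred_idx, train_idx)]     # prediction x training block
    lhs = g_tt + lam * np.eye(len(train_idx))
    return g_pt @ np.linalg.solve(lhs, y_train - y_train.mean())


def simulate_within_family(n_train=100, n_pred=100, n_loci=100, h2=0.6, seed=1):
    """Return the prediction accuracy for one simulated trait in one family."""
    rng = np.random.default_rng(seed)
    n = n_train + n_pred
    # DH lines are fully homozygous: genotypes coded 0 or 2 at each locus.
    # Hypothetical simplification: loci are unlinked and all are QTL.
    markers = 2.0 * rng.integers(0, 2, size=(n, n_loci))
    effects = rng.normal(size=n_loci)
    tbv = markers @ effects                   # true breeding values
    var_e = tbv.var() * (1.0 - h2) / h2       # residual variance implied by h2
    y = tbv + rng.normal(scale=np.sqrt(var_e), size=n)

    train_idx = np.arange(n_train)
    pred_idx = np.arange(n_train, n)
    g = genomic_relationship(markers)
    gebv = gblup(g, y[train_idx], train_idx, pred_idx, h2)
    # PA: correlation between true and predicted breeding values.
    return np.corrcoef(tbv[pred_idx], gebv)[0, 1]


if __name__ == "__main__":
    print(simulate_within_family())
```

Calling `simulate_within_family()` with different `seed`, `n_train`, or `h2` values gives a rough feel for how PA varies among simulated traits within one family.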

Keywords: GBLUP; GenPred; Genomic Selection; Shared Data Resources; biparental families; deterministic accuracy; genomic prediction; linkage disequilibrium; plant breeding.


Figures

Figure 1
(A) Boxplots of empirical prediction accuracies ρABT in BPFs of DH lines, and (B) variance components of different factors influencing the variation of ρABRT. Parents of BPFs were sampled from ancestral population Elite, and SNP markers were used to calculate the genomic relationship matrix G. Results are shown for different pedigree relationships (FSF, HSF, and URF) between the predicted family (BPFpred) A and training family (BPFtrain) B, as well as for different sample sizes Ntrain and heritabilities h2.
Figure 2
(A) Boxplots of empirical prediction accuracies ρABT in BPFs of DH lines and (B) variance components of different factors influencing the variation of ρABRT. Parents of BPFs were sampled from ancestral population Elite (left) or Landrace (right), and either genotypes at SNP markers or at QTL were used to calculate the genomic relationship matrix G. Results are shown for different pedigree relationships (FSF, HSF, and URF) between the predicted family (BPFpred) A and training family (BPFtrain) B and refer to Ntrain=100 and h2=0.6.
Figure 3
Empirical prediction accuracy ρ in BPFs of DH lines plotted against deterministic prediction accuracies ρW and ρD. The top two graphs refer to observations for single traits (ρAT for FSF and ρABT otherwise), and the bottom row to means over traits (ρA¯ for FSF and ρAB¯ otherwise). Parents of BPFs were sampled from ancestral population Elite and genotypes at SNP markers were used to calculate the genomic relationship matrix G. Results are shown for a random sample of 10,000 data points, Ntrain=100 and h2=0.6.
Figure 4
(A) Chromosome segment substitution effects (CSSEA,W in red and CSSEB,W in blue) and correlation between local TBVs and local GEBVs in the predicted family A (green) averaged in sliding windows W (see Materials and Methods for definition). GEBVs were calculated from QTL effects estimated by RR-BLUP in training set (HSF) B. Results are shown for Ntrain=250 and two traits T1 and T2 with h2=1 and large differences in prediction accuracy ρ. Both traits were generated from the same set of 1000 QTL with θAB0.40, but different QTL effects. (B) Correlation between local TBVs and local GEBVs (green lines) shown together with true QTL effects (diamonds) and estimated QTL effects (circles) for T1 and T2 in B on chromosome 5. Colors indicate QTL segregating in both A and B (orange) or only in A (purple); grey bars in the background reflect the windows W.
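
The modified deterministic equations behind ρW and ρD in Figure 3 are not reproduced in this record. For orientation only, the sketch below evaluates the widely used unmodified form attributed to Daetwyler et al., ρ = sqrt(N h2 / (N h2 + Me)), where Me is the number of independently segregating chromosome segments. Whether ρD corresponds to this form is an assumption, and the function name, the grid of training set sizes, and the Me value are hypothetical.

```python
import math


def deterministic_accuracy(n_train, h2, m_e):
    """Forecast prediction accuracy as sqrt(N*h2 / (N*h2 + Me)).

    Unmodified textbook form; it does not include the adjustments for
    inbreeding or unexplained genetic variance described in the abstract.
    """
    if n_train <= 0 or not 0.0 < h2 <= 1.0 or m_e <= 0:
        raise ValueError("need n_train > 0, 0 < h2 <= 1 and m_e > 0")
    signal = n_train * h2
    return math.sqrt(signal / (signal + m_e))


if __name__ == "__main__":
    # m_e = 40.0 is a hypothetical value chosen only for illustration.
    for n_train in (25, 50, 100, 250):
        print(n_train, round(deterministic_accuracy(n_train, h2=0.6, m_e=40.0), 3))
```

The forecast rises with the training set size and the heritability and falls as Me grows, which matches the qualitative pattern reported for the empirical accuracies.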

