Genet Sel Evol. 2015 May 9;47(1):43.
doi: 10.1186/s12711-015-0117-5.

Sequence- vs. chip-assisted genomic selection: accurate biological information is advised

Miguel Pérez-Enciso et al. Genet Sel Evol.

Abstract

Background: The development of next-generation sequencing (NGS) technologies has made it possible to use whole-genome sequence data for routine genetic evaluations, which has triggered considerable interest in the animal and plant breeding fields. Here, we investigated by simulation, using a mixed coalescence and gene-dropping approach, whether complete or partial sequence data can improve upon existing SNP (single nucleotide polymorphism) array-based selection strategies.

Results: We simulated 20 or 100 causal mutations (quantitative trait nucleotides, QTN) within 65 predefined 'gene' regions, each 10 kb long, within a genome composed of ten 3-Mb chromosomes. We compared prediction accuracy by cross-validation using a medium-density chip (7.5 k SNPs), a high-density chip (HD, 17 k SNPs) and sequence data (335 k SNPs). Genetic evaluation was based on a GBLUP method. The simulations showed: (1) a law of diminishing returns with increasing number of SNPs; (2) a modest effect of SNP ascertainment bias in arrays; (3) a small advantage of using whole-genome sequence data vs. HD arrays, i.e. ~4%; (4) a minor effect of NGS errors except when imputation error rates are high (≥20%); and (5) if QTN were known, prediction accuracy approached 1. Since this is obviously unrealistic, we explored milder assumptions. We showed that, if all SNPs within causal genes were included in the prediction model, accuracy could also increase dramatically, by ~40%. However, this criterion was highly sensitive either to misspecification (including wrong genes) or to the use of an incomplete gene list; in these cases, accuracy fell rapidly towards that reached when all SNPs from sequence data were blindly included in the model.

Conclusions: Our study shows that, unless an accurate prior estimate on the functionality of SNPs can be included in the predictor, there is a law of diminishing returns with increasing SNP density. As a result, use of whole-genome sequence data may not result in a highly increased selection response over high-density genotyping.
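
The GBLUP evaluation and cross-validated accuracy mentioned in the Results can be illustrated with a minimal sketch. This is not the authors' pipeline: the toy data use independent SNPs in unrelated individuals (no coalescence, no gene-dropping, no linkage disequilibrium), the genomic relationship matrix follows VanRaden's common formulation, and every function name, sample size and parameter value below is an assumption chosen for illustration only.

```python
import numpy as np


def vanraden_grm(genotypes):
    """Genomic relationship matrix from an (individuals x SNPs) 0/1/2 matrix."""
    p = genotypes.mean(axis=0) / 2.0           # observed allele frequencies
    keep = (p > 0.0) & (p < 1.0)               # drop monomorphic SNPs
    z = genotypes[:, keep] - 2.0 * p[keep]     # centre each SNP column
    return z @ z.T / (2.0 * np.sum(p[keep] * (1.0 - p[keep])))


def gblup_predict(grm, y, train, test, h2):
    """Predict genetic values of `test` individuals from `train` phenotypes."""
    lam = (1.0 - h2) / h2                      # residual / genetic variance ratio
    mu = y[train].mean()
    k = grm[np.ix_(train, train)] + lam * np.eye(len(train))
    alpha = np.linalg.solve(k, y[train] - mu)
    return mu + grm[np.ix_(test, train)] @ alpha


def cv_accuracy(grm, y, true_g, h2, n_folds=5, seed=0):
    """Mean correlation between predicted and true genetic values over folds."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    accuracies = []
    for fold in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, fold)
        pred = gblup_predict(grm, y, train, fold, h2)
        accuracies.append(np.corrcoef(pred, true_g[fold])[0, 1])
    return float(np.mean(accuracies))


# Toy data (all values hypothetical): 300 individuals, 2000 SNPs, 20 QTN.
rng = np.random.default_rng(1)
n_ind, n_snp, n_qtn, h2 = 300, 2000, 20, 0.5
freqs = rng.uniform(0.05, 0.95, size=n_snp)
genotypes = rng.binomial(2, freqs, size=(n_ind, n_snp)).astype(float)
qtn_idx = rng.choice(n_snp, size=n_qtn, replace=False)
true_g = genotypes[:, qtn_idx] @ rng.normal(size=n_qtn)   # additive effects only
noise_sd = np.sqrt(true_g.var() * (1.0 - h2) / h2)
y = true_g + rng.normal(scale=noise_sd, size=n_ind)

# Strategy 1: all SNPs blindly included. Strategy 2: only the (known) QTN.
acc_all = cv_accuracy(vanraden_grm(genotypes), y, true_g, h2)
acc_qtn = cv_accuracy(vanraden_grm(genotypes[:, qtn_idx]), y, true_g, h2)
print(f"accuracy, all SNPs: {acc_all:.2f}")
print(f"accuracy, QTN only: {acc_qtn:.2f}")
```

In this toy setting the gap between the two strategies comes only from diluting the causal signal among non-causal SNPs; it is not meant to reproduce the magnitudes reported in the abstract, which depend on the simulated pedigree and linkage disequilibrium.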

Figures

Figure 1
Distribution of additive and dominant variances and genetic effects. (A) Density of variances contributed by each QTN, both additive and dominant. (B) Density of genetic effects contributed by each QTN. Note the different scales in each graph.
Figure 2
Unfolded site frequency spectra. Site frequency spectra found for complete sequence (A) and for the high density array (B) when SNPs are ascertained in a panel of 50 individuals from the same population, setting a MAF > 0.15. Data shown correspond to data from the whole pedigree.
Figure 3
Distribution of accuracies across replicates with different strategies.
Figure 4
Use of biological prior information. Lines correspond to accuracy using all QTN in the model (green), all SNPs within the 65 genes (red), all SNPs in 50% of the genes (blue), or SNPs in 50% of the genes and 30 random windows (magenta); the black line shows accuracy when all SNPs are included in the model. The number of QTN is 100.
Figure 5
Manhattan plot of SNP effect estimates. (A) High-density chip; dashes represent causal loci positions. (B) Sequence results; black dots are the causal loci. Each chromosome is represented in a different shade of grey; the last three chromosomes do not contain any QTN. Effects are absolute values.
Figure 6
Distributions of SNP effect estimates. Distributions corresponding to intergenic SNPs (black), SNPs within genes (red) and QTN (green) obtained from sequence data analysis.
Figure 7
Accuracy with several error models. All data refer to a 100 QTN model and are the average of 100 replicates; chip refers to the high-density array. CHIP001: 10⁻³ genotyping error; CHIP0001: 10⁻⁴ genotyping error; SEQ05: sequence error λ = 0.05, imputation error γ = 0.001, minimum K number for an allele to be considered K = 1 (all SNPs are considered); SEQ05_K3: as previous model with K = 3; SEQ05_I20: as previous model with γ = 0.20. Table 1 describes the error models. Values represented are relative to accuracy obtained with full sequence without errors.
Figure 8
Accuracy across strategies with a near infinitesimal model. The results correspond to a 1000 QTN model with additive equal effects and no dominance.
