BMC Genet. 2015 Oct 14;16:120.
doi: 10.1186/s12863-015-0278-9.

Improving accuracy of genomic prediction by genetic architecture based priors in a Bayesian model

Ning Gao et al. BMC Genet. 2015.

Abstract

Background: In recent years, with the development of high-throughput sequencing technology and the commercial availability of genotyping bead chips, more attention has been directed towards using abundant genetic markers in animal and plant breeding programs, human disease risk prediction and personalized medicine. Several useful approaches to genomic prediction have been developed and are widely used, but their accuracy still has room for improvement. In this study, we propose an improved Bayesian approach, termed BayesBπ, which differs from the original BayesB in how priors are assigned. We give an effective method for calculating the locus-specific π by converting p-values from tests of association between SNPs and trait phenotypes, and validate it systematically using a German Holstein dairy cattle population. Furthermore, the new method is applied to a loblolly pine (Pinus taeda) dataset.

Results: Compared with the original BayesB, BayesBπ improves the accuracy of genomic prediction by up to 7.62% for milk fat percentage, a trait affected by large-effect quantitative trait loci (QTL). For milk yield, which is controlled by genes of small to moderate effect, accuracy improves by up to 4.94%. For somatic cell score, for which no large-effect QTL has been reported, GBLUP performs better than the Bayesian methods. BayesBπ outperforms BayesCπ in 10 out of 12 scenarios in the dairy cattle population, especially at small to moderate population sizes, where the accuracy of BayesCπ is dramatically low. In the loblolly pine dataset, BayesBπ outperforms BayesB in 14 out of 17 traits and BayesCπ in 8 out of 17 traits.

Conclusions: For traits controlled by large-effect genes, BayesBπ can improve on the accuracy and unbiasedness of genomic prediction by BayesB in moderate-size populations. Knowledge of a trait's genetic architecture can be integrated into genomic prediction practice by assigning locus-specific priors to markers, which helps Bayesian approaches perform better in variable selection and shrinkage of marker effects.

Figures

Fig. 1
Distribution of p-values and locus-specific π for three traits across the genome in the dairy cattle population. The rows correspond to milk fat percentage (FP), milk yield (MY), and somatic cell score (SCS), respectively. The four columns show the distribution of ω, the density of ω, the distribution of locus-specific π, and the density of locus-specific π, respectively, where ω = −log10(p-value). The p-values are derived from a single-marker ANOVA for every marker. They are log-transformed for convenient visualization and later use. The locus-specific π is derived from the ANOVA p-values via formula (4). Since π is the proportion of non-effective markers, 1 − π is taken as the probability that a marker is effective. For milk yield and milk fat percentage, the cluster on chromosome 14 marks the genomic segment containing the DGAT1 gene. For somatic cell score, no cluster is observed owing to the lack of major genes. The distributions of the locus-specific π are consistent with our prior knowledge of the genetic architectures of these traits. The plots were drawn in R (http://www.r-project.org/).
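The log transformation of the p-values, and the reading of 1 − π as a per-marker probability of being effective, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the p-values and π values are made-up placeholders, the function names are hypothetical, and the conversion from p-values to π (formula (4) in the paper) is not reproduced here.

```python
import math
import random


def neg_log10(p_values):
    """Return omega = -log10(p) for each p-value, requiring 0 < p <= 1."""
    omegas = []
    for p in p_values:
        if not 0.0 < p <= 1.0:
            raise ValueError(f"p-value out of range: {p}")
        omegas.append(-math.log10(p))
    return omegas


def draw_inclusion(pi_values, rng):
    """Draw one 0/1 indicator per marker.

    Marker j is flagged as effective (1) with probability 1 - pi_j,
    where pi_j is its locus-specific probability of having no effect.
    """
    return [1 if rng.random() < 1.0 - pi else 0 for pi in pi_values]


# Placeholder inputs, not values from the study.
p_values = [0.5, 0.01, 1e-8]
pi_values = [1.0, 0.9, 0.0]

omegas = neg_log10(p_values)
indicators = draw_inclusion(pi_values, random.Random(0))
```

A marker with π = 1 is never flagged as effective and a marker with π = 0 always is; intermediate values give a random draw, which is why a seeded generator is passed in explicitly.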
Fig. 2
Impact of population size on genomic prediction accuracy. For each method and subpopulation, genomic prediction accuracies are averaged over the three traits to assess the method's overall performance at that population size. For example, the accuracies of GBLUP in the subpopulation of 200 are averaged over the three traits to give its overall performance at this population size.
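The averaging described in this caption amounts to grouping accuracies by method and population size and taking the mean over traits. The sketch below assumes a simple record layout of (method, population size, trait, accuracy); the layout, the function name, and the accuracy values are hypothetical placeholders, not results from the study.

```python
from collections import defaultdict
from statistics import mean


def average_over_traits(records):
    """Average accuracy over traits for each (method, population size) pair.

    records: iterable of (method, population_size, trait, accuracy) tuples.
    Returns a dict mapping (method, population_size) to the mean accuracy.
    """
    grouped = defaultdict(list)
    for method, size, _trait, accuracy in records:
        grouped[(method, size)].append(accuracy)
    return {key: mean(values) for key, values in grouped.items()}


# Placeholder accuracies, chosen only to exercise the function.
records = [
    ("GBLUP", 200, "FP", 0.30),
    ("GBLUP", 200, "MY", 0.40),
    ("GBLUP", 200, "SCS", 0.50),
    ("BayesB", 200, "FP", 0.35),
    ("BayesB", 200, "MY", 0.45),
    ("BayesB", 200, "SCS", 0.40),
]

overall = average_over_traits(records)
```

Each key in `overall` then corresponds to one point in a plot of accuracy against population size, one curve per method.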
