Genome Res. 2014 Sep;24(9):1550-7.
doi: 10.1101/gr.169375.113. Epub 2014 Jun 24.

MultiBLUP: improved SNP-based prediction for complex traits

Doug Speed et al. Genome Res. 2014 Sep.

Abstract

BLUP (best linear unbiased prediction) is widely used to predict complex traits in plant and animal breeding, and increasingly in human genetics. The BLUP mathematical model, which consists of a single random effect term, was adequate when kinships were measured from pedigrees. However, when genome-wide SNPs are used to measure kinships, the BLUP model implicitly assumes that all SNPs have the same effect-size distribution, which is a severe and unnecessary limitation. We propose MultiBLUP, which extends the BLUP model to include multiple random effects, allowing greatly improved prediction when the random effects correspond to classes of SNPs with distinct effect-size variances. The SNP classes can be specified in advance, for example, based on SNP functional annotations, and we also provide an adaptive procedure for determining a suitable partition of SNPs. We apply MultiBLUP to genome-wide association data from the Wellcome Trust Case Control Consortium (seven diseases), and from much larger studies of celiac disease and inflammatory bowel disease, finding that it consistently provides better prediction than alternative methods. Moreover, MultiBLUP is computationally very efficient; for the largest data set, which includes 12,678 individuals and 1.5 M SNPs, the total analysis can be run on a single desktop PC in less than a day and can be parallelized to run even faster. Tools to perform MultiBLUP are freely available in our software LDAK.
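The extension from one random effect to several can be illustrated with a minimal sketch. It assumes a model of the form y = mu + g_1 + ... + g_K + e, where each g_k has covariance sigma_k^2 * K_k for a similarity matrix K_k built from one class of SNPs, and e is independent noise with variance sigma_e^2. The function names (`gsm`, `multi_kernel_blup`) and the simulated data are hypothetical, and the variance components are treated as known here; in practice they would have to be estimated (for example by REML). This is not LDAK's implementation, only an illustration of the prediction step under those assumptions.

```python
import numpy as np


def gsm(X):
    """Similarity matrix from an (individuals x SNPs) genotype block.

    Hypothetical helper: standardizes each polymorphic SNP and averages
    the cross-products. Monomorphic SNPs are dropped to avoid dividing by 0.
    """
    X = np.asarray(X, dtype=float)
    sd = X.std(axis=0)
    keep = sd > 0
    Z = (X[:, keep] - X[:, keep].mean(axis=0)) / sd[keep]
    return Z @ Z.T / keep.sum()


def multi_kernel_blup(kernels, variances, sigma2_e, y_train, train_idx, test_idx):
    """Predict test phenotypes from several similarity matrices.

    kernels   : list of (n x n) matrices, one per SNP class
    variances : assumed-known genetic variance for each class
    sigma2_e  : assumed-known residual variance
    """
    train_idx = np.asarray(train_idx)
    test_idx = np.asarray(test_idx)
    y_train = np.asarray(y_train, dtype=float)

    # Combined genetic covariance: sum over classes of sigma_k^2 * K_k.
    G = sum(s2 * np.asarray(K, dtype=float) for s2, K in zip(variances, kernels))

    # Phenotypic covariance among training individuals.
    V = G[np.ix_(train_idx, train_idx)] + sigma2_e * np.eye(len(train_idx))

    ones = np.ones(len(train_idx))
    Vinv_y = np.linalg.solve(V, y_train)
    Vinv_1 = np.linalg.solve(V, ones)

    # Generalized least squares estimate of the intercept.
    mu = (ones @ Vinv_y) / (ones @ Vinv_1)

    # V^{-1} (y - mu * 1), reusing the two solves above.
    alpha = Vinv_y - mu * Vinv_1

    # Prediction: intercept plus test-by-train genetic covariance times alpha.
    return mu + G[np.ix_(test_idx, train_idx)] @ alpha


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m = 60, 200
    X = rng.binomial(2, 0.3, size=(n, m))

    # Two made-up SNP classes: the first 20 SNPs and the remaining 180.
    K_a, K_b = gsm(X[:, :20]), gsm(X[:, 20:])
    y = rng.normal(size=n)  # placeholder phenotype, for shape only

    train, test = np.arange(50), np.arange(50, 60)
    pred = multi_kernel_blup([K_a, K_b], [0.4, 0.1], 0.5, y[train], train, test)
    print(pred.shape)
```

With a single matrix in `kernels` the function reduces to ordinary single-effect BLUP, so the only modelling difference in this sketch is that each SNP class gets its own variance rather than sharing one.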

Figures

Figure 1.
Prediction performance of BLUP and MultiBLUP on simulated quantitative traits. The two plots correspond to unrelated humans (left) and related mice (right). They show, across 50 repetitions, the correlation between predicted and observed phenotypes in the test set for BLUP (white boxes) and MultiBLUP (shaded boxes). The x-axis indexes the simulation scenarios, with increasing heterogeneity of effect sizes across the five regions. Here, MultiBLUP uses five GSMs (genomic similarity matrices), one for each region. Within each plot, the true (simulated) heritability is 0.5 (left half) or 0.8 (right half).
