Heredity (Edinb). 2018 Apr;120(4):356-368.
doi: 10.1038/s41437-017-0023-4. Epub 2017 Dec 14.

A novel linkage-disequilibrium corrected genomic relationship matrix for SNP-heritability estimation and genomic prediction

Boby Mathew et al. Heredity (Edinb). 2018 Apr.

Abstract

Single nucleotide polymorphism (SNP)-heritability estimation is an important topic in several research fields, including animal, plant and human genetics, as well as ecology. Linear mixed model estimation of SNP-heritability uses the structure of genomic relationships between individuals, which is constructed from genome-wide sets of SNP markers that are generally weighted equally in their contributions. Proposed methods to handle dependence between SNPs include "thinning" the marker set by linkage disequilibrium (LD)-pruning, haplotype-tagging of SNPs, and LD-weighting of the SNP contributions. For improved estimation, we propose a new conceptual framework for the genomic relationship matrix, in which Mahalanobis distance-based LD-correction is used in linear mixed model estimation of SNP-heritability. The superiority of the presented method is illustrated by comparison with mixed-model analyses using a VanRaden genomic relationship matrix, the matrix used by GCTA, and a matrix employing LD-weighting (as implemented in the LDAK software), in simulated (using real human, rice and cattle genotypes) and real (maize, rice and mice) datasets. Despite the computational difficulties, our results suggest that the proposed method can improve the accuracy of SNP-heritability estimates in datasets with high LD.
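
To make the contrast in the abstract concrete, the following is a minimal sketch of an equally weighted VanRaden-style genomic relationship matrix next to a Mahalanobis-style LD-corrected one. The first function follows the widely used VanRaden (2008) construction. The second is only an illustration of the idea named in the abstract: the abstract does not give the authors' formula, so the form Z R⁻¹ Zᵀ / m, the shrinkage toward the identity, and the names `ld_corrected_grm` and `ridge` are assumptions made here, not the paper's implementation.

```python
import numpy as np

def vanraden_grm(genotypes):
    """Equally weighted GRM from an n x m matrix of 0/1/2 allele counts."""
    M = np.asarray(genotypes, dtype=float)
    p = M.mean(axis=0) / 2.0                 # allele frequency of each SNP
    Z = M - 2.0 * p                          # centre each SNP column
    scale = 2.0 * np.sum(p * (1.0 - p))      # expected total SNP variance
    return Z @ Z.T / scale

def ld_corrected_grm(genotypes, ridge=0.1):
    """Hypothetical Mahalanobis-style GRM: Z R^-1 Z^T / m.

    R is the SNP-by-SNP correlation (LD) matrix, shrunk toward the identity
    so that it is invertible when m > n. `ridge` is an assumed tuning knob.
    Assumes no monomorphic SNPs (every column has non-zero variance).
    """
    M = np.asarray(genotypes, dtype=float)
    Z = (M - M.mean(axis=0)) / M.std(axis=0)     # standardise each SNP
    m = Z.shape[1]
    R = np.corrcoef(Z, rowvar=False)             # m x m LD matrix
    R_shrunk = (1.0 - ridge) * R + ridge * np.eye(m)
    # Solve R_shrunk X = Z^T rather than forming the inverse explicitly.
    return Z @ np.linalg.solve(R_shrunk, Z.T) / m
```

With `ridge=1.0` the LD matrix is replaced by the identity and the second function reduces to an equally weighted matrix on standardised genotypes; smaller values down-weight groups of correlated SNPs, which is the intuition behind correcting for LD. Solving an m x m system is also one place where the computational cost mentioned in the abstract would show up.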

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Linkage disequilibrium estimates (R²) between pairs of marker loci in maize, plotted against genetic distance. The R² values were estimated from 128 polymorphic markers selected from the second chromosome
Fig. 2
Mean LD decay plot of 2165 SNP markers selected from the 12th chromosome of the rice dataset. The X-axis represents the distance in base pairs, and the Y-axis shows the mean of the pairwise linkage disequilibrium estimates (R²) within bins of length 100
Fig. 3
Low LD case: Box plots of the heritability estimation error for different approaches to calculating the genomic relationship matrix (GRM), using 100 simulation replicates with the human data. The Y-axis shows the difference between the true simulated heritability and the estimated heritability; the X-axis shows the different approaches to calculating the GRM
Fig. 4
High LD case: Box plots of the heritability estimation error for different approaches to calculating the genomic relationship matrix (GRM), using 100 simulation replicates with the rice genotypes. The Y-axis shows the difference between the true simulated heritability and the estimated heritability; the X-axis shows the different approaches to calculating the GRM
Fig. 5
Population structure plot for the maize, rice, mice and cattle datasets. Each scatter plot shows the first two principal components (PC1 and PC2), with each point representing a single individual. In the rice dataset, individuals that do not clearly belong to any of the three original populations (corners) are treated as an out group of admixed individuals, and the corresponding points are shown in red
Fig. 6
Histograms of the upper off-diagonal elements of the genomic relationship matrix calculated using the VanRaden (2008) approach in the maize, rice, mice, and cattle datasets
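
The quantities behind the LD plots in Figs. 1 and 2 can be sketched as follows: a pairwise R² between markers, then the mean R² within distance bins. This is an illustration under stated assumptions, not the authors' pipeline. R² is taken here as the squared Pearson correlation of allele counts (a common estimator when phase is unknown; the captions do not say which estimator was used), and the function names and the `bin_width` parameter are hypothetical.

```python
import numpy as np

def pairwise_r2(genotypes):
    """Squared genotype correlation between every pair of SNPs.

    `genotypes` is an n x m matrix of allele counts; returns an m x m matrix.
    """
    R = np.corrcoef(np.asarray(genotypes, dtype=float), rowvar=False)
    return R ** 2

def ld_decay(r2, positions, bin_width):
    """Mean R^2 per distance bin over all SNP pairs on one chromosome."""
    pos = np.asarray(positions, dtype=float)
    i, j = np.triu_indices(len(pos), k=1)        # each unordered pair once
    dist = np.abs(pos[i] - pos[j])
    bins = (dist // bin_width).astype(int)       # bin index of each pair
    sums = np.bincount(bins, weights=r2[i, j])   # total R^2 per bin
    counts = np.bincount(bins)                   # number of pairs per bin
    with np.errstate(invalid="ignore", divide="ignore"):
        means = sums / counts                    # NaN where a bin is empty
    bin_starts = np.arange(len(counts)) * bin_width
    return bin_starts, means
```

Plotting `means` against `bin_starts` gives a decay curve of the kind described for Fig. 2, while a scatter of the individual `r2[i, j]` values against `dist` corresponds to the kind of plot described for Fig. 1.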
