Genetics. 2012 Apr;190(4):1447-60.
doi: 10.1534/genetics.111.137570. Epub 2012 Jan 31.

Inferring coancestry in population samples in the presence of linkage disequilibrium

M D Brown et al. Genetics. 2012 Apr.

Abstract

In both pedigree linkage studies and population-based association studies, there has been much interest in using modern dense genetic marker data to infer segments of gene identity by descent (ibd) among individuals not known to be related, in order to increase power and resolution in localizing genes affecting complex traits. In this article, we present a hidden Markov model (HMM) for ibd among a set of chromosomes and describe methods and software for inference of ibd among the four chromosomes of pairs of individuals, using either phased (haplotypic) or unphased (genotypic) data. The model allows for missing data and typing error, but does not model linkage disequilibrium (LD), because fitting an accurate LD model requires large samples from well-studied populations. However, LD remains a major confounding factor, since LD is itself a reflection of coancestry at the population level. To study the impact of LD, we have developed a novel simulation approach to generate realistic dense marker data for the same set of markers but at varying levels of LD. Using this approach, we present results of a study of the impact of LD on the sensitivity and specificity of our HMM in estimating segments of ibd among sets of four chromosomes and between genotype pairs. We show that, despite not incorporating LD, our model has been quite successful in detecting segments as small as 10^6 bp (1 Mbp); we also present comparisons with fastIBD, which uses an LD model in estimating ibd.
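To make the idea of an ibd HMM with typing error and missing data concrete, the following is a minimal sketch only, and not the model described in the abstract. It assumes a deliberately simplified setting: two haplotypes rather than four chromosomes, two hidden states (ibd and no-ibd) rather than the full set of ibd states, biallelic markers coded 0/1, a constant per-marker transition probability, and no LD. The function names `ibd_posterior` and `call_states` and the parameters `err`, `p_enter`, and `p_leave` are hypothetical and chosen for illustration. The 0.9 calling threshold and the no-call outcome follow the figure captions.

```python
def ibd_posterior(hap_a, hap_b, freqs, err=0.01, p_enter=0.001, p_leave=0.01):
    """Posterior probability of ibd at each marker for two haplotypes.

    Toy two-state HMM (state 0 = no ibd, state 1 = ibd); an illustrative
    assumption, not the multi-state model of the paper.
    hap_a, hap_b: lists of alleles coded 0/1, with None for missing data.
    freqs: frequency of allele 1 at each marker.
    """
    n = len(hap_a)
    # trans[s][t] = P(state t at next marker | state s at this marker)
    trans = [[1.0 - p_enter, p_enter], [p_leave, 1.0 - p_leave]]
    # Start from the stationary distribution of the two-state chain.
    pi_ibd = p_enter / (p_enter + p_leave)
    init = [1.0 - pi_ibd, pi_ibd]

    def emission(i):
        a, b, p = hap_a[i], hap_b[i], freqs[i]
        if a is None or b is None:
            return [1.0, 1.0]  # missing data carries no information
        match_no_ibd = p * p + (1.0 - p) * (1.0 - p)
        if a == b:
            return [match_no_ibd, 1.0 - err]
        # Under ibd, a mismatch is attributed to typing error only.
        return [1.0 - match_no_ibd, err]

    emis = [emission(i) for i in range(n)]

    # Forward pass, rescaled at every marker to avoid underflow.
    fwd, prev = [], init
    for i in range(n):
        if i == 0:
            pred = init
        else:
            pred = [sum(prev[s] * trans[s][t] for s in (0, 1)) for t in (0, 1)]
        cur = [pred[t] * emis[i][t] for t in (0, 1)]
        z = cur[0] + cur[1]
        prev = [c / z for c in cur]
        fwd.append(prev)

    # Backward pass, also rescaled.
    bwd = [[1.0, 1.0] for _ in range(n)]
    for i in range(n - 2, -1, -1):
        nxt = [sum(trans[s][t] * emis[i + 1][t] * bwd[i + 1][t] for t in (0, 1))
               for s in (0, 1)]
        z = nxt[0] + nxt[1]
        bwd[i] = [x / z for x in nxt]

    post = []
    for i in range(n):
        w = [fwd[i][s] * bwd[i][s] for s in (0, 1)]
        post.append(w[1] / (w[0] + w[1]))
    return post


def call_states(post, threshold=0.9):
    """Call a state only where its posterior reaches the threshold."""
    calls = []
    for p in post:
        if p >= threshold:
            calls.append("ibd")
        elif 1.0 - p >= threshold:
            calls.append("no-ibd")
        else:
            calls.append("no-call")
    return calls
```

Calling `call_states(ibd_posterior(hap_a, hap_b, freqs))` yields one label per marker. Because this sketch ignores LD, runs of matching alleles caused by population-level haplotype sharing would be read as evidence for ibd, which is the confounding effect the study examines.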

Figures

Figure 1
Histogram of the true lengths of the 10,603 segments in any state of ibd (excluding the no-ibd state) among the 500 pairs of individuals.
Figure 2
Linkage disequilibrium in a 5.16-Mbp segment of the chromosome, in the original set of 1917 chromosomes, and in sets of 1917 chromosomes generated by beaglesim, at attenuation levels γ = 0, 0.05, and 0.1.
Figure 3
Curves of −log10(r^2) by distance between markers, fitted for each marker with each of the 50 markers to each side of it, for the original set of 1917 chromosomes, and in sets of 1917 chromosomes generated by beaglesim, at attenuation levels γ = 0, 0.05, 0.1, and in the absence of LD (γ = 1).
Figure 4
Example of the true ibd and the calls across the 140-Mbp chromosome, based on ibd_haplo output using haplotypic and genotypic data, for 1 of the 500 pairs of individuals. Middle: the true ibd state, where dark shading shows any state of ibd and white shows the no-ibd state. Top: the inferred state using haplotypic data. Bottom: the inferred state for the same data analyzed as a pair of genotypes. In the inferred results, the lighter shading represents a no-call.
Figure 5
Among the 10,603 segments in any state of ibd (see Figure 1), the proportion of markers that provided a call of the correct ibd state at a calling threshold of 0.9, by length of the segment. The four subfigures are for the values of γ shown; γ = 0 (high LD), γ = 0.05, γ = 0.1, and γ = 1.0 (no LD). The points and the solid fitted lines are for the genotypic data, while the dashed lines show the improvement obtainable using phased haplotypic data.
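The LD measure plotted in Figure 3 can be illustrated with the standard r^2 statistic for a pair of biallelic markers. This is a minimal sketch under stated assumptions: alleles are coded 0/1, each list holds one marker's alleles across the same set of phased chromosomes, and there is no missing data. The function name `r_squared` is hypothetical; it is not part of beaglesim or ibd_haplo.

```python
def r_squared(marker_x, marker_y):
    """Squared allelic correlation between two biallelic markers.

    marker_x, marker_y: alleles (0/1) at two markers, listed in the same
    chromosome order. Returns NaN if either marker is monomorphic.
    """
    n = len(marker_x)
    p_x = sum(marker_x) / n   # frequency of allele 1 at marker x
    p_y = sum(marker_y) / n   # frequency of allele 1 at marker y
    # Frequency of the haplotype carrying allele 1 at both markers.
    p_xy = sum(1 for a, b in zip(marker_x, marker_y) if a == 1 and b == 1) / n
    d = p_xy - p_x * p_y      # LD coefficient D
    denom = p_x * (1 - p_x) * p_y * (1 - p_y)
    if denom == 0:
        return float("nan")
    return d * d / denom
```

Two markers in complete LD give r^2 = 1 and two independent markers give values near 0, so −log10(r^2) grows as LD decays with distance.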

References

    1. Albrechtsen A., Korneliussen T. S., Moltke I., van Overseem Hansen T., Nielsen F. C., et al., 2009. Relatedness mapping and tracts of relatedness for genome-wide data in the presence of linkage disequilibrium. Genet. Epidemiol. 33: 266–274
    2. Balding D. J., Nichols R. A., 1994. DNA profile match probability calculations: how to allow for population stratification, relatedness, database selection, and single bands. Forensic Sci. Int. 64: 125–140
    3. Baum L. E., Petrie T., Soules G., Weiss N., 1970. A maximization technique occurring in the statistical analysis of probabilistic functions on Markov chains. Ann. Math. Stat. 41: 164–171
    4. Browning B. L., Browning S. R., 2011. A fast powerful method for detecting identity by descent. Am. J. Hum. Genet. 88: 173–182
    5. Browning S. R., 2008. Estimation of pairwise identity by descent from dense genetic marker data in a population sample of haplotypes. Genetics 178: 2123–2132
