Am J Hum Genet. 2005 Mar;76(3):449-62.
doi: 10.1086/428594. Epub 2005 Jan 31.

Accounting for decay of linkage disequilibrium in haplotype inference and missing-data imputation

Matthew Stephens et al. Am J Hum Genet. 2005 Mar.

Abstract

Although many algorithms exist for estimating haplotypes from genotype data, none of them take full account of both the decay of linkage disequilibrium (LD) with distance and the order and spacing of genotyped markers. Here, we describe an algorithm that does take these factors into account, using a flexible model for the decay of LD with distance that can handle both "blocklike" and "nonblocklike" patterns of LD. We compare the accuracy of this approach with a range of other available algorithms in three ways: for reconstruction of randomly paired, molecularly determined male X chromosome haplotypes; for reconstruction of haplotypes obtained from trios in an autosomal region; and for estimation of missing genotypes in 50 autosomal genes that have been completely resequenced in 24 African Americans and 23 individuals of European descent. For the autosomal data sets, our new approach clearly outperforms the best available methods, whereas its accuracy in inferring the X chromosome haplotypes is only slightly superior. For estimation of missing genotypes, our method performed slightly better when the two subsamples were combined than when they were analyzed separately, which illustrates its robustness to population stratification. Our method is implemented in the software package PHASE (v2.1.1), available from the Stephens Lab Web site.

Figures

Figure 1
Illustration of how π(h_{k+1} | h_1, …, h_k) builds h_{k+1} as an imperfect mosaic of h_1, …, h_k. The figure illustrates the case k = 3 and shows three possible values (h_{4A}, h_{4B}, and h_{4C}) for h_4, given h_1, h_2, and h_3. Each column of circles represents a SNP locus, with blackened and unblackened circles representing the two alleles. Each possible h_4 can be thought of as having been created by “copying” parts of h_1, h_2, and h_3. The shading in each case shows which haplotype was “copied” at each position along the chromosome and indicates whether the haplotype is most closely related to h_1, h_2, or h_3. Changes in the shading along a haplotype represent ancestral recombination events; these are more likely to occur between SNPs that are farther apart (e.g., SNPs 2 and 3 or 4 and 5) or that have a higher rate of recombination between them. The imperfect nature of the copying process is exemplified at the third and fourth loci, where h_{4B} has the blackened allele despite having “copied” h_2, which has the unblackened allele. The occurrence of several such “imperfections” on a single chunk indicates that the haplotype is relatively highly diverged from the one that it copied in that region (see text).
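
The copying process described in this caption can be sketched in a few lines of Python. This is only an illustrative simulation, not the PHASE implementation: the function name `sample_mosaic`, the parameters `rho` and `error_rate`, the switch probability 1 − exp(−rho·d/k), and the example haplotypes and positions are all assumptions introduced here. The sketch draws one new haplotype by copying alleles from the existing haplotypes, switching the copied haplotype more often across larger gaps between SNPs, and occasionally flipping an allele to mimic an imperfect copy.

```python
import math
import random


def sample_mosaic(haplotypes, positions, rho=0.001, error_rate=0.01, rng=None):
    """Sample one new haplotype as an imperfect mosaic of existing ones.

    haplotypes: list of k equal-length lists of 0/1 alleles
    positions:  SNP positions in increasing order
    rho:        hypothetical recombination parameter per unit distance
    error_rate: hypothetical per-SNP probability of an imperfect copy
    Returns (new_haplotype, copied_from); copied_from[j] is the index of
    the haplotype that was copied at SNP j.
    """
    rng = rng or random.Random()
    k = len(haplotypes)
    source = rng.randrange(k)  # haplotype copied at the first SNP
    new_hap, copied_from = [], []
    for j in range(len(positions)):
        if j > 0:
            # Assumed form: a switch is more likely across a larger gap.
            gap = positions[j] - positions[j - 1]
            p_switch = 1.0 - math.exp(-rho * gap / k)
            if rng.random() < p_switch:
                # Pick a source uniformly; it may be the same one again.
                source = rng.randrange(k)
        allele = haplotypes[source][j]
        if rng.random() < error_rate:
            allele = 1 - allele  # imperfect copy: flip the allele
        new_hap.append(allele)
        copied_from.append(source)
    return new_hap, copied_from


# Made-up example: three haplotypes at five SNPs, with wide gaps between
# SNPs 2 and 3 and between SNPs 4 and 5, as in the figure.
h1 = [0, 1, 0, 1, 1]
h2 = [1, 1, 0, 0, 1]
h3 = [0, 0, 1, 1, 0]
positions = [100, 200, 5000, 5100, 9000]

h4, copied = sample_mosaic([h1, h2, h3], positions, rng=random.Random(0))
print("h4:         ", h4)
print("copied from:", [i + 1 for i in copied])
```

Runs of identical values in `copied` correspond to the shaded chunks in the figure, and a position where `h4` differs from the haplotype it copied corresponds to an “imperfection”.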

References

Electronic-Database Information

    1. David Clayton Software page, http://www-gene.cimr.cam.ac.uk/clayton/software/
    2. HAP Webserver, http://www1.cs.columbia.edu/compbio/hap
    3. HUGO Gene Nomenclature Committee, http://www.gene.ucl.ac.uk/nomenclature/
    4. Stephens Lab Web site, http://www.stat.washington.edu/stephens/software.html (for PHASE)
    5. UW-FHCRC Variation Discovery Resource, http://pga.gs.washington.edu
