Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2002 May 28;99(11):7335-9.
doi: 10.1073/pnas.102186799.

A dynamic programming algorithm for haplotype block partitioning

Affiliations

A dynamic programming algorithm for haplotype block partitioning

Kui Zhang et al. Proc Natl Acad Sci U S A. .

Abstract

We develop a dynamic programming algorithm for haplotype block partitioning to minimize the number of representative single nucleotide polymorphisms (SNPs) required to account for most of the common haplotypes in each block. Any measure of haplotype quality can be used in the algorithm and of course the measure should depend on the specific application. The dynamic programming algorithm is applied to analyze the chromosome 21 haplotype data of Patil et al. [Patil, N., Berno, A. J., Hinds, D. A., Barrett, W. A., Doshi, J. M., Hacker, C. R., Kautzer, C. R., Lee, D. H., Marjoribanks, C., McDonough, D. P., et al. (2001) Science 294, 1719-1723], who searched for blocks of limited haplotype diversity. Using the same criteria as in Patil et al., we identify a total of 3,582 representative SNPs and 2,575 blocks that are 21.5% and 37.7% smaller, respectively, than those identified using a greedy algorithm of Patil et al. We also apply the dynamic programming algorithm to the same data set based on haplotype diversity. A total of 3,982 representative SNPs and 1,884 blocks are identified to account for 95% of the haplotype diversity in each block.

PubMed Disclaimer

Figures

Figure 1
Figure 1
The number of blocks versus the size of blocks, using 80% coverage. (a) The number of blocks as a function of the number of SNPs in a block using a window of two SNPs. The data points are drawn at the the center of the window, except that the last point is the number of blocks containing at least 25 SNPs. (b) The number of blocks as the function of the genomic length of the block, using a window of 2,000 kb. The data points are drawn at the center of the window except that the last point is the number of blocks spanning at least 25,000 kb along the chromosome.
Figure 2
Figure 2
The Q-Q plots for the uniform distribution and the positions of the block break points for four contigs, using 80% coverage for contigs (a) NT_002836, (b) NT_0 01035, (c) NT_00354.5, and (d) NT_002835.
Figure 3
Figure 3
The histograms for (a) the number of blocks, (b) the number of representative SNPs, (c) the size of the largest block, and (d) the number of blocks with more than ten SNPs for the 1,000 permuted samples. α = 80%.

References

    1. Kruglyak L, Nickerson D A. Nat Genet. 2001;27:234–236. - PubMed
    1. Abecasis G R, Noguchi E, Heinzmann A, Traherne J A, Bhattacharyya S, Leaves N I, Anderson G G, Zhang Y M, Lench N J, Carey A, et al. Am J Hum Genet. 2001;68:191–197. - PMC - PubMed
    1. Reich D E, Cargill M, Bolk S, Ireland J, Sabeti P C, Richter D J, Lavery T, Kouyoumjian R, Farhadian S F, Ward R, Lander E S. Nature (London) 2001;411:199–204. - PubMed
    1. Daly M J, Rioux J D, Schaffner S E, Hudson T J, Lander E S. Nat Genet. 2001;29:229–232. - PubMed
    1. Rioux J D, Daly M J, Silverberg M S, Lindblad K, Steinhart H, Cohen Z, Delmonte T, Kocher K, Miller K, Guschwan S, et al. Nat Genet. 2001;29:223–228. - PubMed

Publication types