BMC Bioinformatics. 2010 Jan 18;11 Suppl 1(Suppl 1):S45.
doi: 10.1186/1471-2105-11-S1-S45.

A novel approach for haplotype-based association analysis using family data

Yixuan Chen et al. BMC Bioinformatics. 2010.

Abstract

Background: Haplotype-based approaches have been studied extensively for case-control association mapping in recent years. Haplotype methods have been shown to give more consistent results than single-locus approaches, especially when causal variants are not typed. Clustering similar or rare haplotypes into groups reduces the degrees of freedom of association tests and has been observed to improve power. For family-based association studies, one commonly used strategy is the Transmission Disequilibrium Test (TDT), which examines the imbalanced transmission of alleles/haplotypes to affected and normal children. Many extensions have been developed to deal with general pedigrees and continuous traits.

Results: In this paper, we propose a new haplotype-based association method for family data that differs from the TDT framework. Our approach (termed F_HapMiner) builds on our earlier work on haplotype inference from pedigree data and on haplotype-based association mapping. It first infers the diplotype of each individual in each pedigree, assuming no recombination within a family. A phenotype score is then defined for each founder haplotype. Finally, F_HapMiner applies a clustering algorithm to those founder haplotypes based on their similarities and identifies haplotype clusters that show significant associations with diseases/traits. We performed extensive simulations under realistic assumptions to evaluate the effectiveness of the proposed approach, considering factors such as allele frequency, linkage disequilibrium (LD) structure, disease model and sample size. Comparisons with single-locus and haplotype-based TDT methods demonstrate that our approach consistently outperforms the TDT-based approaches regardless of disease model, local LD structure or allele/haplotype frequency.

Conclusion: We present a novel haplotype-based association approach using family data. Experimental results demonstrate that it achieves significantly higher power than TDT-based approaches.
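For readers unfamiliar with the TDT baseline named in the Background, the sketch below shows the classic single-locus form of the test for a biallelic marker: a McNemar-type chi-square with one degree of freedom, computed from transmissions by heterozygous parents to affected children. The function names and the example counts are illustrative assumptions and are not taken from the paper, which compares against TDT implementations it does not detail here.

```python
import math


def tdt_statistic(transmitted: int, untransmitted: int) -> float:
    """Classic single-locus TDT chi-square (1 degree of freedom).

    transmitted   -- heterozygous parents who passed the allele of interest
                     to an affected child
    untransmitted -- heterozygous parents who passed the other allele
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) parents")
    return (transmitted - untransmitted) ** 2 / total


def chi2_1df_pvalue(statistic: float) -> float:
    """Upper-tail p-value of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    stat = tdt_statistic(transmitted=60, untransmitted=40)
    print(stat, chi2_1df_pvalue(stat))
```

Under the null hypothesis of no linkage or association, each heterozygous parent transmits either allele with probability 1/2, so a large imbalance between the two counts is evidence against the null.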


Figures

Figure 1
The computational framework of the proposed approach F_HapMiner. (1) Infer haplotypes on all families based on the DSS algorithm. (2) Calculate the phenotype score for each founder haplotype from each family based on its occurrences in affected and normal members. (3) Cluster all the founder haplotypes using a position weighted haplotype similarity measure. (4) Evaluate the correlation between clusters and the trait using a statistical test.
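Steps (2) and (3) of the framework can be pictured with a small sketch. The caption does not give the actual formulas, so both functions below are assumptions made purely for illustration: the score is taken to be the share of affected members among carriers of a founder haplotype, and the similarity weights each matching SNP by a geometric decay in its distance from a focal position. Neither should be read as F_HapMiner's own definition.

```python
from typing import Sequence


def phenotype_score(carrier_affected: Sequence[bool]) -> float:
    """Hypothetical score: fraction of affected members among the family
    members who carry a given founder haplotype."""
    if not carrier_affected:
        return 0.0
    return sum(carrier_affected) / len(carrier_affected)


def weighted_similarity(h1: str, h2: str, focal: int, decay: float = 0.5) -> float:
    """Hypothetical position-weighted similarity in [0, 1].

    Each SNP position i gets weight decay ** |i - focal|, so agreement close
    to the focal SNP counts more than agreement far from it.
    """
    if len(h1) != len(h2):
        raise ValueError("haplotypes must have the same length")
    weights = [decay ** abs(i - focal) for i in range(len(h1))]
    matched = sum(w for w, a, b in zip(weights, h1, h2) if a == b)
    return matched / sum(weights)


if __name__ == "__main__":
    # Two of four carriers are affected.
    print(phenotype_score([True, True, False, False]))
    # A mismatch next to the focal SNP costs less than one at the focal SNP.
    print(weighted_similarity("0101", "1101", focal=1))
    print(weighted_similarity("0101", "0001", focal=1))
```

A clustering algorithm would then group founder haplotypes whose pairwise similarity is high, and a statistical test would compare the scores inside each cluster against the rest, as in steps (3) and (4).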
Figure 2
Illustration of the DSS algorithm. a) The input pedigree with 8 members and their genotype data. b) Haplotypes are partially determined based on Mendel's laws and denoted as (paternal | maternal). In addition, one heterozygous SNP is fixed in each founder (an individual without parents in the pedigree). Five SNPs are left free. c) Locus graphs for the two loci. Each graph has the same set of nodes as the original pedigree, where shaded (predetermined) nodes represent the fixed SNPs in b). Each child is linked to its heterozygous parents, with edges labeled using h-variables. Node 5 is duplicated for ease of processing. d) Left: constraints on h-variables collected from the two locus graphs using disjoint-set structures. Constraints are collected for each pair of linked predetermined nodes (or duplicated nodes), and only non-redundant ones are kept. Right: solutions of the h-variables can be obtained directly from the disjoint-set structures. They are represented by free variables α. In this example, there is only one degree of freedom. e) Solutions of the p-variables derived from the h-variables in d).
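The role of the disjoint-set structure in panel d) can be illustrated with a generic union-find that tracks parity. The sketch assumes that each constraint states that two binary variables are either equal or opposite; that reading, the class name and the example constraints are all assumptions for illustration and not the paper's DSS implementation. A constraint that joins two variables already in the same set is either redundant or contradictory, and each remaining set corresponds to one free binary variable.

```python
class ParityDisjointSet:
    """Union-find over binary variables with constraints x[a] XOR x[b] = diff."""

    def __init__(self, n: int) -> None:
        self.parent = list(range(n))
        self.parity = [0] * n  # parity of each node relative to its parent

    def find(self, x: int) -> tuple[int, int]:
        """Return (root, parity of x relative to root), compressing the path."""
        path = []
        while self.parent[x] != x:
            path.append(x)
            x = self.parent[x]
        root, acc = x, 0
        # Walk back from the node nearest the root, accumulating parity.
        for node in reversed(path):
            acc ^= self.parity[node]
            self.parent[node] = root
            self.parity[node] = acc
        return root, (self.parity[path[0]] if path else 0)

    def union(self, a: int, b: int, diff: int) -> bool:
        """Add the constraint x[a] XOR x[b] = diff.

        Returns False if it contradicts earlier constraints, True otherwise
        (including when it is merely redundant).
        """
        ra, pa = self.find(a)
        rb, pb = self.find(b)
        if ra == rb:
            return (pa ^ pb) == diff
        self.parent[ra] = rb
        self.parity[ra] = pa ^ pb ^ diff
        return True

    def components(self) -> int:
        """Number of disjoint sets, i.e. free binary variables remaining."""
        return sum(1 for i, p in enumerate(self.parent) if i == p)


if __name__ == "__main__":
    dsu = ParityDisjointSet(4)        # four hypothetical binary variables
    print(dsu.union(0, 1, 1))         # x0 and x1 differ
    print(dsu.union(1, 2, 0))         # x1 and x2 are equal
    print(dsu.union(0, 2, 1))         # redundant but consistent
    print(dsu.union(0, 2, 0))         # contradicts the constraints above
    print(dsu.components())           # sets {0, 1, 2} and {3}
```

If some variables have known values, as with the predetermined nodes in the figure, any set containing one of them is fully determined and no longer counts toward the degrees of freedom.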
Figure 3
The effect of sliding window sizes. The power of F_HapMiner on the single-locus model using different sliding window sizes, grouped according to MAF (Table 2). Large window sizes may reduce power when haplotype blocks are short (the case for SNPs in the high-MAF group). Results are based on 50 pedigrees from the CF dataset using penetrance set A.
Figure 4
Power comparison on the single-locus model. Power of F_HapMiner and the single-locus TDT on the CF dataset using three different penetrance models. Results are based on 50 pedigrees and are grouped according to MAF. The letter inside the parentheses indicates the penetrance set.
Figure 5
Mapping precision on the single-locus model. Average distances in centimorgans from the predicted SNP to the true risk SNP for F_HapMiner and the single-locus TDT using the CF data. Results are based on 50 pedigrees using penetrance set A and are grouped according to MAF.
Figure 6
Power comparison on the rare haplotype model. Power of F_HapMiner and the single-locus TDT on the CF dataset using three penetrance sets, grouped by the number of risk haplotypes. The letter inside the parentheses indicates the penetrance set. Results are based on 100 pedigrees per replicate.
Figure 7
Power comparison with the haplotype-based TDT. Power of F_HapMiner, the single-locus TDT and the haplotype-based TDT on the CF dataset using the rare haplotype model with penetrance set A for different sample sizes.
