BMC Bioinformatics. 2010 Jan 18;11 Suppl 1(Suppl 1):S45.

doi: 10.1186/1471-2105-11-S1-S45.

A novel approach for haplotype-based association analysis using family data

Yixuan Chen¹, Xin Li, Jing Li

PMID: 20122219
PMCID: PMC3009518
DOI: 10.1186/1471-2105-11-S1-S45

Yixuan Chen et al. BMC Bioinformatics. 2010.

Affiliation

¹ Electrical Engineering and Computer Science Department, Case Western Reserve University, Cleveland, OH 44106, USA. yixuan.chen@case.edu

Abstract

Background: Haplotype-based approaches have been extensively studied for case-control association mapping in recent years. Haplotype methods have been shown to provide more consistent results than single-locus approaches, especially when causal variants are not typed. Clustering similar or rare haplotypes into groups reduces the degrees of freedom of association tests and has been observed to improve power. For family-based association studies, one commonly used strategy is the transmission disequilibrium test (TDT), which examines the imbalanced transmission of alleles or haplotypes to affected and normal children. Many extensions have been developed to handle general pedigrees and continuous traits.
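For orientation, the classical single-marker TDT for a biallelic marker reduces to a McNemar-type chi-square statistic on transmissions from heterozygous parents to affected children. The sketch below is a generic illustration of that textbook statistic, not code from this paper; the function name `tdt_statistic` and the example counts are hypothetical.

```python
import math


def tdt_statistic(b: int, c: int) -> tuple[float, float]:
    """Classical TDT for one biallelic marker (illustrative sketch).

    b: heterozygous parents who transmitted allele A to an affected child
    c: heterozygous parents who transmitted allele a to an affected child
    Returns (chi-square statistic with 1 df, p-value).
    """
    if b + c == 0:
        raise ValueError("no informative (heterozygous) transmissions")
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts, for illustration only.
chi2, p = tdt_statistic(b=60, c=40)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

Only heterozygous parents contribute to the statistic, which is one reason the test can lose information in small samples.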

Results: In this paper, we propose a new haplotype-based association method for family data that departs from the TDT framework. Our approach, termed F_HapMiner, builds on our previous work on haplotype inference from pedigree data and on haplotype-based association mapping. It first infers the diplotype (haplotype pair) of each individual in each pedigree, assuming no recombination within a family. A phenotype score is then defined for each founder haplotype. Finally, F_HapMiner applies a clustering algorithm to the founder haplotypes based on their similarities and identifies haplotype clusters that show significant association with diseases or traits. We performed extensive simulations under realistic assumptions to evaluate the approach, varying allele frequency, linkage disequilibrium (LD) structure, disease model and sample size. In comparisons with single-locus and haplotype-based TDT methods, our approach consistently outperforms the TDT-based approaches regardless of disease model, local LD structure or allele/haplotype frequency.
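The abstract does not state how the phenotype score is defined. As a minimal sketch of the idea, assume that under the no-recombination assumption every haplotype copy in a family can be labeled with the founder haplotype it descends from, and score each founder haplotype by the balance of its copies in affected versus unaffected members. The scoring rule, the function name `founder_scores`, and the toy family are all assumptions made for illustration, not the definition used by F_HapMiner.

```python
from collections import defaultdict


def founder_scores(members):
    """Score founder haplotypes within one family (ASSUMED scoring rule).

    members: list of (is_affected, (hap_id_1, hap_id_2)), where each hap id
    names the founder haplotype that the copy descends from.
    Returns {hap_id: score}, with score = (a - u) / (a + u) in [-1, 1], where
    a and u count copies carried by affected and unaffected members.
    """
    affected = defaultdict(int)
    unaffected = defaultdict(int)
    for is_affected, diplotype in members:
        for hap in diplotype:
            if is_affected:
                affected[hap] += 1
            else:
                unaffected[hap] += 1
    scores = {}
    for hap in set(affected) | set(unaffected):
        a, u = affected[hap], unaffected[hap]
        scores[hap] = (a - u) / (a + u)
    return scores


# Toy nuclear family: founders carry F1|F2 (father) and M1|M2 (mother).
family = [
    (False, ("F1", "F2")),  # father, unaffected
    (True, ("M1", "M2")),   # mother, affected
    (True, ("F1", "M1")),   # child, affected
    (True, ("F2", "M1")),   # child, affected
    (False, ("F2", "M2")),  # child, unaffected
]
scores = founder_scores(family)
for hap in sorted(scores):
    print(hap, round(scores[hap], 3))
```

In this toy family the haplotype `M1` appears only in affected members and receives the highest score, which is the kind of signal a downstream clustering and association test would aggregate across families.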

Conclusion: We present a novel haplotype-based association approach using family data. Experimental results demonstrate that it achieves significantly higher power than TDT-based approaches.

Figures

**Figure 1**
**The computational framework of the proposed approach F_HapMiner**. (1) Infer haplotypes on all families based on the DSS algorithm. (2) Calculate the phenotype score for each founder haplotype from each family based on its occurrences in affected and normal members. (3) Cluster all the founder haplotypes using a position weighted haplotype similarity measure. (4) Evaluate the correlation between clusters and the trait using a statistical test.
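Step (3) of the caption mentions a position-weighted haplotype similarity measure without giving its form. One plausible shape for such a measure, shown here purely as an assumption, weights allele matches by their distance from a focal marker so that agreement near the candidate locus counts more. The geometric `decay` weighting, the `focal` parameter, and the name `weighted_similarity` are hypothetical.

```python
def weighted_similarity(h1: str, h2: str, focal: int, decay: float = 0.5) -> float:
    """Position-weighted similarity of two haplotypes (illustrative sketch).

    h1, h2: equal-length allele strings, e.g. "01101"
    focal:  index of the marker of interest; matches there get weight 1
    decay:  ASSUMED geometric down-weighting per marker away from `focal`
    Returns the weighted fraction of matching positions, in [0, 1].
    """
    if len(h1) != len(h2):
        raise ValueError("haplotypes must span the same markers")
    weights = [decay ** abs(i - focal) for i in range(len(h1))]
    matched = sum(w for w, a, b in zip(weights, h1, h2) if a == b)
    return matched / sum(weights)


# A mismatch at the focal marker costs more than one at the window edge.
edge_mismatch = weighted_similarity("01101", "01100", focal=2)
focal_mismatch = weighted_similarity("01101", "01001", focal=2)
print(edge_mismatch, focal_mismatch)
```

Any standard clustering routine could then operate on one minus this similarity as a distance; which clustering algorithm the authors use is not stated in this record.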

**Figure 2**
**Illustration of the DSS algorithm**. a) The input pedigree with 8 members and their genotype data. b) Haplotypes are partially determined based on Mendelian law and denoted as (paternal | maternal). In addition, one heterozygous SNP is fixed in each founder (an individual without parents in the pedigree). Five SNPs are left free. c) Locus graphs for the two loci. Each graph has the same set of nodes as the original pedigree, where shaded (predetermined) nodes represent the SNPs fixed in b). Each child is linked to its heterozygous parents, with edges labeled using h-variables. Node 5 is duplicated for ease of processing. d) Left: constraints on h-variables collected from the two locus graphs using disjoint-set structures. Constraints are collected for each pair of linked predetermined nodes (or duplicated nodes), and only non-redundant ones are kept. Right: solutions of the h-variables can be obtained directly from the disjoint-set structures. They are represented by free variables α. In this example, there is only one degree of freedom. e) Solutions of the p-variables derived from the h-variables in d).
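To make the role of the disjoint-set structure concrete, the sketch below assumes that each collected constraint can be written as a parity relation between two binary variables, `x XOR y = c`. A disjoint-set (union-find) that stores each variable's parity relative to its root can then classify every incoming constraint as new, redundant, or conflicting, and each resulting set leaves one free variable. This is a generic parity union-find, not the DSS algorithm itself; the class name `ParityDSU` and the variable names are hypothetical.

```python
class ParityDSU:
    """Union-find over binary variables with XOR constraints (sketch)."""

    def __init__(self):
        self.parent = {}
        self.parity = {}  # parity[x] = value(x) XOR value(parent[x])

    def find(self, x):
        """Return (root, value(x) XOR value(root)), compressing the path."""
        if x not in self.parent:
            self.parent[x] = x
            self.parity[x] = 0
        if self.parent[x] == x:
            return x, 0
        root, p = self.find(self.parent[x])
        self.parity[x] ^= p
        self.parent[x] = root
        return root, self.parity[x]

    def add_constraint(self, x, y, c):
        """Record value(x) XOR value(y) == c.

        Returns "new", "redundant" or "conflict".
        """
        rx, px = self.find(x)
        ry, py = self.find(y)
        if rx == ry:
            return "redundant" if (px ^ py) == c else "conflict"
        self.parent[rx] = ry
        self.parity[rx] = px ^ py ^ c
        return "new"

    def free_variables(self):
        """Number of disjoint sets; each set has one free binary choice."""
        return len({self.find(v)[0] for v in list(self.parent)})


dsu = ParityDSU()
results = [
    dsu.add_constraint("h1", "h2", 1),
    dsu.add_constraint("h2", "h3", 0),
    dsu.add_constraint("h1", "h3", 1),  # implied by the first two
    dsu.add_constraint("h1", "h3", 0),  # contradicts them
]
print(results, dsu.free_variables())
```

Keeping only the constraints reported as "new" mirrors the caption's remark that redundant constraints are discarded, and the count of disjoint sets plays the role of the degrees of freedom.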

**Figure 3**
**The effect of sliding window sizes**. The power of F_HapMiner on the single-locus model using different sliding window sizes, grouped according to MAF (Table 2). Power may be adversely affected by large window sizes when haplotype blocks are short (as for SNPs in the high-MAF group). Results are based on 50 pedigrees on the CF dataset using penetrance set A.

**Figure 4**
**Power comparison on the single-locus model**. Power of F_HapMiner and the single-locus TDT on the CF dataset using three different penetrance models. Results are based on 50 pedigrees and grouped according to MAF. The letter in parentheses indicates the penetrance set.

**Figure 5**
**Mapping precision on the single-locus model**. Average distances in centimorgans from the predicted SNP to the true risk SNP for F_HapMiner and the single-locus TDT using the CF data. Results are based on 50 pedigrees using penetrance set A and grouped according to MAF.

**Figure 6**
**Power comparison on the rare haplotype model**. Power of F_HapMiner and the single-locus TDT on the CF dataset using three penetrance sets, grouped by the number of risk haplotypes. The letter in parentheses indicates the penetrance set. Results are based on 100 pedigrees per replicate.

**Figure 7**
**Power comparison with the haplotype-based TDT**. Power of F_HapMiner, the single-locus TDT and the haplotype-based TDT on the CF dataset using the rare haplotype model with the penetrance set A for different sample sizes.

Cited by

The follicular outcome after standard gonadotropin stimulation is associated with ERα and ERβ genotypes.
Lazaros L, Pamporaki C, Vlahos N, Takenaka A, Kitsou C, Kosmas I, Sofikitis N, Stefos T, Zikopoulos K, Hatzi E, Georgiou I. Endocrine. 2014 Dec;47(3):930-5. doi: 10.1007/s12020-014-0249-3. Epub 2014 Apr 5. PMID: 24705910
Single Marker and Haplotype-Based Association Analysis of Semolina and Pasta Colour in Elite Durum Wheat Breeding Lines Using a High-Density Consensus Map.
N'Diaye A, Haile JK, Cory AT, Clarke FR, Clarke JM, Knox RE, Pozniak CJ. PLoS One. 2017 Jan 30;12(1):e0170941. doi: 10.1371/journal.pone.0170941. eCollection 2017. PMID: 28135299 Free PMC article.
Using haplotypes for the prediction of allelic identity to fine-map QTL: characterization and properties.
Jacquin L, Elsen JM, Gilbert H. Genet Sel Evol. 2014 Jul 14;46(1):45. doi: 10.1186/1297-9686-46-45. PMID: 25022866 Free PMC article.
Use of diplotypes - matched haplotype pairs from homologous chromosomes - in gene-disease association studies.
Zuo L, Wang K, Luo X. Shanghai Arch Psychiatry. 2014 Jun;26(3):165-70. doi: 10.3969/j.issn.1002-0829.2014.03.009. PMID: 25114493 Free PMC article.
Combining an evolution-guided clustering algorithm and haplotype-based LRT in family association studies.
Lee MH, Tzeng JY, Huang SY, Hsiao CK. BMC Genet. 2011 May 19;12:48. doi: 10.1186/1471-2156-12-48. PMID: 21592403 Free PMC article.

Grants and funding

LM008991/LM/NLM NIH HHS/United States