Am J Hum Genet. 2003 Apr;72(4):850-68.

doi: 10.1086/373966. Epub 2003 Mar 19.

Genome association studies of complex diseases by case-control designs

Ruzong Fan¹, Michael Knapp

PMID: 12647259
PMCID: PMC1180349
DOI: 10.1086/373966

Ruzong Fan et al. Am J Hum Genet. 2003 Apr.

Affiliation

¹ Department of Statistics, Texas A&M University, College Station, TX 77843, USA. rfan@stat.tamu.edu

Abstract

One way to perform linkage-disequilibrium (LD) mapping of genetic traits is to use single markers. Since dense marker maps (such as single-nucleotide polymorphism and high-resolution microsatellite maps) are available, it is natural and practical to generalize single-marker LD mapping to high-resolution haplotype or multiple-marker LD mapping. This article investigates high-resolution LD-mapping methods, for complex diseases, based on haplotype maps or microsatellite marker maps. The objective is to explore test statistics that combine information from haplotype blocks or multiple markers. Based on two coding methods, genotype coding and haplotype coding, Hotelling's T² statistics T_G and T_H are proposed to test the association between a disease locus and two haplotype blocks or two markers. The validity of the two T² statistics is proved by theoretical calculations. A statistic T_C, an extension of the traditional χ² method of comparing haplotype frequencies, is introduced by simply adding the χ² test statistics of the two haplotype blocks together. The merit of the three methods is explored by calculation and comparison of power and of type I errors. In the presence of LD between the two blocks, the type I error of T_C is higher than that of T_H and T_G, since T_C ignores the correlation between the two blocks. For each of the three statistics, the power of using two haplotype blocks is higher than that of using only one haplotype block. By power comparison, we notice that T_C has higher power than T_H, and T_H has higher power than T_G. In the absence of LD between the two blocks, the power of T_C is similar to that of T_H and higher than that of T_G. Hence, we advocate use of T_H in the data analysis. In the presence of LD between the two blocks, T_H takes into account the correlation between the two haplotype blocks and has a lower type I error and higher power than T_G. In addition, the feasibility of the methods is shown by sample-size calculation.
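To make the contrast between the two kinds of statistic concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes each individual has already been coded as a numeric vector (for example, counts of a reference haplotype at each of the two blocks); that coding, the function names `hotelling_t2`, `chi2_two_by_k`, and `t_c`, and the toy data are all hypothetical. The exact genotype and haplotype codings behind T_G and T_H are defined in the full article and are not reproduced here. The first function is the standard two-sample Hotelling's T² with a pooled covariance matrix, which accounts for correlation between the coded variables; the last simply adds two per-block χ² statistics, as the abstract describes for T_C.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equal-length numeric rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def scatter_matrix(rows, mean):
    """Sum of outer products of deviations from the mean."""
    k = len(mean)
    s = [[0.0] * k for _ in range(k)]
    for row in rows:
        d = [x - m for x, m in zip(row, mean)]
        for i in range(k):
            for j in range(k):
                s[i][j] += d[i] * d[j]
    return s


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular pooled covariance matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def hotelling_t2(cases, controls):
    """Two-sample Hotelling's T^2 with pooled covariance (hypothetical coding)."""
    n, m = len(cases), len(controls)
    xbar, ybar = mean_vector(cases), mean_vector(controls)
    sx, sy = scatter_matrix(cases, xbar), scatter_matrix(controls, ybar)
    k = len(xbar)
    pooled = [[(sx[i][j] + sy[i][j]) / (n + m - 2) for j in range(k)]
              for i in range(k)]
    diff = [a - b for a, b in zip(xbar, ybar)]
    w = solve(pooled, diff)  # pooled^{-1} (xbar - ybar)
    return n * m / (n + m) * sum(d * wi for d, wi in zip(diff, w))


def chi2_two_by_k(case_counts, control_counts):
    """Pearson chi-square for a 2 x k table of haplotype counts.

    Assumes every haplotype has a nonzero total count.
    """
    n_case, n_ctrl = sum(case_counts), sum(control_counts)
    total = n_case + n_ctrl
    stat = 0.0
    for a, b in zip(case_counts, control_counts):
        col = a + b
        for obs, row in ((a, n_case), (b, n_ctrl)):
            expected = row * col / total
            stat += (obs - expected) ** 2 / expected
    return stat


def t_c(block1, block2):
    """Sum of the per-block chi-square statistics; ignores between-block LD."""
    return chi2_two_by_k(*block1) + chi2_two_by_k(*block2)


# Toy data (hypothetical): one row per individual, one column per block.
toy_cases = [[2, 1], [1, 1], [2, 0], [1, 2]]
toy_controls = [[0, 1], [1, 0], [0, 0], [1, 1]]
print(hotelling_t2(toy_cases, toy_controls))
print(t_c(([60, 40], [40, 60]), ([60, 40], [40, 60])))
```

Because `t_c` adds the two block statistics as if they were independent, it double-counts shared signal when the blocks are in LD, which is consistent with the inflated type I error the abstract reports for T_C.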

Figures

**Figure 1**
QQ plot at significance level α=0.01 using two haplotype blocks H₁, l=2, and H₂. In graphs I.1, I.2, and I.3, all parameters are the same as those of model I in table 2. In graphs II.1, II.2, and II.3, all parameters are the same as those of model II in table 2. In graphs III.1, III.2, and III.3, all parameters are the same as those of model III in table 2. In graphs IV.1, IV.2, and IV.3, all parameters are the same as those of model IV in table 2.

**Figure 2**
Power curves of T_C, T_H, T_G, T_C2, T_H2, and T_G2 at significance level α=0.01, using two haplotype blocks H₁, l=2, and H₂, r=2, when P(H₁₁)=P(H₁₂)=P(H₂₁)=P(H₂₂)=0.50, Δ_{H₁₁H₂₁}=P(H₁₁H₂₁)-P(H₁₁)P(H₂₁)=0.075, Δ_{H₁₁H₂₂}=P(H₁₁H₂₂)-P(H₁₁)P(H₂₂)=-0.075, P_D=0.30, N=M=500, T=50, for the four genetic models in table 4.

**Figure 3**
Power curves of T_C, T_H, T_G, T_C2, T_H2, and T_G2 at significance level α=0.01, using two haplotype blocks H₁, l=2, and H₂, r=3, when P(H₁₁)=P(H₁₂)=0.5, P(H₂₁)=0.4, P(H₂₂)=P(H₂₃)=0.30, Δ_{H₁₁H₂₁}=P(H₁₁H₂₁)-P(H₁₁)P(H₂₁)=0.075, Δ_{H₁₁H₂₂}=P(H₁₁H₂₂)-P(H₁₁)P(H₂₂)=-0.0375, Δ_{H₁₁H₂₃}=P(H₁₁H₂₃)-P(H₁₁)P(H₂₃)=-0.0375, P_D=0.30, N=M=500, T=50, for the four genetic models in table 4.

**Figure 4**
Power curves of T_C, T_H, T_G, T_C2, T_H2, and T_G2 at significance level α=0.01, using two haplotype blocks H₁, l=2, and H₂, r=4, when P(H₁₁)=P(H₁₂)=0.5, P(H₂₁)=P(H₂₂)=P(H₂₃)=P(H₂₄)=0.25, Δ_{H₁₁H₂₁}=P(H₁₁H₂₁)-P(H₁₁)P(H₂₁)=0.075, Δ_{H₁₁H₂₂}=P(H₁₁H₂₂)-P(H₁₁)P(H₂₂)=0.075, Δ_{H₁₁H₂₃}=P(H₁₁H₂₃)-P(H₁₁)P(H₂₃)=-0.075, Δ_{H₁₁H₂₄}=P(H₁₁H₂₄)-P(H₁₁)P(H₂₄)=-0.075, P_D=0.30, N=M=500, T=50, for the four genetic models in table 4.

**Figure 5**
Power curves of T_C, T_H, and T_G at significance level α=0.01, using two haplotype blocks H₁, l=2, and H₂, r=4, when P(H₁₁)=P(H₁₂)=0.5, P(H₂₁)=P(H₂₂)=P(H₂₃)=P(H₂₄)=0.25, Δ_{H₁₁H₂₁}=P(H₁₁H₂₁)-P(H₁₁)P(H₂₁)=0.0, Δ_{H₁₁H₂₂}=P(H₁₁H₂₂)-P(H₁₁)P(H₂₂)=0.0, Δ_{H₁₁H₂₃}=P(H₁₁H₂₃)-P(H₁₁)P(H₂₃)=0.0, Δ_{H₁₁H₂₄}=P(H₁₁H₂₄)-P(H₁₁)P(H₂₄)=0.0, P_D=0.30, N=M=500, T=50, for the four genetic models in table 4.

**Figure 6**
Power curves of T_H for different mutation ages at significance level α=0.01, using two haplotype blocks H₁, l=2, and H₂, r=4, when P(H₁₁)=P(H₁₂)=0.5, P(H₂₁)=P(H₂₂)=P(H₂₃)=P(H₂₄)=0.25, Δ_{H₁₁H₂₁}=P(H₁₁H₂₁)-P(H₁₁)P(H₂₁)=0.075, Δ_{H₁₁H₂₂}=P(H₁₁H₂₂)-P(H₁₁)P(H₂₂)=0.075, Δ_{H₁₁H₂₃}=P(H₁₁H₂₃)-P(H₁₁)P(H₂₃)=-0.075, Δ_{H₁₁H₂₄}=P(H₁₁H₂₄)-P(H₁₁)P(H₂₄)=-0.075, P_D=0.30, N=M=500, for the four genetic models in table 4.

**Figure 7**
Power curves of T_H for different disease frequencies at significance level α=0.01, using two haplotype blocks H₁, l=2, and H₂, r=4, when P(H₁₁)=P(H₁₂)=0.5, P(H₂₁)=P(H₂₂)=P(H₂₃)=P(H₂₄)=0.25, Δ_{H₁₁H₂₁}=P(H₁₁H₂₁)-P(H₁₁)P(H₂₁)=0.075, Δ_{H₁₁H₂₂}=P(H₁₁H₂₂)-P(H₁₁)P(H₂₂)=0.075, Δ_{H₁₁H₂₃}=P(H₁₁H₂₃)-P(H₁₁)P(H₂₃)=-0.075, Δ_{H₁₁H₂₄}=P(H₁₁H₂₄)-P(H₁₁)P(H₂₄)=-0.075, T=50, N=M=500, for the four genetic models in table 4.
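The figure captions above specify LD between the blocks through coefficients of the form Δ_{H₁₁H₂ⱼ}=P(H₁₁H₂ⱼ)-P(H₁₁)P(H₂ⱼ). As a small worked illustration, the sketch below recovers the joint two-block haplotype frequencies from the marginal frequencies and these coefficients, for the case l=2 used throughout the figures. The function name `joint_haplotype_freqs` is hypothetical; the second row follows from the marginal constraint P(H₁₂H₂ⱼ)=P(H₂ⱼ)-P(H₁₁H₂ⱼ).

```python
def joint_haplotype_freqs(p_h11, p_h2, deltas):
    """Joint frequencies P(H1i H2j) for a first block with two haplotypes.

    p_h11  -- frequency of haplotype H11 (so P(H12) = 1 - p_h11)
    p_h2   -- frequencies of the r haplotypes of the second block
    deltas -- LD coefficients Delta_{H11 H2j}, one per second-block haplotype
    """
    if abs(sum(p_h2) - 1.0) > 1e-9:
        raise ValueError("second-block frequencies must sum to 1")
    if abs(sum(deltas)) > 1e-9:
        raise ValueError("LD coefficients must sum to 0 across H2j")
    row_h11 = [p_h11 * p + d for p, d in zip(p_h2, deltas)]
    row_h12 = [p - a for p, a in zip(p_h2, row_h11)]
    if min(row_h11 + row_h12) < -1e-12:
        raise ValueError("parameters imply a negative joint frequency")
    return [row_h11, row_h12]


# Parameters taken from the Figure 4 caption (r = 4).
joint = joint_haplotype_freqs(
    0.5, [0.25, 0.25, 0.25, 0.25], [0.075, 0.075, -0.075, -0.075]
)
for row in joint:
    print([round(x, 4) for x in row])
```

With the Figure 5 parameters (all Δ equal to 0.0), every joint frequency reduces to the product of the marginals, which is the no-LD case the abstract refers to.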

References

Electronic-Database Information

1. R.F.'s Web site, http://stat.tamu.edu/~rfan/paper.html/case_control_Figs_supplement.pdf and http://stat.tamu.edu/~rfan/paper.html/case_control_powsim.pdf (for supplementary information)
