Biostatistics. 2008 Oct;9(4):635-57.
doi: 10.1093/biostatistics/kxm055. Epub 2008 Mar 14.

A likelihood-based approach to mixed modeling with ambiguity in cluster identifiers

Andrea S Foulkes et al. Biostatistics. 2008 Oct.

Abstract

This manuscript describes a novel, linear mixed-effects model-fitting technique for the setting in which correlated data indicators are not completely observed. Mixed modeling is a useful analytical tool for characterizing genotype-phenotype associations among multiple potentially informative genetic loci. This approach involves grouping individuals into genetic clusters, where individuals in the same cluster have similar or identical multilocus genotypes. In haplotype-based investigations of unrelated individuals, corresponding cluster assignments are unobservable since the alignment of alleles within chromosomal copies is not generally observed. We derive an expectation conditional maximization approach to estimation in the mixed modeling setting, where cluster assignments are ambiguous. The approach has broad relevance to the analysis of data with missing correlated data identifiers. An example is provided based on data arising from a cohort of human immunodeficiency virus type-1-infected individuals at risk for antiretroviral therapy-associated dyslipidemia.
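The phase ambiguity described in the abstract can be illustrated with a minimal sketch for two biallelic SNPs. The function name `compatible_diplotypes` and the two-character genotype strings are assumptions made for illustration; they do not come from the paper, and the sketch only enumerates the haplotype pairs consistent with an unphased genotype, not the authors' estimation procedure.

```python
from itertools import product


def compatible_diplotypes(genotype_1, genotype_2):
    """Return the unordered haplotype pairs consistent with two unphased SNP genotypes.

    Each genotype is a two-character string such as "AA", "Aa", or "aa"
    (hypothetical encoding, chosen for illustration).
    """
    diplotypes = set()
    # Each SNP's two alleles can sit on the chromosomal copies in either order.
    orders_1 = [(genotype_1[0], genotype_1[1]), (genotype_1[1], genotype_1[0])]
    orders_2 = [(genotype_2[0], genotype_2[1]), (genotype_2[1], genotype_2[0])]
    for (x1, x2), (y1, y2) in product(orders_1, orders_2):
        # Copy 1 carries x1 and y1; copy 2 carries x2 and y2.
        # Sorting makes the pair unordered, so AB/ab and ab/AB count once.
        diplotypes.add(tuple(sorted([x1 + y1, x2 + y2])))
    return diplotypes


# A double heterozygote is consistent with two different diplotypes,
# so its cluster assignment is ambiguous.
ambiguous = compatible_diplotypes("Aa", "Bb")

# With at most one heterozygous site, the phase is determined.
unambiguous = compatible_diplotypes("AA", "Bb")
```

In this sketch, only individuals heterozygous at both sites have more than one compatible diplotype, which is the situation in which cluster membership cannot be read directly from the observed genotypes.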

Keywords: Expectation conditional maximization; Genotype; HIV-1; Haplotype; Lipids; Missing identifiers; Mixed-effects models; Phenotype; Population-based genetic association studies.

Figures

Fig. 1.
Sample approaches to defining clusters. For the 2-SNP example in which the observed genotypes are (AA, Aa, or aa) and (BB, Bb, or bb), there are 4 possible haplotypes (AB, Ab, aB, and ab) and 10 possible diplotypes. The most general approach to defining clusters results in 10 clusters, one for each possible combination of 2 haplotypes. These are indicated by shaded rectangles. An alternative approach groups all diplotypes with at least one copy of the rare ab haplotype into a single cluster. This is indicated by the dashed rectangle that combines 4 of the previously defined clusters into a single cluster. In this case, there are a total of 7 clusters.
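The counts in the Fig. 1 caption can be checked by enumeration. The following is a minimal sketch; the names `cluster_label`, `general_clusters`, and `grouped_clusters` are hypothetical and chosen only for illustration.

```python
from itertools import combinations_with_replacement

# The 4 possible haplotypes for two biallelic SNPs.
haplotypes = ["AB", "Ab", "aB", "ab"]

# A diplotype is an unordered pair of haplotypes, repeats allowed.
diplotypes = list(combinations_with_replacement(haplotypes, 2))


def cluster_label(diplotype, rare="ab"):
    """Label a diplotype, pooling every carrier of the rare haplotype."""
    if rare in diplotype:
        return "carrier of " + rare
    return "/".join(diplotype)


# Most general definition: one cluster per diplotype.
general_clusters = {"/".join(d) for d in diplotypes}

# Alternative definition: all diplotypes with at least one copy of ab share a cluster.
grouped_clusters = {cluster_label(d) for d in diplotypes}

# Diplotypes that are merged under the alternative definition.
merged = [d for d in diplotypes if "ab" in d]
```

Here `len(general_clusters)` is 10, `len(merged)` is 4, and `len(grouped_clusters)` is 7, matching the 6 non-carrier clusters plus the single pooled carrier cluster described in the caption.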
Fig. 2.
Empirical Bayes predictions of random EL cluster effects. Asterisk indicates cluster membership ambiguity.
Fig. 3.
Performance and sensitivity of the mixed modeling approach. (a) Power for detecting haplotype effect variability by number of clusters (n = 200). (b) Power under dominant and recessive founder models (n = 200). For the recessive model, population frequencies of the founder haplotype (Hd) equal to (1) 0.20 and (2) 0.40 are considered. For the dominant model, a frequency of 0.20 is illustrated. (c) Power for the mixed model and ANOVA approaches (n = 200, σbϵ = 0.40).

