PLoS One. 2017 Apr 19;12(4):e0176136.
doi: 10.1371/journal.pone.0176136. eCollection 2017.

Accurate phenotyping: Reconciling approaches through Bayesian model averaging

Carla Chia-Ming Chen et al. PLoS One. 2017.

Abstract

Genetic research into complex diseases is frequently hindered by a lack of clear biomarkers for phenotype ascertainment. Phenotypes for such diseases are often identified on the basis of clinically defined criteria; however, such criteria may not be suitable for understanding the genetic composition of the diseases. Various statistical approaches have been proposed for phenotype definition; however, our previous studies have shown that differences between phenotypes estimated using different approaches have a substantial impact on subsequent analyses. Instead of basing results on a single model, we propose a new method that uses Bayesian model averaging to overcome problems associated with phenotype definition. Although Bayesian model averaging has been used in other fields of research, this is the first study to use it to reconcile phenotypes obtained from multiple models. We illustrate the new method by applying it to simulated genetic and phenotypic data for Kofendred personality disorder, an imaginary disease with several sub-types. Two separate statistical methods were used to identify clusters of individuals with distinct phenotypes: latent class analysis (LCA) and grade of membership (GoM). Bayesian model averaging was then used to combine the two clusterings for the purpose of subsequent linkage analyses. We found that causative genetic loci for the disease produced higher LOD scores under model averaging than under either individual model alone. We attribute this improvement to consolidation of the cores of the phenotype clusters identified by each individual method.
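
The averaging step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the example scores, and the log marginal likelihood values are hypothetical, and the sketch assumes that each model already yields one phenotype score per individual for the same (label-aligned) cluster and that an estimate of each model's log marginal likelihood is available.

```python
import math


def model_weights(log_marginal_likelihoods, log_priors=None):
    """Turn log marginal likelihoods into posterior model probabilities.

    Assumes equal prior model probabilities unless log_priors is given.
    """
    n_models = len(log_marginal_likelihoods)
    if log_priors is None:
        log_priors = [math.log(1.0 / n_models)] * n_models
    scores = [lml + lp for lml, lp in zip(log_marginal_likelihoods, log_priors)]
    # Subtract the maximum before exponentiating to avoid underflow.
    peak = max(scores)
    unnormalised = [math.exp(s - peak) for s in scores]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


def average_phenotype(per_model_scores, weights):
    """Weighted average of each individual's phenotype score across models."""
    n_individuals = len(per_model_scores[0])
    return [
        sum(w * scores[i] for w, scores in zip(weights, per_model_scores))
        for i in range(n_individuals)
    ]


# Hypothetical posterior mean phenotype scores for three individuals.
lca_scores = [0.9, 0.2, 0.6]   # from a latent class analysis fit
gom_scores = [0.7, 0.1, 0.8]   # from a grade of membership fit

# Hypothetical log marginal likelihood estimates for the two models.
weights = model_weights([-1050.0, -1052.0])
averaged = average_phenotype([lca_scores, gom_scores], weights)
print(weights, averaged)
```

The averaged score for each individual always lies between the two model-specific scores, and it moves toward the score from whichever model has the higher posterior probability. In practice the cluster labels from the two models would need to be matched before averaging, since clustering methods number their clusters arbitrarily.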

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Fig 1. The overlapping of the traits for each of the true phenotypes.
Letters b, c, d, e, f, g and h correspond to the symptoms listed in Table 4 of Greenberg [16] (also in Table 1).
Fig 2. LOD scores of the phenotypes for each of the microsatellite markers across ten chromosomes.
P1, P2 and P3 indicate Phenotypes 1, 2 and 3. The dotted line is the LOD score of Phenotype 1 estimated using MERLIN-qtl; the dashed line is the LOD score of Phenotype 2 and the solid line is the LOD score of Phenotype 3. These scores serve as a benchmark for comparing the results of the proposed methods.
Fig 3. The characteristics of clusters derived from different statistical models.
The plots on the left show the deviance and posterior means of symptom prevalence for the LCA clusters; the plots on the right show the deviance and symptom prevalence for the GoM clusters.
Fig 4. LOD scores at each satellite marker for phenotype K1.
Fig 5. LOD scores at each satellite marker for phenotypes estimated after model averaging.
The black solid line shows the LOD scores obtained for K2 estimated using model averaging, the red dashed line shows the LOD scores for cluster 3 of LCA, and the green dotted line shows the LOD scores for the phenotype derived from GoM alone (cluster 3 of GoM).
Fig 6. Density of the estimated phenotypes K2.
The black solid line represents the distribution (over individuals) of the averaged phenotype weighted according to Laplace-Gibbs; dashed and dotted lines represent the distributions of the posterior mean of the phenotype predicted by LCA and GoM.

References

    1. Drewnowski A and Rock CL. The influence of genetic taste markers on food acceptance. Am J Clin Nutr 1995;62, 506–511.
    2. Bierut LJ, Madden PAF, Breslau N, Johnson EO, Hatsukami D, Pomerleau, et al. Novel genes identified in a high-density genome wide association study for nicotine dependence. Hum Mol Genet 2007;16, 24–35. doi: 10.1093/hmg/ddl441
    3. Hallmayer JF, Jablensky A, Michie P, Woodbury M, Salmon B, Combrinck J, et al. Linkage analysis of candidate regions using a composite neurocognitive phenotype correlated with schizophrenia. Mol Psychiatr 2003;8, 511–523. doi: 10.1038/sj.mp.4001273
    4. Nyholt DR, Gillespie NG, Heath AC, Merikangas KR, Duffy DL, and Martin NG. Latent class and genetic analysis does not support migraine with aura and migraine without aura as separate entities. Genet Epidemiol 2004;26, 231–244. doi: 10.1002/gepi.10311
    5. Corder EH and Woodbury MA. Genetic heterogeneity in Alzheimer’s disease: A grade of membership analysis. Genet Epidemiol 1993;10, 495–499. doi: 10.1002/gepi.1370100628