Comput Stat Data Anal. 2012 Jan 1;56(1):114-125.
doi: 10.1016/j.csda.2011.06.014.

Comparison of Methods for Identifying Phenotype Subgroups Using Categorical Features Data With Application to Autism Spectrum Disorder

Mulugeta Gebregziabher et al. Comput Stat Data Anal.

Abstract

We evaluate the performance of the Dirichlet process mixture (DPM) and the latent class model (LCM) in identifying autism phenotype subgroups based on categorical autism spectrum disorder (ASD) diagnostic features from the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, Text Revision. A simulation study is designed to mimic the diagnostic features in the ASD dataset in order to evaluate the LCM and DPM methods in this context. Likelihood-based information criteria and DPM partitioning are used to identify the best-fitting models. The Rand statistic is used to compare the performance of the methods in recovering the simulated phenotype subgroups. Our results indicate excellent recovery of the simulated subgroup structure for both methods. The LCM performs slightly better than the DPM when the correct number of latent subgroups is selected a priori. The DPM method uses a maximum a posteriori (MAP) criterion to estimate the number of classes and yields results in fair agreement with the LCM method. Comparison of model fit indices for identifying the best-fitting LCM shows that the adjusted Bayesian information criterion (ABIC) picks the correct number of classes over 90% of the time. Thus, when diagnostic features are categorical and there is some prior information regarding the number of latent classes, LCM in conjunction with ABIC is preferred.
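To make the Rand statistic mentioned above concrete, the following is a minimal sketch of the standard (unadjusted) Rand index: the fraction of subject pairs on which two partitions agree, either by placing both subjects in the same cluster or by placing them in different clusters. The function name `rand_index` and the toy label vectors are illustrative assumptions, not taken from the paper, and the abstract does not state whether the authors used this form or a chance-adjusted variant.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Unadjusted Rand index between two partitions of the same subjects.

    Hypothetical helper for illustration only; not the authors' code.
    Returns a value in [0, 1], where 1 means the partitions are identical
    up to a relabeling of the clusters.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("both label vectors must have the same length")
    n = len(labels_true)
    if n < 2:
        raise ValueError("at least two subjects are needed to form a pair")

    agreements = 0
    n_pairs = 0
    for i, j in combinations(range(n), 2):
        same_in_true = labels_true[i] == labels_true[j]
        same_in_pred = labels_pred[i] == labels_pred[j]
        # A pair counts as an agreement when both partitions treat it the
        # same way: together in both, or apart in both.
        if same_in_true == same_in_pred:
            agreements += 1
        n_pairs += 1
    return agreements / n_pairs


# Toy example (made-up labels): simulated classes vs. estimated classes.
simulated = [0, 0, 1, 1]
relabeled = [1, 1, 0, 0]   # same grouping, different class names
scrambled = [0, 1, 0, 1]   # grouping that cuts across the simulated classes

print(rand_index(simulated, relabeled))
print(rand_index(simulated, scrambled))
```

Because the index compares pairs rather than label values, it is unaffected by how the classes happen to be numbered, which is why it suits comparing estimated LCM or DPM partitions against simulated class memberships.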

Conflict of interest statement

Disclosure: None of the authors disclosed any financial or other conflicts of interest.

Figures

Figure A.5. Heat map for the five-class DPM. The darkness of the color indicates the severity of the classes: the least affected class is the lightest and the most highly affected class is the darkest. The size of each heat map box is proportional to the number of people assigned to the cluster.
Figure 1. Information criteria and deviance for the latent class analysis of the four- and five-class simulated datasets. The y-axis is the value of the information criterion and the x-axis is the number of estimated latent classes.
Figure 2. Distribution of the estimated number of classes by information criterion for the four- and five-class simulated datasets. The x-axis shows the counts of the estimated number of clusters as a function of the true simulated number of clusters (y-axis) for the different information criteria (legend).
Figure 3. Box-and-whisker plots of the Rand indices between the estimated and true (simulated) cluster partitions, computed for each LCM and DPM analysis across the four- and five-class simulated datasets. Higher values indicate better classification.
Figure 4. Heat map for the five-class LCM. The darkness of the color indicates the severity of the classes: the least affected class is the lightest and the most highly affected class is the darkest. The size of each heat map box is proportional to the number of people assigned to the cluster.

