Review
J Asthma. 2022 Jul;59(7):1305-1318.
doi: 10.1080/02770903.2021.1923738. Epub 2021 May 18.

Asthma clustering methods: a literature-informed application to the children's health study data

Mindy K Ross et al. J Asthma. 2022 Jul.

Abstract

Objective: The heterogeneity of asthma has inspired widespread application of statistical clustering algorithms to a variety of datasets to identify potentially clinically meaningful phenotypes. No standardized data analysis approach exists for asthma clustering, and this lack can affect the reproducibility and clinical translation of results. Our objective was to identify common and effective data analysis practices in the asthma clustering literature and apply them to data from a Southern California population-based cohort of schoolchildren with asthma.

Methods: As of January 1, 2020, we reviewed key statistical elements of 77 asthma clustering studies. Guided by the literature, we used 12 input variables and three clustering methods (hierarchical clustering, k-medoids, and latent class analysis) to identify clusters in 598 schoolchildren with asthma from the Southern California Children's Health Study (CHS).
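
Of the three methods named above, k-medoids is the easiest to illustrate in a few lines. The following is a minimal sketch, not the authors' analysis. It assumes an alternating assign/update heuristic rather than full PAM, a range-scaled Manhattan distance (the dissimilarity measure is an assumption of this sketch), and six synthetic two-variable records that are not CHS data. The names `scaled_manhattan` and `k_medoids` are hypothetical.

```python
import random

def scaled_manhattan(a, b, ranges):
    """Mean absolute difference per variable, each scaled by that variable's range."""
    return sum(abs(x - y) / r for x, y, r in zip(a, b, ranges)) / len(a)

def k_medoids(points, k, seed=0, max_iter=100):
    """Alternating k-medoids: assign to nearest medoid, then re-pick each medoid."""
    rng = random.Random(seed)
    n = len(points)

    # Range of each input variable (fall back to 1.0 for a constant column).
    ranges = []
    for j in range(len(points[0])):
        column = [p[j] for p in points]
        ranges.append((max(column) - min(column)) or 1.0)

    # Precompute the full pairwise dissimilarity matrix.
    dist = [[scaled_manhattan(points[i], points[j], ranges) for j in range(n)]
            for i in range(n)]

    def assign(medoids):
        # Label each point with the index of its nearest medoid.
        return [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]

    medoids = rng.sample(range(n), k)  # random initial medoids
    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster is empty
                new_medoids.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance to the others.
            new_medoids.append(
                min(members, key=lambda i: sum(dist[i][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(medoids)

# Synthetic (FeNO in ppb, FEV1/FVC in %) pairs, invented for illustration only.
toy = [(12, 90), (15, 88), (10, 92), (45, 78), (50, 75), (48, 80)]
medoids, labels = k_medoids(toy, k=2)
print(medoids, labels)
```

Using an observed record as each cluster center is what lets k-medoids work from any dissimilarity matrix, which is convenient when inputs mix continuous and categorical variables.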

Results: Clusters of children identified by latent class analysis were characterized by exhaled nitric oxide, FEV1/FVC, FEV1 percent predicted, asthma control, and allergy score, and were predictive of asthma control at two-year follow-up. Clusters from the other two methods were less clinically remarkable: they were differentiated primarily by sex and race/ethnicity and were less predictive of asthma control over time.

Conclusion: Our review of the asthma phenotyping literature identified common approaches to data clustering. When these elements were applied to the Children's Health Study data, the latent class analysis clusters, which were characterized by exhaled nitric oxide and spirometry measures, had clinical relevance over time.

Conflict of interest statement

Declaration of Interest: The authors report no conflicts of interest. The authors alone are responsible for the content and writing of the paper.

Figures

Figure 1. Article selection for the literature review, displayed using a PRISMA flow diagram.
Figure 2. Characterization of the clusters resulting from each of three clustering methods (HCLUST: hierarchical clustering, k-medoids, and LCA: latent class analysis), using heatmaps to represent the cluster-specific mean values* of input variables.† Darker values indicate larger means. Input variables (rows) are ordered by p-values for differences across clusters. Clusters (columns) are labeled with the number of participants in that cluster.
* Complementary numerical summaries are presented in Table E4.
† Input variables: Male is a binary indicator for sex (male). For simple visual presentation, a binary indicator of Hispanic is shown here rather than the nominal 3-level race/ethnicity variable (Hispanic, non-Hispanic White, Other) that was input to the clustering algorithms. BMI Category is the BMI percentile category, SHS is secondhand smoke, and % pred FEV1 is percent predicted FEV1.
Figure 3. Representation of the four clusters identified in the Children's Health Study (CHS) by Latent Class Analysis (LCA), using a flow chart with decision points informed by a simplified classification (decision) tree. This simple flow chart did not exactly reproduce the clusters, but did yield an 81.7% classification accuracy within the 30% holdout test dataset.*
* Modification of this flow chart to use decision points based on the clinically relevant values of FEV1/FVC ≥ 85, FeNO < 25, and FEV1 ≤ 80% predicted yielded a classification accuracy of 73.3% in the test dataset.
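
The holdout evaluation described for Figure 3 can be sketched as follows. This is an illustration under stated assumptions, not the published flow chart. The three cutoffs (FEV1/FVC ≥ 85, FeNO < 25, FEV1 ≤ 80% predicted) come from the caption, but the order of the decision points, the cluster labels "A" to "D", the field names, and the four records are all hypothetical and synthetic.

```python
import random

def split_holdout(records, test_fraction=0.30, seed=0):
    """Shuffle a copy of the records and hold out a fraction as the test set."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)

def rule_based_cluster(fev1_fvc, feno, fev1_pct_pred):
    """HYPOTHETICAL flow chart: the cutoffs are from the caption, while the
    ordering of the decisions and the labels are invented for this sketch."""
    if fev1_fvc >= 85:
        return "A" if feno < 25 else "B"
    return "C" if fev1_pct_pred <= 80 else "D"

def accuracy(records):
    """Share of records whose rule-based label matches the reference cluster."""
    hits = sum(
        rule_based_cluster(r["fev1_fvc"], r["feno"], r["fev1_pct"]) == r["cluster"]
        for r in records
    )
    return hits / len(records)

# Synthetic records; "cluster" stands in for the cluster assigned by LCA.
toy_test_set = [
    {"fev1_fvc": 90, "feno": 10, "fev1_pct": 95, "cluster": "A"},
    {"fev1_fvc": 90, "feno": 30, "fev1_pct": 95, "cluster": "B"},
    {"fev1_fvc": 80, "feno": 10, "fev1_pct": 75, "cluster": "C"},
    {"fev1_fvc": 80, "feno": 10, "fev1_pct": 90, "cluster": "C"},  # rule says "D"
]
print(accuracy(toy_test_set))
```

Scoring the rules only on held-out records, as the caption describes, keeps the accuracy estimate from being inflated by the data used to derive the decision points.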
