2022 Jan 25;3(1):e12660.
doi: 10.1002/emp2.12660. eCollection 2022 Feb.

Pediatric sepsis phenotypes for enhanced therapeutics: An application of clustering to electronic health records

Ioannis Koutroulis et al. J Am Coll Emerg Physicians Open.

Abstract

Objective: The heterogeneity of pediatric sepsis patients suggests that clustering analytics could derive phenotypes with distinct host response patterns that may help guide personalized therapeutics. We evaluate the relative performance of latent class analysis (LCA) and K-means, 2 commonly used clustering methods, in deriving clinically useful pediatric sepsis phenotypes.

Methods: Data were extracted from anonymized medical records of 6446 pediatric patients who presented to 1 of 6 emergency departments (EDs) between 2013 and 2018 and were thereafter admitted. Using International Classification of Diseases (ICD)-9 and ICD-10 discharge codes, 151 patients were identified with a sepsis continuum diagnosis that included septicemia, sepsis, severe sepsis, and septic shock. Using 2 feature sets drawn from related clustering studies, the LCA and K-means algorithms were each applied to derive a total of 4 distinct phenotypic pediatric sepsis segmentations. Each segmentation was evaluated for phenotypic homogeneity, separation, and clinical use.
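To make the K-means step concrete, the following is a minimal sketch of Lloyd's algorithm, the standard K-means procedure of alternating point assignment and centroid updates. It is not the authors' implementation: the abstract does not name the software or the feature sets, so the function name `kmeans`, its parameters, and the toy 2-feature data are all hypothetical and chosen only for illustration.

```python
import random


def kmeans(points, k, iters=100, seed=0):
    """Cluster `points` (a list of equal-length numeric tuples) into k groups.

    Hypothetical helper for illustration; returns (labels, centroids).
    """
    rng = random.Random(seed)
    # Initialize centroids from k distinct randomly chosen observations.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance).
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy data (assumed, not from the study): two well-separated groups
# of observations with 2 standardized features each.
toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(toy, k=2)
```

K-means makes a hard assignment of each patient to exactly one cluster and assumes numeric features on comparable scales, whereas LCA is model-based and yields class membership probabilities; this difference is what allows the entropy-based separation measure reported for LCA.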

Results: Using the 2 feature sets, LCA clustering resulted in 2 similar segmentations of 4 clinically distinct phenotypes, while K-means clustering resulted in segmentations of 3 and 4 phenotypes. All 4 segmentations identified at least 1 high-severity phenotype, but the LCA-identified phenotypes showed superior stratification: entropy approaching 1 (eg, 0.994), indicating excellent separation between estimated phenotypes, and differential treatment, treatment response, and outcomes that were non-randomly distributed across phenotypes (P < 0.001).
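The entropy statistic reported for LCA can be illustrated with a short sketch. It assumes the commonly used normalized (relative) entropy, E = 1 - [sum over patients i and classes k of -p_ik ln p_ik] / (N ln K), where p_ik is the posterior probability that patient i belongs to class k; the abstract does not state which formula the authors used, so this is an assumption. The function name `relative_entropy` and the example probabilities are hypothetical.

```python
import math


def relative_entropy(posteriors):
    """Normalized entropy of an LCA solution.

    `posteriors` is a list of rows, one per patient, each holding that
    patient's posterior class membership probabilities (rows sum to 1).
    Values near 1 mean patients are assigned to classes with little
    ambiguity; values near 0 mean assignments are close to uniform.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    if k < 2:
        raise ValueError("entropy is undefined for fewer than 2 classes")
    total = 0.0
    for row in posteriors:
        for p in row:
            if p > 0:  # treat 0 * ln(0) as 0
                total -= p * math.log(p)
    return 1.0 - total / (n * math.log(k))


# Hypothetical posteriors for 3 patients and 4 classes (not study data).
example = [
    [0.97, 0.01, 0.01, 0.01],
    [0.02, 0.96, 0.01, 0.01],
    [0.01, 0.01, 0.01, 0.97],
]
score = relative_entropy(example)
```

Under this definition, a perfectly crisp assignment (each patient with probability 1 in a single class) gives exactly 1, and a uniform assignment gives 0, so a value such as 0.994 corresponds to almost no overlap between estimated phenotypes.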

Conclusion: Compared with K-means, which is commonly used in clustering studies, LCA appears to be a more robust, clinically useful statistical tool for analyzing a heterogeneous pediatric sepsis cohort to inform targeted therapies. Additional prospective studies are needed to validate the clinical utility of predictive models that target derived pediatric sepsis phenotypes in emergency department settings.

Keywords: K‐means; LCA; phenotypes; sepsis.

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

FIGURE 1
Distribution of Williams features across phenotypes. Red, phenotype 1 (critical severity); green, phenotype 2 (least severe); blue, phenotype 3 (moderate severity); and purple, phenotype 4 (high severity)
FIGURE 2
Relationship between power and sample size for LCA
FIGURE 3
Relationship between power and sample size for K‐means
