Review
Front Artif Intell. 2022 Aug 12:5:842306.
doi: 10.3389/frai.2022.842306. eCollection 2022.

Phenotype clustering in health care: A narrative review for clinicians

Tyler J Loftus et al. Front Artif Intell.

Abstract

Human pathophysiology is occasionally too complex for unaided hypothetico-deductive reasoning and the isolated application of additive or linear statistical methods. Clustering algorithms use input data patterns and distributions to form groups of similar patients or diseases that share distinct properties. Although clinicians frequently perform tasks that may be enhanced by clustering, few receive formal training, and clinician-centered literature on clustering is sparse. To add value to clinical care and research, optimal clustering practices require a thorough understanding of how to process and optimize data, select features, weigh strengths and weaknesses of different clustering methods, select the optimal clustering method, and apply clustering methods to solve problems. These concepts and our suggestions for implementing them are described in this narrative review of published literature. All clustering methods share the weakness of finding potential clusters even when natural clusters do not exist, underscoring the importance of applying data-driven techniques as well as clinical and statistical expertise to clustering analyses. When applied properly, patient and disease phenotype clustering can reveal obscured associations that can help clinicians understand disease pathophysiology, predict treatment response, and identify patients for clinical trial enrollment.

Keywords: artificial intelligence; cluster; endotype; endotyping; machine learning.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Phenotype clustering in health care applies clustering algorithms to clinical data, biomarkers, or genomic data to form unique groupings that can elucidate pathophysiology, predict treatment response, or augment clinical trial enrollment.
Figure 2
Similarity of elements in clustering algorithms is inversely proportional to distance. This is often derived by applying the Pythagorean theorem to calculate Euclidean distance. We illustrate this approach in two-dimensional space, though similar calculations apply for data points of arbitrary dimensionality.
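
The distance calculation described in the Figure 2 caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `euclidean_distance` and the patient feature values are hypothetical, chosen only so the arithmetic is easy to follow. The same function works for any number of features, provided both points have the same length.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two points with the same number of features."""
    if len(a) != len(b):
        raise ValueError("points must have the same number of features")
    # Pythagorean theorem generalized to n dimensions:
    # square the per-feature differences, sum them, take the square root.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical patients described by two features (illustrative values only).
patient_a = (1.0, 2.0)
patient_b = (4.0, 6.0)
patient_c = (1.5, 2.5)

d_ab = euclidean_distance(patient_a, patient_b)  # sqrt(3^2 + 4^2) = 5.0
d_ac = euclidean_distance(patient_a, patient_c)  # sqrt(0.5^2 + 0.5^2), about 0.707

# Smaller distance means greater similarity: patient_c is closer to patient_a.
print(d_ab, d_ac, d_ac < d_ab)

# The same call handles higher-dimensional points, here three features.
print(euclidean_distance((1.0, 2.0, 2.0), (0.0, 0.0, 0.0)))  # sqrt(1 + 4 + 4) = 3.0
```

Because the distance sums squared differences across features, a feature measured on a large numeric scale can dominate the result, which is one reason features are commonly rescaled before clustering. In Python 3.8 and later, the standard library function `math.dist` performs the same calculation.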
