EBioMedicine. 2023 Jan:87:104413.
doi: 10.1016/j.ebiom.2022.104413. Epub 2022 Dec 21.

Generalisable long COVID subtypes: findings from the NIH N3C and RECOVER programmes

Free PMC article

Justin T Reese et al. EBioMedicine. 2023 Jan.

Abstract

Background: Stratification of patients with post-acute sequelae of SARS-CoV-2 infection (PASC, or long COVID) would allow precision clinical management strategies. However, long COVID is incompletely understood and characterised by a wide range of manifestations that are difficult to analyse computationally. Additionally, the generalisability of machine learning classification of COVID-19 clinical outcomes has rarely been tested.

Methods: We present a method for computationally modelling PASC phenotype data based on electronic healthcare records (EHRs) and for assessing pairwise phenotypic similarity between patients using semantic similarity. Our approach defines a nonlinear similarity function that maps from a feature space of phenotypic abnormalities to a matrix of pairwise patient similarity that can be clustered using unsupervised machine learning.
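As a rough illustration of how a semantic similarity function can turn phenotype annotations into a pairwise patient similarity matrix, the sketch below uses a Resnik-style measure (information content of the most informative common ancestor) combined with a symmetric best-match average. This is not the authors' implementation: the toy ontology, the patient data, and all function names are hypothetical, and a real analysis would use the Human Phenotype Ontology and whatever similarity measure the full paper specifies.

```python
import math

# Hypothetical toy ontology (child -> parent); a real analysis would load HPO.
PARENT = {
    "Dyspnea": "Respiratory abnormality",
    "Cough": "Respiratory abnormality",
    "Palpitations": "Cardiovascular abnormality",
    "Tachycardia": "Cardiovascular abnormality",
    "Respiratory abnormality": "Phenotypic abnormality",
    "Cardiovascular abnormality": "Phenotypic abnormality",
}

def ancestors(term):
    """Return the term together with all of its ancestors up to the root."""
    out = {term}
    while term in PARENT:
        term = PARENT[term]
        out.add(term)
    return out

def information_content(patients):
    """IC(t) = -log(fraction of patients annotated with t or a descendant)."""
    counts = {}
    for terms in patients.values():
        implied = set().union(*(ancestors(t) for t in terms))
        for t in implied:
            counts[t] = counts.get(t, 0) + 1
    n = len(patients)
    return {t: -math.log(c / n) for t, c in counts.items()}

def term_similarity(a, b, ic):
    """Resnik-style similarity: IC of the most informative common ancestor."""
    common = ancestors(a) & ancestors(b)
    return max((ic.get(t, 0.0) for t in common), default=0.0)

def patient_similarity(p, q, ic):
    """Symmetric best-match average over the two patients' term sets."""
    def one_way(xs, ys):
        return sum(max(term_similarity(x, y, ic) for y in ys) for x in xs) / len(xs)
    return 0.5 * (one_way(p, q) + one_way(q, p))

def similarity_matrix(patients, ic):
    """Return sorted patient ids and the matrix of pairwise similarities."""
    ids = sorted(patients)
    matrix = [[patient_similarity(patients[i], patients[j], ic) for j in ids]
              for i in ids]
    return ids, matrix

# Hypothetical patients, each annotated with a set of phenotype terms.
patients = {
    "p1": {"Dyspnea", "Cough"},
    "p2": {"Dyspnea"},
    "p3": {"Palpitations", "Tachycardia"},
    "p4": {"Tachycardia"},
}
ic = information_content(patients)
ids, matrix = similarity_matrix(patients, ic)
```

In this toy data the two respiratory patients score higher against each other than against the cardiovascular patients, because their only shared ancestor with the latter is the uninformative root. The resulting matrix is the kind of input that an unsupervised clustering algorithm could then partition.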

Findings: We found six clusters of PASC patients, each with distinct profiles of phenotypic abnormalities, including clusters with distinct pulmonary, neuropsychiatric, and cardiovascular abnormalities, and a cluster associated with broad, severe manifestations and increased mortality. There was significant association of cluster membership with a range of pre-existing conditions and measures of severity during acute COVID-19. We assigned new patients from other healthcare centres to clusters by maximum semantic similarity to the original patients, and showed that the clusters were generalisable across different hospital systems. The increased mortality rate originally identified in one cluster was consistently observed in patients assigned to that cluster in other hospital systems.
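The assignment of new patients to existing clusters can be sketched as follows. The abstract says only that assignment is by maximum semantic similarity to the original patients, so the rule used here (highest mean similarity to a cluster's members) is one possible reading and an assumption. The Jaccard overlap is a simple stand-in for a semantic similarity measure, and the cluster names, patient data, and function names are hypothetical.

```python
from statistics import mean

def jaccard(a, b):
    """Stand-in similarity (NOT a semantic measure): overlap of two term sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def assign_cluster(new_terms, clustered_patients, similarity=jaccard):
    """Pick the cluster whose members are, on average, most similar to the new patient.

    Returns the chosen cluster label and the per-cluster mean similarity scores.
    """
    scores = {
        cluster: mean(similarity(new_terms, terms) for terms in members)
        for cluster, members in clustered_patients.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical original clusters, each a list of patients' phenotype term sets.
clusters = {
    "pulmonary": [{"Dyspnea", "Cough"}, {"Dyspnea", "Chest pain"}],
    "neuropsychiatric": [{"Anxiety", "Insomnia"}, {"Headache", "Anxiety"}],
}

# A hypothetical patient from another healthcare centre.
label, scores = assign_cluster({"Cough", "Dyspnea", "Fatigue"}, clusters)
print(label)
```

Because the similarity function is passed in as a parameter, a semantic measure could replace the stand-in without changing the assignment logic. Outcomes such as mortality could then be compared between the original and the newly assigned members of each cluster.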

Interpretation: Semantic phenotypic clustering provides a foundation for assigning patients to stratified subgroups for natural history or therapy studies on PASC.

Funding: NIH (TR002306/OT2HL161847-01/OD011883/HG010860), U.S.D.O.E. (DE-AC02-05CH11231), Donald A. Roux Family Fund at Jackson Laboratory, Marsico Family at CU Anschutz.

Keywords: COVID-19; Human Phenotype Ontology; Long COVID; Machine learning; Precision medicine; Semantic similarity.

Conflict of interest statement

Declaration of interests: T. Bergquist received other support from the Bill and Melinda Gates Foundation. H. Davis received support from the Balvi Foundation and is a cofounder of Patient Led Research Collaborative. The other authors declare that they have no other competing interests.
