Observational Study
PLoS One. 2025 May 16;20(5):e0324017.
doi: 10.1371/journal.pone.0324017. eCollection 2025.

Heterogeneity of diagnosis and documentation of post-COVID conditions in primary care: A machine learning analysis

Nathaniel Hendrix et al. PLoS One. 2025.

Abstract

Background: Post-COVID conditions (PCC) have proven difficult to diagnose. In this retrospective observational study, we aimed to characterize the level of variation in PCC diagnoses observed across clinicians from a number of methodological angles and to determine whether natural language classifiers trained on clinical notes can reconcile differences in diagnostic definitions.

Methods: We used data from 519 primary care clinics around the United States that were in the American Family Cohort registry between October 1, 2021 (when the ICD-10 code for PCC was activated) and November 1, 2023. There were 6,116 patients with a diagnostic code for PCC (U09.9), and 5,020 with diagnostic codes for both PCC and COVID-19. We explored these data using four outcomes: 1) time between COVID-19 and PCC diagnostic codes; 2) count of patients with PCC diagnostic codes per clinician; 3) patient-specific probability of a PCC diagnostic code based on patient and clinician characteristics; and 4) performance of a natural language classifier trained on notes from 5,000 patients annotated by two physicians to indicate probable PCC.
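The first outcome, time between the COVID-19 and PCC diagnostic codes, can be illustrated with a minimal sketch. The function name `share_within_weeks`, the input layout (pairs of dates), and the example dates are all hypothetical and are not taken from the study data or the authors' code; the sketch only shows one way such a share could be computed.

```python
from datetime import date


def share_within_weeks(pairs, weeks=12):
    """Return the fraction of patients whose PCC code was recorded fewer than
    `weeks` weeks after their first recorded COVID-19 code.

    `pairs` is a list of (covid_date, pcc_date) tuples. This layout is an
    assumption for illustration, not the registry's actual schema.
    """
    if not pairs:
        raise ValueError("pairs must not be empty")
    threshold_days = weeks * 7
    # Gap in days from the COVID-19 code to the PCC code for each patient.
    gaps = [(pcc - covid).days for covid, pcc in pairs]
    # Strictly fewer than the threshold, matching "less than 12 weeks".
    early = sum(1 for gap in gaps if gap < threshold_days)
    return early / len(gaps)


# Made-up example dates (not study data).
example_pairs = [
    (date(2022, 1, 10), date(2022, 2, 1)),   # 22 days
    (date(2022, 3, 5), date(2022, 7, 1)),    # 118 days
    (date(2022, 6, 1), date(2022, 6, 20)),   # 19 days
    (date(2022, 9, 1), date(2022, 11, 24)),  # 84 days, exactly 12 weeks
]

print(share_within_weeks(example_pairs))
```

In this sketch a gap of exactly 84 days is not counted as early, and a PCC code recorded before the COVID-19 code would yield a negative gap that is counted as early; how the study handled either case is not stated in the abstract.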

Results: Of patients with diagnostic codes for both PCC and COVID-19, 61.3% were diagnosed with PCC fewer than 12 weeks after the initial recorded COVID-19 diagnosis. Clinicians in the top 1% of diagnostic propensity accounted for more than a third of all PCC diagnoses (35.8%). Comparing LASSO logistic regressions predicting documentation of PCC diagnosis, a log-likelihood test showed significantly better fit when clinician and practice site indicators were included (p < 0.0001). Inter-rater agreement between physician annotators on PCC diagnosis was moderate (Cohen's kappa: 0.60), and performance of the natural language classifiers was marginal (best AUC: 0.724, 95% credible interval: 0.555-0.878).
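For readers unfamiliar with the agreement statistic reported above, Cohen's kappa compares the observed agreement between two raters with the agreement expected by chance from each rater's label frequencies: kappa = (p_o - p_e) / (1 - p_e). The following is a minimal sketch of that standard formula; the function name and the toy labels are hypothetical, and the labels are invented for illustration rather than drawn from the physicians' annotations.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies,
    # summed over every label either rater used.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    labels = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label; kappa is undefined,
        # and this sketch returns 1.0 by convention.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy annotations (1 = probable PCC, 0 = not), invented for illustration.
annotator_1 = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
annotator_2 = [1, 1, 0, 0, 0, 0, 1, 1, 1, 0]

print(round(cohens_kappa(annotator_1, annotator_2), 2))
```

In the toy data the raters agree on 8 of 10 items while chance agreement is 0.5, which gives a kappa of about 0.6. This shows how 80% raw agreement can still correspond to only moderate agreement once chance is accounted for.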

Conclusion: We found evidence of substantial disagreement between clinicians on diagnostic criteria for PCC. The variation in diagnostic rates across clinicians points to the possibility of both under- and over-diagnosis for patients.
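The concentration of diagnoses among a small group of clinicians can be summarized as the share of all diagnoses attributable to the top fraction of clinicians. The sketch below is one hypothetical way to compute such a share. It ranks clinicians by raw diagnosis count, which is an assumption: the abstract ranks by "diagnostic propensity", which may be defined differently. The function name and the counts are invented for illustration.

```python
import math


def top_share(counts, top_fraction=0.01):
    """Share of all diagnoses made by the top `top_fraction` of clinicians.

    `counts` holds one diagnosis count per clinician. Ranking by raw count
    is an assumption of this sketch, not necessarily the study's definition.
    """
    if not counts:
        raise ValueError("counts must not be empty")
    total = sum(counts)
    if total == 0:
        raise ValueError("no diagnoses recorded")
    ranked = sorted(counts, reverse=True)
    # Always include at least one clinician in the top group.
    k = max(1, math.ceil(len(ranked) * top_fraction))
    return sum(ranked[:k]) / total


# Invented counts for 100 clinicians: one heavy diagnoser, a few moderate
# ones, and many with one or zero diagnoses.
example_counts = [40] + [3] * 9 + [1] * 33 + [0] * 57

print(top_share(example_counts))
```

With these invented counts, the single highest-count clinician (the top 1% of 100) accounts for 40 of 100 diagnoses.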

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Data selection for the four included analyses. Samples were not mutually exclusive.
Fig 2. Diagnoses made for each of the 973 (out of 3,845 total) clinicians with at least one PCC diagnosis made from October 1, 2021 to November 1, 2023.
Fig 3. Receiver operating characteristic (ROC) curve for two LASSO logistic regression models of documentation of PCC diagnosis. The full model includes indicators for clinician and practice site, while the simple model excludes these.
