Assessment of health conditions from patient electronic health record portals vs self-reported questionnaires: an analysis of the INSPIRE study
- PMID: 40036551
- PMCID: PMC12012333
- DOI: 10.1093/jamia/ocaf027
Abstract
Objectives: Direct electronic access to multiple electronic health record (EHR) systems through patient portals offers a novel avenue for decentralized research. Given the critical value of patient characterization, we sought to compare computable evaluation of health conditions from patient-portal EHR data against traditional self-report.
Materials and methods: In the nationwide Innovative Support for Patients with SARS-CoV-2 Infections Registry (INSPIRE) study, which linked self-reported questionnaires with multiplatform patient-portal EHR data, we compared self-reported health conditions across different clinical domains against computable definitions based on diagnosis codes, medications, vital signs, and laboratory testing. We assessed their concordance using Cohen's Kappa and the prognostic significance of differentially captured features as predictors of 1-year all-cause hospitalization risk.
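As a minimal sketch of the concordance measure named above, Cohen's kappa for one binary condition can be computed as (observed agreement - chance agreement) / (1 - chance agreement). The function name and the example vectors below are hypothetical and are not taken from the INSPIRE data.

```python
import math


def cohens_kappa(source_a, source_b):
    """Cohen's kappa for two equal-length sequences of binary flags (0/1).

    Each position is one participant; 1 means the condition was captured
    by that source (e.g. EHR phenotype vs self-report).
    """
    if len(source_a) != len(source_b) or len(source_a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(source_a)
    # Observed agreement: share of participants where both sources match.
    observed = sum(x == y for x, y in zip(source_a, source_b)) / n
    # Chance agreement from each source's marginal prevalence.
    p_a = sum(source_a) / n
    p_b = sum(source_b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        # Both sources put everyone in the same category: kappa is undefined.
        return math.nan
    return (observed - expected) / (1 - expected)


# Hypothetical flags for ten participants (illustrative only).
ehr_flags = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
self_report_flags = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
kappa = cohens_kappa(ehr_flags, self_report_flags)
print(round(kappa, 3))
```

In this toy example the two sources agree on 8 of 10 participants, but about 56% agreement would be expected by chance alone, so kappa is lower than the raw agreement suggests.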
Results: Among 1683 participants (mean age 41 ± 15 years, 67% female, 63% non-Hispanic White), the prevalence of conditions varied substantially between EHR and self-report (-13.2% to +11.6% across definitions). Compared with comprehensive EHR phenotypes, self-report under-captured all conditions, including hypertension (27.9% vs 16.2%), diabetes (10.1% vs 6.2%), and heart disease (8.5% vs 4.3%). However, diagnosis codes alone were insufficient. The risk for 1-year hospitalization was better defined by the same features from patient-portal EHR (area under the receiver operating characteristic curve [AUROC] 0.79) than from self-report (AUROC 0.68).
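For readers unfamiliar with the AUROC values reported above, the following is an illustrative sketch of how the statistic can be computed from predicted risks and observed outcomes: it is the probability that a randomly chosen hospitalized participant receives a higher predicted risk than a randomly chosen non-hospitalized one. The abstract does not specify the prediction model, so the function name, risk scores, and outcomes here are hypothetical.

```python
def auroc(scores, outcomes):
    """AUROC via pairwise comparison (equivalent to the Mann-Whitney U statistic).

    scores: predicted risk per participant; outcomes: 1 if the event occurred, else 0.
    """
    positives = [s for s, y in zip(scores, outcomes) if y == 1]
    negatives = [s for s, y in zip(scores, outcomes) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0   # event case ranked above non-event case
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted risks and 1-year hospitalization outcomes.
risk_scores = [0.9, 0.4, 0.35, 0.6, 0.2, 0.1]
hospitalized = [1, 1, 0, 0, 0, 0]
example_auroc = auroc(risk_scores, hospitalized)
print(example_auroc)
```

The pairwise form is quadratic in the number of participants, which is acceptable for a small illustration; a rank-based computation would be preferable at scale.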
Discussion: EHR-derived computable phenotypes identified a higher prevalence of comorbidities than self-report, and the additionally identified features carried prognostic value. However, definitions based solely on diagnosis codes often under-captured self-reported conditions, suggesting a role for broader EHR elements.
Conclusion: In this nationwide study, patient-portal-derived EHR data enabled extensive capture of patient characteristics across multiple EHR platforms, allowing better disease phenotyping compared with self-report.
Keywords: decentralized; multicenter; patient portal; pragmatic studies.
© The Author(s) 2025. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For commercial re-use, please contact reprints@oup.com for reprints and translation rights for reprints. All other permissions can be obtained through our RightsLink service via the Permissions link on the article page on our site—for further information please contact journals.permissions@oup.com.
Conflict of interest statement
R.K. is an Associate Editor of JAMA. In addition to the funding listed above, he also receives research support, through Yale, from Bristol-Myers Squibb, Novo Nordisk, and BridgeBio. He is a coinventor of U.S. Pending Patent Applications WO2023230345A1, US20220336048A1, 63/346,610, 63/484,426, 63/508,315, 63/580,137, 63/606,203, 63/619,241, and 63/562,335, unrelated to current work. He is a co-founder of Ensight-AI, Inc. and Evidence2Health, health platforms to improve cardiovascular diagnosis and evidence-based cardiovascular care. M.S. is an Executive Associate Editor of JACC. K.N.O’L. receives funding from NIAID (R01AI166967; PI: Rodriguez, role: Co-I) and NIMH (R01MH130216). K.L.R. reports research grant funding from Abbott Diagnostics, DermTech, MeMed, Prenosis, Siemens Healthcare Diagnostics, PROCOVAXED funded by NIAID 1R01AI166967, and PREVENT funded by CDC U01CK00048 outside the submitted work. J.G.E. is Editor-in-chief of the Adult Primary Care topics at UpToDate. The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention.
Grants and funding
- R01 MH130216/MH/NIMH NIH HHS/United States
- R01HL167858/HL/NHLBI NIH HHS/United States
- R01MH130216/MH/NIMH NIH HHS/United States
- MeMed
- UL1TR002319/NH/NIH HHS/United States
- Siemens Healthcare Diagnostics
- R01AG089981/National Institute of Aging
- Abbott Diagnostics
- 1R01AI166967/AI/NIAID NIH HHS/United States
- Institute of Translational Health Sciences
- U01CK00048/CC/CDC HHS/United States
- CC/CDC HHS/United States
- University of Washington
- R01 AI166967/AI/NIAID NIH HHS/United States
- Prenosis
- R01 AG089981/AG/NIA NIH HHS/United States
- NH/NIH HHS/United States
- UL1 TR002319/TR/NCATS NIH HHS/United States
- R01AI166967/AI/NIAID NIH HHS/United States
- 2022060/DDCF/Doris Duke Charitable Foundation/United States
- DermTech
- K23 HL153775/HL/NHLBI NIH HHS/United States
