Electronic medical records for discovery research in rheumatoid arthritis
- PMID: 20235204
- PMCID: PMC3121049
- DOI: 10.1002/acr.20184
Abstract
Objective: Electronic medical records (EMRs) are a rich data source for discovery research but are underutilized due to the difficulty of extracting highly accurate clinical data. We assessed whether a classification algorithm incorporating narrative EMR data (typed physician notes) more accurately classifies subjects with rheumatoid arthritis (RA) compared with an algorithm using codified EMR data alone.
Methods: Subjects with ≥1 International Classification of Diseases, Ninth Revision RA code (714.xx) or who had anti-cyclic citrullinated peptide (anti-CCP) checked in the EMR of 2 large academic centers were included in an "RA Mart" (n = 29,432). For all 29,432 subjects, we extracted narrative (using natural language processing) and codified RA clinical information. In a training set of 96 RA and 404 non-RA cases from the RA Mart classified by medical record review, we used narrative and codified data to develop classification algorithms using logistic regression. These algorithms were applied to the entire RA Mart. We calculated and compared the positive predictive value (PPV) of these algorithms by reviewing the records of an additional 400 subjects classified as having RA by the algorithms.
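The classification step described above can be illustrated with a minimal sketch. This is not the authors' model: the two binary features (`has_icd9_ra_code` for codified data, `nlp_ra_mention` for narrative data), the toy training rows, and the learning-rate settings are all hypothetical and chosen only to show how logistic regression combines both kinds of EMR variables into a single probability of RA.

```python
import math


def sigmoid(z):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression by batch gradient descent (illustrative only)."""
    n, k = len(X), len(X[0])
    w, b = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(k):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Predicted probability that a subject with features x has RA."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Hypothetical toy training set, NOT data from the study.
# Columns: [has_icd9_ra_code, nlp_ra_mention]; label 1 = RA on chart review.
X_train = [[1, 1], [1, 1], [1, 1], [1, 0], [1, 0],
           [0, 1], [0, 1], [0, 0], [0, 0], [0, 0]]
y_train = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]

w, b = fit_logistic(X_train, y_train)
p_both = predict_proba(w, b, [1, 1])     # codified and narrative evidence
p_neither = predict_proba(w, b, [0, 0])  # no evidence of either kind
print(f"P(RA | both signals) = {p_both:.2f}")
print(f"P(RA | no signals)   = {p_neither:.2f}")
```

In this toy setting, a subject with both a code and a narrative mention receives a higher predicted probability than a subject with neither, which is the intuition behind combining the two data sources.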
Results: A complete algorithm (narrative and codified data) classified RA subjects with a significantly higher PPV of 94% than an algorithm with codified data alone (PPV of 88%). Characteristics of the RA cohort identified by the complete algorithm were comparable to existing RA cohorts (80% women, 63% anti-CCP positive, and 59% positive for erosions).
Conclusion: We demonstrate the ability to utilize complete EMR data to define an RA cohort with a PPV of 94%, which was superior to an algorithm using codified data alone.
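As a worked illustration of the metric reported above, PPV is the fraction of algorithm-classified RA subjects who are confirmed as RA on record review. The sketch below assumes hypothetical review counts (94 of 100 and 88 of 100) chosen only to reproduce the reported percentages; the abstract does not give per-algorithm counts. The Wilson score interval is likewise an assumption added for illustration, not a method named in the source.

```python
import math


def ppv(true_pos, false_pos):
    """Positive predictive value: confirmed cases / all classified as cases."""
    return true_pos / (true_pos + false_pos)


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (about 95% when z = 1.96)."""
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Hypothetical counts, NOT taken from the study.
complete_tp, complete_fp = 94, 6    # complete algorithm (narrative + codified)
codified_tp, codified_fp = 88, 12   # codified-only algorithm

for name, tp, fp in [("complete", complete_tp, complete_fp),
                     ("codified only", codified_tp, codified_fp)]:
    lo, hi = wilson_interval(tp, tp + fp)
    print(f"{name}: PPV = {ppv(tp, fp):.2f} (interval {lo:.3f} to {hi:.3f})")
```

The interval width depends on how many records were reviewed per algorithm, so the printed intervals apply only to these illustrative counts.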
Grants and funding
- R01-AR057108/AR/NIAMS NIH HHS/United States
- UL1-RR02578-01/RR/NCRR NIH HHS/United States
- T32-AR055885/AR/NIAMS NIH HHS/United States
- K08-AR-055688-01A1/AR/NIAMS NIH HHS/United States
- T32 AR055885/AR/NIAMS NIH HHS/United States
- U54LM008748/LM/NLM NIH HHS/United States
- R01-LM009966/LM/NLM NIH HHS/United States
- R21-NR0101710-01/NR/NINR NIH HHS/United States
- U54-LM00878/LM/NLM NIH HHS/United States
- R01-DK075837/DK/NIDDK NIH HHS/United States
- R01-AR049880/AR/NIAMS NIH HHS/United States
- K24-AR0524-01/AR/NIAMS NIH HHS/United States
- P60 AR047782/AR/NIAMS NIH HHS/United States
- U01 GM092691/GM/NIGMS NIH HHS/United States
- K24 AR052403/AR/NIAMS NIH HHS/United States
- R21 NS067463/NS/NINDS NIH HHS/United States
- R01 LM007222/LM/NLM NIH HHS/United States
- R01 DK075837/DK/NIDDK NIH HHS/United States
- U54 LM008748/LM/NLM NIH HHS/United States
- R01-LM007222/LM/NLM NIH HHS/United States
- R01-HL091495-01A1/HL/NHLBI NIH HHS/United States
- R01-AR056768/AR/NIAMS NIH HHS/United States
- R01 AR049880/AR/NIAMS NIH HHS/United States
- U54-LM008748/LM/NLM NIH HHS/United States
- R01 HL091495/HL/NHLBI NIH HHS/United States
- R21-NS067463/NS/NINDS NIH HHS/United States
- R01 LM009966/LM/NLM NIH HHS/United States
- K08 AR055688/AR/NIAMS NIH HHS/United States
- P60-AR047782/AR/NIAMS NIH HHS/United States