J Am Heart Assoc. 2020 Mar 3;9(5):e014527.
doi: 10.1161/JAHA.119.014527. Epub 2020 Feb 26.

Impact of Different Electronic Cohort Definitions to Identify Patients With Atrial Fibrillation From the Electronic Medical Record

Rashmee U Shah et al. J Am Heart Assoc.

Abstract

Background: Electronic medical records (EMRs) allow identification of disease-specific patient populations, but varying electronic cohort definitions could result in different populations. We compared the characteristics of an EMR-derived atrial fibrillation (AF) patient population using 5 different electronic cohort definitions.

Methods and Results: Adult patients with at least 1 AF billing code from January 1, 2010, to December 31, 2017, were included. Based on different electronic cohort definitions, we trained 5 different logistic regression models using a labeled training data set (n=786). Each model yielded a predicted probability; patients were classified as having AF if the probability was higher than a specified cut point. Test characteristics were calculated for each model. These models were then applied to the full cohort and resulting characteristics were compared. In the training set, the comprehensive model (including demographics, billing codes, and natural language processing results) performed best, with an area under the curve of 0.89, sensitivity of 0.90, and specificity of 0.87. Among a candidate population (n=22 000), the proportion of patients identified as having AF varied from 61% in the model using diagnosis or procedure International Classification of Diseases (ICD) billing codes to 83% in the model using natural language processing of clinical notes. Among identified AF patients, the proportion of patients with a CHA2DS2-VASc score ≥2 varied from 69% to 85%; oral anticoagulant treatment rates varied from 50% to 66% depending on the model.

Conclusions: Different electronic cohort definitions result in substantially different AF study samples. This difference threatens the quality and reproducibility of electronic medical record-based research and quality initiatives.
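The cut-point classification described in the Methods and Results can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the cut point of 0.5, and the toy probabilities and labels are all hypothetical, chosen only to show how a predicted probability becomes an AF/non-AF label and how sensitivity and specificity follow from the resulting counts.

```python
from typing import List, Sequence, Tuple


def classify(probabilities: Sequence[float], cut_point: float) -> List[bool]:
    """Label a patient as AF when the predicted probability exceeds the cut point."""
    return [p > cut_point for p in probabilities]


def sensitivity_specificity(
    predicted: Sequence[bool], actual: Sequence[bool]
) -> Tuple[float, float]:
    """Compute sensitivity and specificity from predicted and reference labels."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one true AF and one non-AF patient")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data, not taken from the study.
toy_probabilities = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1]
toy_labels = [True, True, True, False, False, False]

toy_predicted = classify(toy_probabilities, cut_point=0.5)
toy_sens, toy_spec = sensitivity_specificity(toy_predicted, toy_labels)
print(f"sensitivity={toy_sens:.2f}, specificity={toy_spec:.2f}")
```

Raising the cut point in this sketch trades sensitivity for specificity, which is one reason different cohort definitions can select different patient populations.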

Keywords: atrial fibrillation; electronic health records; health services research; informatics; quality of care.

Figures

Figure 1
Receiver operating characteristic curves for different models to identify atrial fibrillation patients using the electronic medical record. In the training set (n=786), the AUC was highest for the comprehensive model and lowest for the Medicare model. AUC indicates area under the receiver operating characteristic curve; ICD, International Classification of Diseases; NLP, natural language processing; Sens, sensitivity; Spec, specificity.
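As a rough illustration of the AUC reported in Figure 1, the sketch below computes it as the probability that a randomly chosen AF patient receives a higher predicted probability than a randomly chosen non-AF patient, counting ties as one half. This pairwise formulation is a standard equivalent of the area under the ROC curve; the function name and the toy data are hypothetical and are not drawn from the study.

```python
from typing import Sequence


def pairwise_auc(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0   # positive ranked above negative
            elif pos == neg:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data, not taken from the study.
example_scores = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1]
example_labels = [True, True, True, False, False, False]
example_auc = pairwise_auc(example_scores, example_labels)
print(f"AUC={example_auc:.3f}")
```

The double loop is quadratic in the number of patients, which is fine for a small labeled set but would usually be replaced by a rank-based computation for larger cohorts.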
Figure 2
Proportion of correct, false‐positive, and false‐negative classifications for each model in the training set. In the training set (n=786), the NLP model resulted in the highest number of correctly classified patients, at the expense of a high false‐positive rate. The outpatient billing codes and ECG method had the lowest number of correctly classified patients and the highest number of false negatives. AF indicates atrial fibrillation; ICD, International Classification of Diseases; NLP, natural language processing.
Figure 3
Proportion of patients included with CHA2DS2‐VASc score ≥2 and treated with an OAC for each model. When applied to the candidate population, different patient‐selection models resulted in populations with different sizes, stroke risks, and OAC treatment rates. The corresponding values are found in Table 2. “Outpatient AF codes, ECG” refers to the method used in prior publications from Kaiser Permanente. AF indicates atrial fibrillation; ICD, International Classification of Diseases; NLP, natural language processing; OAC, oral anticoagulant.
