Learn Health Syst. 2023 Mar 28;8(1):e10362.
doi: 10.1002/lrh2.10362. eCollection 2024 Jan.

Automated generation of comparator patients in the electronic medical record


Joseph Rigdon et al. Learn Health Syst.

Abstract

Background: Well-designed randomized trials provide high-quality clinical evidence but are not always feasible or ethical. In their absence, the electronic medical record (EMR) presents a platform to conduct comparative effectiveness research, central to the emerging academic learning health system (aLHS) model. A barrier to realizing this vision is the lack of a process to efficiently generate a reference comparison group for each patient.

Objective: To test a multi-step process for the selection of comparators in the EMR.

Materials and methods: We conducted a mixed-methods study within a large aLHS in North Carolina. We (1) created a list of 35 candidate variables; (2) surveyed 270 researchers to assess the importance of candidate variables; and (3) built consensus rankings around survey-identified variables (ie, importance scores >7) across two panels of 7-8 clinical research experts. Prioritized algorithm inputs were collected from the EMR and applied using a greedy matching technique. Feasibility was measured as the percentage of patients with 100 matched comparators and performance was measured via computational time and Euclidean distance.
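The greedy matching step and the Euclidean distance metric described above can be illustrated with a minimal sketch. It is not the study's implementation, which the abstract does not detail. It assumes each patient is already encoded as a numeric vector with every covariate scaled to [0, 1], and the names `greedy_match` and `mean_match_distance` are hypothetical.

```python
import math


def greedy_match(reference, candidates, k):
    """Pick up to k candidate indices for one reference patient.

    Greedy rule (an assumption for illustration): repeatedly take the
    closest remaining candidate by Euclidean distance, breaking ties
    by the lower index.
    """
    remaining = set(range(len(candidates)))
    matches = []
    while remaining and len(matches) < k:
        best = min(
            remaining,
            key=lambda i: (math.dist(reference, candidates[i]), i),
        )
        matches.append(best)
        remaining.remove(best)
    return matches


def mean_match_distance(reference, candidates, matches):
    """Average Euclidean distance from the reference to its matches."""
    return sum(math.dist(reference, candidates[i]) for i in matches) / len(matches)


# Toy data: two hypothetical scaled covariates per patient.
reference = (0.5, 0.5)
candidates = [(0.5, 0.625), (1.0, 0.0), (0.25, 0.5), (0.5, 0.5)]

matches = greedy_match(reference, candidates, k=2)
print(matches)                                              # [3, 0]
print(mean_match_distance(reference, candidates, matches))  # 0.0625
```

In this sketch, feasibility for a patient would correspond to `len(matches) == k` (with k = 100 in the study), and performance to the value returned by `mean_match_distance`.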

Results: Nine variables were selected: age, sex, race, ethnicity, body mass index, insurance status, smoking status, Charlson Comorbidity Index, and neighborhood percentage in poverty. The final process successfully generated 100 matched comparators for each of 1.8 million candidate patients, executed in less than 100 min for the majority of strata, and had an average Euclidean distance of 0.043.

Conclusion: EMR-derived matching is feasible to implement across a diverse patient population and can provide a reproducible, efficient source of comparator data for observational studies, with additional testing in clinical research applications needed.

Keywords: comparative effectiveness research; electronic medical record; matching; retrospective observational study.

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

FIGURE 1
Computational time in minutes by the number of patients in each bucket.
FIGURE 2
Distribution of distances between the reference patient (origin), matched comparators (orange), and the rest of the population (gray). The reference patient is a white female smoker with Medicare, aged 60–65, the most common category.
FIGURE 3
Distribution of distances between the reference patient (origin), matched comparators (orange), and the rest of the population (gray). The reference patient is a white male non‐smoker with private insurance, aged <18, a category with high distance values for matches.

