BMC Med Res Methodol. 2021 May 1;21(1):95.
doi: 10.1186/s12874-021-01285-y.

Intra-database validation of case-identifying algorithms using reconstituted electronic health records from healthcare claims data

Nicolas H Thurin et al. BMC Med Res Methodol. 2021.

Abstract

Background: The diagnostic performance of case-identifying algorithms developed in healthcare databases is usually assessed by comparing the identified cases with an external data source. When this is not feasible, intra-database validation can be an appropriate alternative.

Objectives: To illustrate through two practical examples how to perform intra-database validations of case-identifying algorithms using reconstituted Electronic Health Records (rEHRs).

Methods: Patients with 1) multiple sclerosis (MS) relapses and 2) metastatic castration-resistant prostate cancer (mCRPC) were identified in the French nationwide healthcare database (SNDS) using two case-identifying algorithms. A validation study was then conducted to estimate the diagnostic performance of these algorithms by calculating their positive predictive value (PPV) and negative predictive value (NPV). To that end, anonymized rEHRs were generated from all the information captured in the SNDS over time (e.g. procedures, hospital stays, drug dispensing, medical visits) for a random selection of patients identified as cases or non-cases by the predefined algorithms. For each disease, an independent validation committee reviewed the rEHRs of 100 cases and 100 non-cases to adjudicate the status of the selected patients (true case or true non-case), blinded to the result of the corresponding algorithm.
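The selection and blinding step described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `draw_blinded_sample`, the opaque review identifiers, and the fixed random seed are all assumptions introduced here. It draws an equal number of algorithm-identified cases and non-cases, shuffles them, and keeps the algorithm result in a separate key that reviewers do not see.

```python
import random


def draw_blinded_sample(algorithm_labels, n_per_group=100, seed=0):
    """Pick n cases and n non-cases at random and hide the algorithm result.

    algorithm_labels: dict mapping patient id -> True (case) / False (non-case).
    Returns (review_ids, key). review_ids is the shuffled list of opaque ids
    handed to the committee; key maps each review id to
    (patient id, algorithm label) and is withheld until adjudication is done.
    All names here are hypothetical.
    """
    rng = random.Random(seed)
    # Sort first so the draw is reproducible regardless of dict ordering.
    cases = sorted(p for p, is_case in algorithm_labels.items() if is_case)
    non_cases = sorted(p for p, is_case in algorithm_labels.items() if not is_case)
    if len(cases) < n_per_group or len(non_cases) < n_per_group:
        raise ValueError("not enough patients in one of the two groups")

    selected = rng.sample(cases, n_per_group) + rng.sample(non_cases, n_per_group)
    rng.shuffle(selected)  # mix cases and non-cases so order reveals nothing

    key = {
        f"R{i:04d}": (patient_id, algorithm_labels[patient_id])
        for i, patient_id in enumerate(selected, start=1)
    }
    return list(key), key
```

Keeping the key apart from the review list is one simple way to ensure that the committee's judgement cannot be influenced by the algorithm's classification.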

Results: The algorithm for identifying MS relapses showed a PPV of 95% and an NPV of 100%. The algorithm for identifying mCRPC showed a PPV of 97% and an NPV of 99%.
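As a worked illustration of how such figures are obtained, the sketch below computes a predictive value as the share of reviewed records that the committee confirmed. It assumes that all 100 records in each group were adjudicated, so that a 95% PPV corresponds to 95 confirmed cases out of 100; the abstract does not state the counts. The Wilson score interval is added here only to show the sampling uncertainty at this sample size; it is an assumption of this example and is not reported in the abstract.

```python
import math


def predictive_value(n_confirmed, n_reviewed, z=1.96):
    """Return (proportion, lower, upper) with a Wilson score interval.

    For PPV: n_confirmed = algorithm cases judged true cases by the committee.
    For NPV: n_confirmed = algorithm non-cases judged true non-cases.
    z = 1.96 gives an approximate 95% interval.
    """
    if n_reviewed <= 0 or not 0 <= n_confirmed <= n_reviewed:
        raise ValueError("counts must satisfy 0 <= n_confirmed <= n_reviewed")
    p = n_confirmed / n_reviewed
    denom = 1 + z ** 2 / n_reviewed
    centre = (p + z ** 2 / (2 * n_reviewed)) / denom
    half_width = (
        z
        * math.sqrt(p * (1 - p) / n_reviewed + z ** 2 / (4 * n_reviewed ** 2))
        / denom
    )
    return p, max(0.0, centre - half_width), min(1.0, centre + half_width)


# Counts below are inferred from the reported percentages (assumption).
for label, confirmed in [("MS PPV", 95), ("MS NPV", 100),
                         ("mCRPC PPV", 97), ("mCRPC NPV", 99)]:
    value, low, high = predictive_value(confirmed, 100)
    print(f"{label}: {value:.0%} ({low:.1%} to {high:.1%})")
```

Because cases and non-cases were sampled according to the algorithm's own classification, PPV and NPV can be read directly from the two groups; sensitivity and specificity cannot be derived from this design without knowing the prevalence of algorithm-positive patients.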

Conclusion: Using rEHRs to conduct an intra-database validation appears to be a valuable way to estimate the performance of a case-identifying algorithm and assess its validity when no alternative is available.

Keywords: Case-identifying algorithm; Claims database; Multiple sclerosis; Negative predictive value; Positive predictive value; Prostate Cancer; Reconstituted electronic health record; Validation study.

Conflict of interest statement

M. G.-G. declares personal fees and non-financial support from Janssen, Sanofi, Astellas, Ipsen, Amgen and Pfizer.

M. S. and M. R. declare personal fees and non-financial support from Janssen, Sanofi, Astellas, Ipsen, Amgen, Ferring, and Astra-Zeneca.

E. M. declares personal fees and non-financial support from Biogen, Novartis, Roche, Merck, Sanofi-Genzyme.

B. B. declares personal fees and non-financial support from Biogen, Genzyme, Bayer, Medday, Actelion, Roche, Celgene, Novartis, Merck.

F. G. and M. D. declare personal fees and non-financial support from Biogen.

C. L. declares consulting or travel fees from Biogen, Novartis, Roche, Sanofi, Teva and Merck Serono, and a research grant from Biogen.

O. H. declares personal fees and non-financial support from Biogen, Merck, Novartis, Roche, Genzyme.

All remaining authors have declared no conflicts of interest.

Figures

Fig. 1
Generation of an anonymized reconstituted Electronic Health Record (rEHR) from data of the French Nationwide Healthcare database (SNDS)
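The idea of reconstituting a record from claims data can be illustrated with a small sketch. It is not the authors' pipeline: the `ClaimEvent` structure, the `build_rehr` function, the use of an opaque review identifier, and the replacement of calendar dates by day offsets from an index date are all assumptions made for this example. The abstract states only that the rEHRs were anonymized and built from information such as procedures, hospital stays, drug dispensing and medical visits.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ClaimEvent:
    """One reimbursed event taken from a claims database (hypothetical shape)."""
    patient_id: str
    event_date: date
    category: str      # e.g. "hospital stay", "drug dispensing"
    description: str


def build_rehr(events, patient_id, index_date, review_id):
    """Return a chronological text record for one patient.

    The patient id is replaced by review_id, and each date is expressed as a
    day offset from index_date, so the output carries no direct identifier
    or calendar date (an assumed anonymization choice).
    """
    own_events = sorted(
        (e for e in events if e.patient_id == patient_id),
        key=lambda e: (e.event_date, e.category),
    )
    lines = [f"Record {review_id}"]
    for e in own_events:
        offset = (e.event_date - index_date).days
        lines.append(f"Day {offset:+d} | {e.category} | {e.description}")
    return "\n".join(lines)
```

Presenting the events as a single timeline is what lets a reviewer read the claims history in the same way as a clinical chart.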
