J Am Med Inform Assoc. 2012 Jun;19(e1):e54-9.
doi: 10.1136/amiajnl-2011-000335. Epub 2011 Sep 16.

Evaluation of record linkage between a large healthcare provider and the Utah Population Database

Scott L DuVall et al. J Am Med Inform Assoc. 2012 Jun.

Abstract

Objective: Electronically linked datasets have become an important part of clinical research. Information from multiple sources can be used to identify comorbid conditions and patient outcomes, measure use of healthcare services, and enrich demographic and clinical variables of interest. Innovative approaches for creating research infrastructure beyond a traditional data system are necessary.

Materials and methods: Records from a large healthcare system's enterprise data warehouse (EDW) were linked to a statewide population database, and a master subject index was created. The authors evaluate the linkage, along with the impact of missing information in EDW records and the coverage of the population database. The makeup of the EDW and population database provides a subset of cancer records that exist in both resources, which allows a cancer-specific evaluation of the linkage.

Results: About 3.4 million records (60.8%) in the EDW were linked to the population database with a minimum accuracy of 96.3%. It was estimated that approximately 24.8% of target records were absent from the population database, which enabled the effect of the amount and type of information missing from a record on the linkage to be estimated. However, 99% of the records from the oncology data mart linked; they had fewer missing fields and this correlated positively with the number of patient visits.
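The percentages above can be combined with a little arithmetic. The following is a minimal sketch, not part of the authors' analysis. It assumes that the 24.8% of records estimated to be absent from the population database could never have linked, and it treats "about 3.4 million" as an exact figure, so the derived numbers are only approximate. All variable names are illustrative.

```python
# Figures reported in the abstract (the record count is rounded in the source).
linked_records = 3.4e6   # "about 3.4 million" EDW records linked
linked_share = 0.608     # 60.8% of EDW records linked
absent_share = 0.248     # estimated share absent from the population database

# Implied total number of EDW records, given the rounded inputs.
implied_total = linked_records / linked_share

# Assumption: absent records were never linkable, so only the remainder could link.
linkable_share = 1.0 - absent_share
link_rate_among_linkable = linked_share / linkable_share

# Share of all EDW records that were presumably present but still did not link.
unlinked_but_present = linkable_share - linked_share

print(f"implied EDW records:       {implied_total / 1e6:.2f} million")
print(f"link rate among linkable:  {link_rate_among_linkable:.1%}")
print(f"present but unlinked:      {unlinked_but_present:.1%}")
```

Under these assumptions the EDW would hold roughly 5.6 million records, and about 81% of the records that could have linked did so, leaving around 14% of all records present in the population database but unlinked.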

Discussion and conclusion: A general-purpose research infrastructure was created which allows disease-specific cohorts to be identified. The usefulness of creating an index between institutions is that it allows each institution to maintain control and confidentiality of their own information.
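The idea of an index that lets each institution keep control of its own data can be illustrated with a small sketch. The abstract does not describe how the master subject index is structured, so the class, method names, and identifiers below are hypothetical. The point is only that the index holds identifier pairs, while the records themselves stay in separate stores.

```python
class MasterSubjectIndex:
    """Hypothetical crosswalk that stores identifiers only, never record content."""

    def __init__(self):
        self._edw_to_master = {}   # EDW identifier -> master subject id
        self._master_to_pop = {}   # master subject id -> population database identifier
        self._next_master_id = 1

    def add_link(self, edw_id, pop_id):
        """Register a linked pair and return the new master subject id."""
        master_id = self._next_master_id
        self._next_master_id += 1
        self._edw_to_master[edw_id] = master_id
        self._master_to_pop[master_id] = pop_id
        return master_id

    def pop_id_for(self, edw_id):
        """Return the population database id linked to an EDW id, or None if unlinked."""
        master_id = self._edw_to_master.get(edw_id)
        if master_id is None:
            return None
        return self._master_to_pop.get(master_id)


# Each institution keeps its own records; the values here are made up for illustration.
edw_records = {"EDW-001": {"visits": 12}, "EDW-002": {"visits": 1}}
pop_records = {"POP-9A": {"pedigree_quality": "high"}}

index = MasterSubjectIndex()
index.add_link("EDW-001", "POP-9A")

linked_id = index.pop_id_for("EDW-001")     # linked record
unlinked_id = index.pop_id_for("EDW-002")   # no link registered
print(linked_id, unlinked_id)
```

In this arrangement a researcher who needs combined data would ask each institution for records by identifier, so neither side has to hand over its full dataset to build or use the index.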

Conflict of interest statement

Competing interests: None.

Figures

Figure 1. Proportion of all linked records and cancer type by pedigree quality.
Figure 2. Proportion of unlinked records with predicted probability of linkage greater than 90% by birth year.

