Children (Basel). 2022 Dec 7;9(12):1916.
doi: 10.3390/children9121916.

Improving Cohort-Hospital Matching Accuracy through Standardization and Validation of Participant Identifiable Information

Yanhong Jessika Hu et al. Children (Basel).

Abstract

Linking very large, consented birth cohorts to birthing hospitals' clinical data could elucidate the lifecourse outcomes of health care and exposures during the pregnancy, birth and newborn periods. Unfortunately, cohort personally identifiable information (PII) often does not include unique identifier numbers, which makes matching difficult. To develop optimized matching of cohort participants to birthing hospital clinical records, this pilot drew on a one-year (December 2020 to December 2021) cohort from a single Australian birthing hospital participating in the whole-of-state Generation Victoria (GenV) study. For 1819 consented mother-baby pairs and 58 additional babies (whose mothers were not themselves participating), we tested the accuracy and effort of various approaches to matching. We selected demographic variables drawn from names, date of birth (DOB), sex, telephone and address (and birth order for multiple births). After variable standardization and validation, accuracy rose from 10% to 99% using a deterministic rule-based approach in 10 steps. Using cohort-specific modifications of the Australian Statistical Linkage Key (SLK-581), it took only 3 steps to reach 97% (SLK-5881) and 98% (SLK-5881.1) accuracy. We conclude that our SLK-5881 process could safely and efficiently achieve high accuracy at the population level for future birth cohort-birth hospital matching in the absence of unique identifier numbers.

Keywords: birth cohort; data accuracy; data linkage; demographics; hospital; hospital records; information retrieval; newborn; personally identifiable information; pregnant women.
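
For readers unfamiliar with the SLK-581 mentioned in the abstract, the following is a minimal sketch of how the standard Australian key is commonly constructed: the 2nd, 3rd and 5th letters of the family name, the 2nd and 3rd letters of the given name, the date of birth as DDMMYYYY, and a one-digit sex code. It is not the authors' SLK-5881 or SLK-5881.1, whose cohort-specific modifications are not described in this abstract. The function names (`slk581`, `_pick_letters`) and the sex-code mapping (1 = male, 2 = female, 9 = anything else) are assumptions made for illustration.

```python
import re
from datetime import date

# Assumed sex coding for illustration: 1 = male, 2 = female, 9 = not stated/other.
SEX_CODES = {"male": "1", "female": "2"}


def _pick_letters(name, positions):
    """Return the letters at the given 1-based positions of a cleaned name.

    Non-alphabetic characters (spaces, hyphens, apostrophes) are dropped first.
    A position beyond the end of the name is filled with '2';
    an entirely missing name is filled with '9's.
    Note: this simple sketch also drops accented letters.
    """
    letters = re.sub(r"[^A-Za-z]", "", name or "").upper()
    if not letters:
        return "9" * len(positions)
    return "".join(letters[p - 1] if p <= len(letters) else "2" for p in positions)


def slk581(family_name, given_name, dob, sex):
    """Build a 14-character SLK-581-style key (hypothetical helper)."""
    return (
        _pick_letters(family_name, (2, 3, 5))   # 3 letters from family name
        + _pick_letters(given_name, (2, 3))     # 2 letters from given name
        + dob.strftime("%d%m%Y")                # 8-digit date of birth
        + SEX_CODES.get(sex.strip().lower(), "9")  # 1-digit sex code
    )


# Mock-up record, not real participant data.
print(slk581("O'Brien", "Jo", date(2021, 3, 5), "Female"))  # BREO2050320212
```

Because the key keeps only a few letters of each name, minor spelling variants often map to the same value, which is one reason such keys can match records without a unique identifier number.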

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Linkage matching process flow. Note: All matching and data retrieval were done by an authorized GenV data scientist; initial data matching was done by an authorized hospital data analyst. FN = first name; LN = last name; DOB = date of birth; BO = birth order; TN = telephone; ADD = address.
Figure 2
Data standardisation and validation examples for telephone and address. Note: Phone number and address are mock-up examples.
Figure 3
Three linkage scenarios of hospital initial matching. Note: PII = personally identifiable information.
Figure 4
Accuracy rates and required steps for the three approaches, for mother-baby pairs and for babies without mothers' PII. Note: Details of the variables included at each step are listed in Supplemental Table S6. PII = personally identifiable information; SLK = Statistical Linkage Key.
