Probabilistic record linkage: relationships between file sizes, identifiers and match weights
- PMID: 11501632
Abstract
This study investigates relationships between file sizes, amounts of information contained in commonly used record linkage variables, and the amount of information needed for a successful probabilistic linkage project. We present an equation predicting the amount of information needed for a successful linkage project. Match weights for variables commonly used in record linkage are measured using artificially created databases. Linkage algorithms were successful when the sum of minimum weights for variables used in a linkage exceeded the predicted cutoff. Linkage results were acceptable when this sum was near the predicted cutoff. This technique enables researchers to determine if enough information exists to perform a successful probabilistic linkage.
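The abstract does not reproduce the equation itself. As a rough sketch only, the relationship between file sizes and the required total weight can be illustrated with the standard Fellegi-Sunter log-odds formulation, which is an assumption here and may differ from the authors' exact equation. The function names, the m and u probabilities, and the target probability are all hypothetical and chosen for illustration.

```python
import math


def agreement_weight(m: float, u: float) -> float:
    """Agreement weight in bits for one variable (assumed Fellegi-Sunter form).

    m: probability that the variable agrees on a true match
    u: probability that the variable agrees on a non-match (chance agreement)
    """
    return math.log2(m / u)


def required_weight(size_a: int, size_b: int,
                    expected_matches: int, target_prob: float) -> float:
    """Total weight needed to reach target_prob that a linked pair is a true match.

    Assumption: the prior odds that a randomly chosen pair is a match are
    expected_matches / (size_a * size_b - expected_matches).
    """
    prior_odds = expected_matches / (size_a * size_b - expected_matches)
    target_odds = target_prob / (1.0 - target_prob)
    return math.log2(target_odds) - math.log2(prior_odds)


def enough_information(weights: list[float], cutoff: float) -> bool:
    """True when the summed weights of the linkage variables reach the cutoff."""
    return sum(weights) >= cutoff


if __name__ == "__main__":
    # Hypothetical files and variables, for illustration only.
    cutoff = required_weight(1000, 1000, 500, 0.9)
    weights = [agreement_weight(0.95, 0.01),   # e.g. a date-of-birth-like field
               agreement_weight(0.90, 0.05),   # e.g. a surname-like field
               agreement_weight(0.98, 0.50)]   # e.g. a sex-like field
    print(f"cutoff = {cutoff:.2f} bits, available = {sum(weights):.2f} bits")
    print("enough information:", enough_information(weights, cutoff))
```

Under these assumptions the cutoff grows with the product of the two file sizes, so larger files need more discriminating identifiers to reach the same confidence in a link.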