Validation of a common data model for active safety surveillance research

J Marc Overhage¹, Patrick B Ryan, Christian G Reich, Abraham G Hartzema, Paul E Stang

PMID: 22037893
PMCID: PMC3240764
DOI: 10.1136/amiajnl-2011-000376

J Am Med Inform Assoc. 2012 Jan-Feb;19(1):54-60. Epub 2011 Oct 28.

Affiliation

¹ Regenstrief Institute, Indiana University, School of Medicine, Indianapolis, Indiana, USA. moverhage@regenstrief.org

Abstract

Objective: Systematic analysis of observational medical databases for active safety surveillance is hindered by the variation in data models and coding systems. Data analysts often find robust clinical data models difficult to understand and ill suited to support their analytic approaches. Further, some models do not facilitate the computations required for systematic analysis across many interventions and outcomes for large datasets. Translating the data from these idiosyncratic data models to a common data model (CDM) could facilitate both the analysts' understanding and the suitability for large-scale systematic analysis. In addition to facilitating analysis, a suitable CDM has to faithfully represent the source observational database. Before beginning to use the Observational Medical Outcomes Partnership (OMOP) CDM and a related dictionary of standardized terminologies for a study of large-scale systematic active safety surveillance, the authors validated the model's suitability for this use by example.

Validation by example: To validate the OMOP CDM, the model was instantiated into a relational database, data from 10 different observational healthcare databases were loaded into separate instances, a comprehensive array of analytic methods that operate on the data model was created, and these methods were executed against the databases to measure performance.

Conclusion: There was acceptable representation of the data from 10 observational databases in the OMOP CDM using the standardized terminologies selected, and a range of analytic methods was developed and executed with sufficient performance to be useful for active safety surveillance.
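
The notion of "acceptable representation" under a standardized terminology can be made concrete as mapping coverage: the share of distinct source terms, and the share of database records, that map to a standard concept. The following is a minimal sketch of that calculation, not the authors' implementation. The function name `mapping_coverage`, the source codes, and the source-to-standard map are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def mapping_coverage(records, term_map):
    """Return (term coverage, record coverage) as fractions in [0, 1].

    records  : iterable of source codes, one entry per database record
    term_map : dict from source code to standard concept (hypothetical data)
    """
    counts = Counter(records)
    if not counts:
        return 0.0, 0.0
    mapped_terms = [term for term in counts if term in term_map]
    # Share of distinct source terms that have a standard concept
    term_cov = len(mapped_terms) / len(counts)
    # Share of records whose source term has a standard concept
    record_cov = sum(counts[t] for t in mapped_terms) / sum(counts.values())
    return term_cov, record_cov


# Made-up example: three distinct source codes, six records
records = ["SRC_A", "SRC_A", "SRC_A", "SRC_B", "SRC_C", "SRC_A"]
term_map = {"SRC_A": "STD_1", "SRC_B": "STD_2"}  # SRC_C is unmapped

term_cov, record_cov = mapping_coverage(records, term_map)
print(f"terms mapped: {term_cov:.1%}, records mapped: {record_cov:.1%}")
```

Reporting both numbers matters because a few frequently used terms can account for most records, so record coverage may be high even when many rare terms remain unmapped.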

Conflict of interest statement

Competing interests: None.

Figures

**Figure 1**
Database characteristics. Rates of conditions, medications, procedures, and observations per person varied widely across databases. CCAE, Commercial Claims and Encounters; MDCD, MarketScan Medicaid Multi-State Database; MDCR, Medicare Supplemental and Coordination of Benefits Database; MSLR, MarketScan Lab Database.

**Figure 2**
Proportion of terms and database records for drugs (A) and medications (B) that could be mapped using different standard terminologies for the five commercial databases, demonstrating the suitability of the standardized terminologies chosen for the OMOP CDM. CCAE, Commercial Claims and Encounters; GPI, generic product identifier; ICD9, International Classification of Diseases, Ninth Revision; MDCD, MarketScan Medicaid Multi-State Database; MDCR, Medicare Supplemental and Coordination of Benefits Database; MSLR, MarketScan Lab Database; NDC, National Drug Code.

**Figure 3**
Graph showing the frequency, as a percentage of records, of concepts that appear in a database at a rate more than three deviations from the mean frequency computed across all databases. Only two concepts (the RxNorm code for amlodipine 10 mg/benazepril 20 mg oral capsule and the Systematized Nomenclature of Medicine code for large liver) appeared in more than 0.10% of the records in a database. CCAE, Commercial Claims and Encounters; GPI, generic product identifier; ICD9, International Classification of Diseases, Ninth Revision; MDCD, MarketScan Medicaid Multi-State Database; MDCR, Medicare Supplemental and Coordination of Benefits Database; MSLR, MarketScan Lab Database; NDC, National Drug Code.
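
The outlier check described for Figure 3 can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: the abstract does not say how the deviation was computed, so the sketch assumes "deviations" means standard deviations and compares each database with the mean and sample standard deviation of the other databases (a leave-one-out comparison). The function name `flag_outliers`, the concept and database labels, and the frequencies are all hypothetical.

```python
from statistics import mean, stdev


def flag_outliers(freqs, threshold=3.0):
    """Flag (concept, database) pairs with an unusual concept frequency.

    freqs maps concept -> {database: percentage of records}.
    A pair is flagged when the database's frequency lies more than
    `threshold` sample standard deviations from the mean of the other
    databases (leave-one-out; an assumption of this sketch).
    """
    flagged = []
    for concept, by_db in freqs.items():
        for db, value in by_db.items():
            others = [v for d, v in by_db.items() if d != db]
            if len(others) < 2:
                continue  # stdev needs at least two reference values
            spread = stdev(others)
            if spread == 0:
                continue  # no variation among the reference databases
            if abs(value - mean(others)) > threshold * spread:
                flagged.append((concept, db))
    return sorted(flagged)


# Made-up frequencies (percentage of records) for two concepts
freqs = {
    "concept_x": {"db1": 0.010, "db2": 0.012, "db3": 0.011, "db4": 0.150},
    "concept_y": {"db1": 0.020, "db2": 0.021, "db3": 0.019, "db4": 0.020},
}

flagged = flag_outliers(freqs)
print(flagged)
```

The leave-one-out form is used because, with only a handful of databases, a value included in its own mean and standard deviation can never sit far from that mean in standard-deviation units, so a plain three-sigma rule would rarely or never fire.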
