Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2012 Sep;107(3):425-35.
doi: 10.1016/j.cmpb.2010.12.016. Epub 2011 Jan 21.

Efficient data management in a large-scale epidemiology research project

Affiliations

Efficient data management in a large-scale epidemiology research project

Jens Meyer et al. Comput Methods Programs Biomed. 2012 Sep.

Abstract

This article describes the concept of a "Central Data Management" (CDM) and its implementation within the large-scale population-based medical research project "Personalized Medicine". The CDM can be summarized as a conjunction of data capturing, data integration, data storage, data refinement, and data transfer. A wide spectrum of reliable "Extract Transform Load" (ETL) software for automatic integration of data as well as "electronic Case Report Forms" (eCRFs) was developed, in order to integrate decentralized and heterogeneously captured data. Due to the high sensitivity of the captured data, high system resource availability, data privacy, data security and quality assurance are of utmost importance. A complex data model was developed and implemented using an Oracle database in high availability cluster mode in order to integrate different types of participant-related data. Intelligent data capturing and storage mechanisms are improving the quality of data. Data privacy is ensured by a multi-layered role/right system for access control and de-identification of identifying data. A well defined backup process prevents data loss. Over the period of one and a half year, the CDM has captured a wide variety of data in the magnitude of approximately 5terabytes without experiencing any critical incidents of system breakdown or loss of data. The aim of this article is to demonstrate one possible way of establishing a Central Data Management in large-scale medical and epidemiological studies.

PubMed Disclaimer

Publication types

MeSH terms

LinkOut - more resources