Bioinformatics. 2022 Dec 13;38(24):5466-5468.
doi: 10.1093/bioinformatics/btac651.

Common data model for COVID-19 datasets

Philipp Wegner et al. Bioinformatics.

Abstract

Motivation: A global medical crisis like the coronavirus disease 2019 (COVID-19) pandemic requires interdisciplinary and highly collaborative research from all over the world. One of the key challenges for collaborative research is a lack of interoperability among heterogeneous data sources. Interoperability, standardization and mapping of datasets are necessary for data analysis and for applying advanced algorithms, for example to develop personalized risk prediction models.

Results: To ensure interoperability and compatibility among COVID-19 datasets, we present here a common data model (CDM) which has been built from 11 different COVID-19 datasets from various geographical locations. The current version of the CDM holds 4639 data variables related to COVID-19, such as basic patient information (age, biological sex and diagnosis) as well as disease-specific data variables, for example, anosmia and dyspnea. Each data variable in the model is associated with a specific data type, variable mappings, a value range, data units and data encodings that can be used to standardize any dataset. Moreover, the CDM is compatible with established data standards such as OMOP and FHIR, which supports its use for COVID-19 data interoperability.
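To illustrate how per-variable metadata of this kind (data type, mappings, value range, unit, encoding) could be used to standardize a source dataset, the following is a minimal Python sketch. It does not reflect the actual schema of the CDM: the class `CdmVariable`, its field names, the dataset name `dataset_a`, the source column names and the example ranges and encodings are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional, Tuple


@dataclass
class CdmVariable:
    """Hypothetical metadata record for one variable (not the real CDM schema)."""
    name: str
    data_type: Callable[[Any], Any]                     # e.g. int, float, str
    unit: Optional[str] = None
    value_range: Optional[Tuple[float, float]] = None   # inclusive (min, max)
    encoding: Dict[Any, Any] = field(default_factory=dict)   # source value -> standard value
    mappings: Dict[str, str] = field(default_factory=dict)   # dataset name -> source column


def standardize(variable: CdmVariable, dataset: str, record: Dict[str, Any]) -> Any:
    """Look up, decode, cast and range-check one variable from a source record."""
    column = variable.mappings[dataset]         # variable mapping
    raw = record[column]
    value = variable.encoding.get(raw, raw)     # data encoding (identity if unmapped)
    value = variable.data_type(value)           # data type
    if variable.value_range is not None:        # value range
        low, high = variable.value_range
        if not low <= value <= high:
            raise ValueError(
                f"{variable.name}={value!r} outside [{low}, {high}]"
            )
    return value


# Illustrative definitions; ranges, codes and column names are assumptions.
age = CdmVariable(
    name="age",
    data_type=int,
    unit="years",
    value_range=(0, 120),
    mappings={"dataset_a": "AGE_YRS"},
)
sex = CdmVariable(
    name="biological_sex",
    data_type=str,
    encoding={"m": "male", "f": "female"},
    mappings={"dataset_a": "SEX"},
)

record = {"AGE_YRS": "54", "SEX": "f"}
standardized = {
    age.name: standardize(age, "dataset_a", record),
    sex.name: standardize(sex, "dataset_a", record),
}
print(standardized)
```

Keeping the mapping, encoding and range as data rather than code means that adding a further source dataset would, under these assumptions, only require new entries in `mappings` and `encoding`, not a new function.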

Availability and implementation: The CDM is available in a public repository: https://github.com/Fraunhofer-SCAI-Applied-Semantics/COVID-19-Global-Model.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1. Screenshot from the CDM showing various metadata associated with the variable ‘Bilirubin’
