AMIA Annu Symp Proc. 2017 Feb 10;2016:1070-1079. eCollection 2016.

Scientific Reproducibility in Biomedical Research: Provenance Metadata Ontology for Semantic Annotation of Study Description

Satya S Sahoo et al. AMIA Annu Symp Proc.

Abstract

Scientific reproducibility is key to scientific progress: it allows the research community to build on validated results, protects patients from potentially harmful trial drugs derived from incorrect results, and reduces waste of valuable resources. The National Institutes of Health (NIH) recently published a systematic guideline titled "Rigor and Reproducibility" for supporting reproducible research studies, which has also been accepted by several scientific journals. These journals will require published articles to conform to the new guidelines. Provenance metadata describes the history or origin of data, and it has long been used in computer science to capture metadata for ensuring data quality and supporting scientific reproducibility. In this paper, we describe the development of the Provenance for Clinical and healthcare Research (ProvCaRe) framework, together with a provenance ontology, to support scientific reproducibility by formally modeling a core set of data elements representing the details of a research study. We extend the PROV Ontology (PROV-O), which has been recommended as the provenance representation model by the World Wide Web Consortium (W3C), to represent both (a) data provenance and (b) process provenance. We use 124 study variables from 6 clinical research studies from the National Sleep Research Resource (NSRR) to evaluate the coverage of the provenance ontology. NSRR is the largest repository of NIH-funded sleep datasets, with 50,000 studies from 36,000 participants. The provenance ontology reuses concepts from existing biomedical ontologies, for example the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), to model the provenance information of research studies. The ProvCaRe framework is being developed as part of the Big Data to Knowledge (BD2K) data provenance project.
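The distinction between data provenance and process provenance can be illustrated with a small sketch built on core PROV-O terms. This is not the ProvCaRe ontology itself: the `EX` namespace and the resource names (`sleep_recording`, `apnea_index`, `scoring`, `technician`) are hypothetical, chosen only for illustration, and the graph is held as plain Python tuples rather than in an RDF library. Only the `prov:` class and property names come from the W3C PROV-O vocabulary.

```python
from typing import List, NamedTuple

PROV = "http://www.w3.org/ns/prov#"  # W3C PROV-O namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Hypothetical namespace for this sketch; not the real ProvCaRe IRI.
EX = "http://example.org/provcare-sketch#"


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def build_graph() -> List[Triple]:
    """Return a tiny provenance graph for an invented sleep-study variable."""
    return [
        # Data provenance: what the data is and what it was derived from.
        Triple(EX + "sleep_recording", RDF_TYPE, PROV + "Entity"),
        Triple(EX + "apnea_index", RDF_TYPE, PROV + "Entity"),
        Triple(EX + "apnea_index", PROV + "wasDerivedFrom", EX + "sleep_recording"),
        # Process provenance: which activity produced the data, and who ran it.
        Triple(EX + "scoring", RDF_TYPE, PROV + "Activity"),
        Triple(EX + "scoring", PROV + "used", EX + "sleep_recording"),
        Triple(EX + "apnea_index", PROV + "wasGeneratedBy", EX + "scoring"),
        Triple(EX + "technician", RDF_TYPE, PROV + "Agent"),
        Triple(EX + "scoring", PROV + "wasAssociatedWith", EX + "technician"),
    ]


def lineage(graph: List[Triple], entity: str) -> List[str]:
    """Follow prov:wasDerivedFrom links back to every source of `entity`."""
    derived = {}
    for triple in graph:
        if triple.predicate == PROV + "wasDerivedFrom":
            derived.setdefault(triple.subject, []).append(triple.obj)
    sources, stack = [], [entity]
    while stack:
        current = stack.pop()
        for source in derived.get(current, []):
            if source not in sources:
                sources.append(source)
                stack.append(source)
    return sources


if __name__ == "__main__":
    graph = build_graph()
    for source in lineage(graph, EX + "apnea_index"):
        print(source)
```

Calling `lineage` on the derived variable walks the data-provenance links only; the `prov:used`, `prov:wasGeneratedBy`, and `prov:wasAssociatedWith` triples record the process side and could be queried in the same way.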

Figures

Figure 1: Core terms of the PROV Ontology and example provenance graph
Figure 2: The three-phase workflow used to develop the ProvCaRe ontology
Figure 3: A section of the ProvCaRe ontology class hierarchy
