R Soc Open Sci. 2018 Aug 15;5(8):180448.
doi: 10.1098/rsos.180448. eCollection 2018 Aug.

Data availability, reusability, and analytic reproducibility: evaluating the impact of a mandatory open data policy at the journal Cognition


Tom E Hardwicke et al. R Soc Open Sci. 2018.

Abstract

Access to data is a critical feature of an efficient, progressive and ultimately self-correcting scientific ecosystem. But the extent to which in-principle benefits of data sharing are realized in practice is unclear. Crucially, it is largely unknown whether published findings can be reproduced by repeating reported analyses upon shared data ('analytic reproducibility'). To investigate this, we conducted an observational evaluation of a mandatory open data policy introduced at the journal Cognition. Interrupted time-series analyses indicated a substantial post-policy increase in data available statements (104/417, 25% pre-policy to 136/174, 78% post-policy), although not all data appeared reusable (23/104, 22% pre-policy to 85/136, 62% post-policy). For 35 of the articles determined to have reusable data, we attempted to reproduce 1324 target values. Ultimately, 64 values could not be reproduced within a 10% margin of error. For 22 articles all target values were reproduced, but 11 of these required author assistance. For 13 articles at least one value could not be reproduced despite author assistance. Importantly, there were no clear indications that original conclusions were seriously impacted. Mandatory open data policies can increase the frequency and quality of data sharing. However, suboptimal data curation, unclear analysis specification and reporting errors can impede analytic reproducibility, undermining the utility of data sharing and the credibility of scientific findings.
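The percentages reported in the abstract follow directly from the stated counts; a minimal Python check (counts taken verbatim from the abstract, labels are this sketch's own shorthand):

```python
# Counts (numerator, denominator) as reported in the abstract.
counts = {
    "data available statements, pre-policy": (104, 417),
    "data available statements, post-policy": (136, 174),
    "reusable data, pre-policy": (23, 104),
    "reusable data, post-policy": (85, 136),
}

for label, (k, n) in counts.items():
    # Round to the nearest whole percent, as in the abstract.
    print(f"{label}: {k}/{n} = {round(100 * k / n)}%")
```

Note that 85/136 is exactly 62.5%, which Python's round-half-to-even rule reports as 62%, matching the abstract.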

Keywords: interrupted time series; journal policy; meta-science; open data; open science; reproducibility.


Conflict of interest statement

M.C.F. was an Associate Editor at the journal Cognition during the study. The other authors have no competing interests.

Figures

Figure 1.
Proportion of articles with data available statements as a function of submission date across the assessment period. For ease of presentation, circles indicate proportions in 50-day bins with the circle area representing the total number of articles in each bin (but note that the analysis model was fitted to individual articles). Solid red lines represent predictions of an interrupted time-series analysis segmented by pre-policy and post-policy periods. The dashed red line estimates, based on the pre-policy period, the trajectory of data available statement inclusion if the policy had no effect. The model is linear on the logit scale, whereas the y-axis of the figure is on the probability scale, which is a nonlinear transformation of the logit. Confidence bands (red) indicate 95% CIs. Note that the small article numbers in the extremes of the graph are due to long submission-to-publication lag times. Our sample selection was based on the publication date, but it is the submission date which determines whether an article falls within the pre-policy or post-policy period.
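The model the caption describes is, in essence, a logistic regression whose linear predictor is segmented at the policy introduction: a pre-policy intercept and slope, plus a level change and slope change that switch on post-policy. A minimal pure-Python sketch of that structure (the coefficients and policy date below are illustrative assumptions, not the paper's fitted estimates):

```python
import math

def sigmoid(x):
    """Map a value on the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Illustrative coefficients on the logit scale (NOT the paper's estimates):
# b0: baseline logit at t = 0; b1: pre-policy slope per day;
# b2: level change at the policy; b3: post-policy slope change.
b0, b1, b2, b3 = -1.2, 0.002, 1.5, 0.004
t_policy = 400  # hypothetical policy date, in days from the series start

def p_data_statement(t):
    """Model-implied probability that an article submitted on day t
    includes a data available statement (solid lines in the figure)."""
    post = 1 if t >= t_policy else 0
    logit = b0 + b1 * t + b2 * post + b3 * (t - t_policy) * post
    return sigmoid(logit)

def p_counterfactual(t):
    """Pre-policy trend extrapolated as if the policy had no effect
    (the dashed line in the figure)."""
    return sigmoid(b0 + b1 * t)
```

The difference `p_data_statement(t) - p_counterfactual(t)` for post-policy `t` is the estimated policy effect; because the model is linear on the logit scale, that difference is nonlinear on the probability scale shown on the y-axis, as the caption notes.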
Figure 2.
Counts and percentages for articles in the pre- and post-policy periods with available statements, accessible data, complete data and understandable data. Only accessible, complete and understandable data are considered ‘reusable in principle’. Arrow size represents the proportion of total articles.
Figure 3.
Reproducibility outcomes for all 1324 checked values, as a function of article and value type (n = count/proportion; ci = confidence interval; misc = miscellaneous; M = mean/median; df = degrees of freedom; es = effect size; test = test statistic; p = p-value; sd/se = standard deviation/standard error). Bold red X marks indicate non-reproducible values (major errors) and grey circles indicate reproducible values. Symbol size represents the number of values. Both axes are ordered by an increasing number of errors towards the graph origin. The article colours represent the overall outcome: not fully reproducible despite author assistance (red), reproducible with author assistance (orange) and reproducible without author assistance (green). For articles marked with asterisks (*), the analysis could not be completed and there was insufficient information to determine whether original conclusions were affected. In all other cases, it is unlikely that original conclusions were affected.
Figure 4.
Locus of non-reproducibility based on discrete issues identified in each article. Circles indicate reproducibility issues resolved through author assistance, and X marks indicate unresolved reproducibility issues. Symbol size represents the number of discrete reproducibility issues. Left panel represents articles that were not fully reproducible despite author assistance (some issues may have been resolved but others remain). Right panel represents articles that were reproducible with author assistance (all issues were resolved). Both axes are ordered by an increasing number of discrete reproducibility issues towards the origin.
