Review
Brief Bioinform. 2013 Jul;14(4):391-401.
doi: 10.1093/bib/bbs078. Epub 2012 Nov 27.

Comparability and reproducibility of biomedical data

Yunda Huang et al. Brief Bioinform. 2013 Jul.

Abstract

With the development of novel assay technologies, biomedical experiments and analyses have gone through substantial evolution. Today, a typical experiment can simultaneously measure hundreds to thousands of individual features (e.g. genes) in dozens of biological conditions, resulting in gigabytes of data that need to be processed and analyzed. Because of the multiple steps involved in the data generation and analysis and the lack of details provided, it can be difficult for independent researchers to try to reproduce a published study. With the recent outrage following the halt of a cancer clinical trial due to the lack of reproducibility of the published study, researchers are now facing heavy pressure to ensure that their results are reproducible. Despite the global demand, too many published studies remain non-reproducible mainly due to the lack of availability of experimental protocol, data and/or computer code. Scientific discovery is an iterative process, where a published study generates new knowledge and data, resulting in new follow-up studies or clinical trials based on these results. As such, it is important for the results of a study to be quickly confirmed or discarded to avoid wasting time and money on novel projects. The availability of high-quality, reproducible data will also lead to more powerful analyses (or meta-analyses) where multiple data sets are combined to generate new knowledge. In this article, we review some of the recent developments regarding biomedical reproducibility and comparability and discuss some of the areas where the overall field could be improved.

Keywords: Analysis pipeline; accuracy; open science; precision; protocol; standardization.

Figures

Figure 1
Life cycle of scientific discoveries. The overall cycle is broken down into five steps. When all steps are completed according to the reproducibility guidelines (Table 1), the results rapidly lead to confirmed (or discarded) discoveries. The confirmed discoveries are then translated into new knowledge and data supporting novel studies.
Figure 2
Precision-accuracy trade-off. Four different protocols are compared. Protocol B exhibits large variance (wide box) with small bias (close to the true value on average), while protocol C has small variance but large bias. Overall, protocol D exhibits a good variance-bias trade-off and should be preferred.
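The trade-off described in the Figure 2 caption can be made concrete with a small numerical sketch. The example below assumes that mean squared error (MSE = bias² + variance) is used as the single summary criterion; the caption does not name a criterion, so this is one possible reading rather than the authors' method. The measurement values, the true value of 10.0 and the function name `bias_variance_mse` are all hypothetical, chosen only to mimic the qualitative behaviour of protocols B, C and D (protocol A is not described in the caption and is left out).

```python
import statistics


def bias_variance_mse(measurements, true_value):
    """Return (bias, variance, mse) for repeated measurements of a known quantity.

    bias     = mean(measurements) - true_value
    variance = population variance of the measurements
    mse      = bias**2 + variance (equals the mean squared deviation from true_value)
    """
    mean = statistics.fmean(measurements)
    bias = mean - true_value
    variance = statistics.pvariance(measurements, mu=mean)
    return bias, variance, bias ** 2 + variance


# Hypothetical replicate measurements, NOT data from the paper.
TRUE_VALUE = 10.0
protocols = {
    "B": [6.0, 14.0, 8.0, 12.0],     # unbiased on average, widely scattered
    "C": [13.0, 13.2, 12.8, 13.0],   # tightly clustered, far from the true value
    "D": [10.5, 11.5, 10.0, 12.0],   # moderate bias and moderate scatter
}

results = {name: bias_variance_mse(values, TRUE_VALUE)
           for name, values in protocols.items()}

# Under the MSE assumption, the preferred protocol is the one with the smallest MSE.
best = min(results, key=lambda name: results[name][2])

for name, (bias, variance, mse) in results.items():
    print(f"protocol {name}: bias={bias:.3f} variance={variance:.3f} mse={mse:.3f}")
print("lowest MSE:", best)
```

With these illustrative numbers, the protocol with moderate bias and moderate variance scores better than either extreme, which is the point the caption makes about protocol D. A different criterion (for example, one that penalises bias more heavily than variance) could rank the protocols differently.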
