Analysis validation has been neglected in the Age of Reproducibility
- PMID: 30532167
- PMCID: PMC6301703
- DOI: 10.1371/journal.pbio.3000070
Abstract
Increasingly complex statistical models are being used for the analysis of biological data. Recent commentary has focused on the ability to compute the same outcome for a given dataset (reproducibility). We argue that a reproducible statistical analysis is not necessarily valid because of unique patterns of nonindependence in every biological dataset. We advocate that analyses should be evaluated with known-truth simulations that capture biological reality, a process we call "analysis validation." We review the process of validation and suggest criteria that a validation project should meet. We find that different fields of science have historically failed to meet all criteria, and we suggest ways to implement meaningful validation in training and practice.
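As a rough illustration of what a known-truth simulation can look like, the sketch below generates data in which the true treatment effect is exactly zero but observations are nonindependent within clusters, then counts how often two analyses wrongly report an effect. Everything in it is an assumption made for illustration and none of it is taken from the article: the clustered design, the function names (`simulate_arm`, `two_sample_stat`, `rejection_rates`), the cluster counts and variances, and the two analyses being compared. A real validation would replace the toy generator with one that captures the biology of the dataset in question.

```python
import random
import statistics


def simulate_arm(rng, n_clusters, n_per_cluster, effect, cluster_sd, resid_sd):
    """Simulate one study arm as a list of clusters (hypothetical toy generator)."""
    clusters = []
    for _ in range(n_clusters):
        # The shared cluster effect is what makes observations nonindependent.
        shared = rng.gauss(0.0, cluster_sd)
        clusters.append(
            [effect + shared + rng.gauss(0.0, resid_sd) for _ in range(n_per_cluster)]
        )
    return clusters


def two_sample_stat(a, b):
    """Difference in means divided by its estimated standard error (Welch form)."""
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return (statistics.fmean(a) - statistics.fmean(b)) / se


def rejection_rates(n_sims=500, seed=1, effect=0.0):
    """Return (naive_rate, cluster_rate): fraction of simulations declared significant.

    With effect=0.0 the truth is known to be "no effect", so both rates are
    false-positive rates and a valid analysis should stay near 0.05.
    """
    rng = random.Random(seed)
    naive_hits = 0
    cluster_hits = 0
    for _ in range(n_sims):
        control = simulate_arm(rng, 10, 20, 0.0, 1.0, 1.0)
        treated = simulate_arm(rng, 10, 20, effect, 1.0, 1.0)

        # Naive analysis: pool all 200 observations per arm as if independent
        # and compare against the normal critical value 1.96.
        flat_control = [x for cluster in control for x in cluster]
        flat_treated = [x for cluster in treated for x in cluster]
        if abs(two_sample_stat(flat_treated, flat_control)) > 1.96:
            naive_hits += 1

        # Cluster-aware analysis: one mean per cluster, 10 per arm, compared
        # against the two-sided 5% critical value of t with 18 degrees of freedom.
        mean_control = [statistics.fmean(cluster) for cluster in control]
        mean_treated = [statistics.fmean(cluster) for cluster in treated]
        if abs(two_sample_stat(mean_treated, mean_control)) > 2.101:
            cluster_hits += 1

    return naive_hits / n_sims, cluster_hits / n_sims


if __name__ == "__main__":
    naive_rate, cluster_rate = rejection_rates()
    print(f"naive false-positive rate:         {naive_rate:.3f}")
    print(f"cluster-aware false-positive rate: {cluster_rate:.3f}")
```

Both analyses are fully reproducible, since a fixed seed always returns the same numbers, yet under these assumptions only the cluster-aware one should hold its false-positive rate near the nominal 5%. Calling `rejection_rates` with a nonzero `effect` turns the same harness into a power check.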
Conflict of interest statement
The authors have declared that no competing interests exist.