RNA Biol. 2019 Nov;16(11):1531-1533.
doi: 10.1080/15476286.2019.1652525. Epub 2019 Aug 12.

Big data to knowledge: common pitfalls in transcriptomics data analysis and representation

Maryam Abedi et al. RNA Biol. 2019 Nov.

Abstract

Omics technologies provide an invaluable opportunity to take a global view of human diseases. However, appropriately translating big data into knowledge remains a major challenge. In this study, we performed quality control assessments of 91 transcriptomics datasets deposited in the Gene Expression Omnibus database and evaluated the publications derived from these datasets. The survey shows that drawbacks in the analysis and reporting of transcriptomics studies are more common than one may assume. The report concludes with suggestions for researchers and reviewers to strengthen the minimal requirements for gene expression data generation, analysis and reporting.

Keywords: Big data; data analysis; differentially expressed gene; quality control; transcriptomics.

Figures

Figure 1.
The framework followed in this study. Quality control was assessed for all datasets using PCA (a). The publications derived from the datasets were inspected for the accuracy of data analysis and representation (b–g). QC: quality control; PCA: principal component analysis; DEGs: differentially expressed genes.
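
As a rough illustration of what a PCA-based quality control check can look like, the sketch below projects a samples-by-genes expression matrix onto its first two principal components and asks whether two sample groups occupy non-overlapping ranges on PC1. It is not the authors' pipeline: the function names `pca_scores` and `groups_separate_on_pc1`, the synthetic data, and the non-overlap criterion are all assumptions made for this example.

```python
import numpy as np


def pca_scores(expr, n_components=2):
    """Return PCA sample scores and explained-variance ratios.

    expr: samples x genes array of log-scale expression values
    (hypothetical input layout chosen for this sketch).
    """
    centered = expr - expr.mean(axis=0)  # center each gene across samples
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


def groups_separate_on_pc1(scores, labels):
    """Crude check (an assumption, not a standard test): do the two
    groups occupy non-overlapping ranges along the first component?"""
    labels = np.asarray(labels)
    a = scores[labels == 0, 0]
    b = scores[labels == 1, 0]
    return bool(a.max() < b.min() or b.max() < a.min())


# Synthetic stand-in for a two-group dataset: 5 control and 5 treated
# samples, 200 genes, with 50 genes shifted in the treated group.
rng = np.random.default_rng(0)
n_per_group, n_genes = 5, 200
control = rng.normal(0.0, 1.0, size=(n_per_group, n_genes))
treated = rng.normal(0.0, 1.0, size=(n_per_group, n_genes))
treated[:, :50] += 3.0

expr = np.vstack([control, treated])
labels = [0] * n_per_group + [1] * n_per_group

scores, explained = pca_scores(expr)
separated = groups_separate_on_pc1(scores, labels)
print("PC1/PC2 explained variance:", explained)
print("groups separate on PC1:", separated)
```

On real data, the same scores would typically be plotted and inspected by eye; samples that fall within the wrong group or far from every group are candidates for closer scrutiny before differential expression analysis.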
