Front Bioinform. 2021 Nov 25:1:693836.
doi: 10.3389/fbinf.2021.693836. eCollection 2021.

GeneCloudOmics: A Data Analytic Cloud Platform for High-Throughput Gene Expression Analysis

Mohamed Helmy et al. Front Bioinform.

Abstract

Gene expression profiling techniques, such as DNA microarray and RNA-Sequencing, have had a significant impact on our understanding of biological systems. They contribute to almost all aspects of biomedical research, including the study of developmental biology, host-parasite relationships, disease progression and drug effects. However, high-throughput data generation makes it challenging for many wet-lab experimentalists to analyze such rich and complex data and take full advantage of it. Here we present GeneCloudOmics, an easy-to-use web server for high-throughput gene expression analysis that extends the functionality of our previous tool, ABioTrans, with several new tools, including protein dataset analysis, and a web interface. GeneCloudOmics supports both microarray and RNA-Seq data analysis with a comprehensive range of data analytics tools in one package, a combination that no other current standalone software or web-based tool offers. In total, GeneCloudOmics gives the user access to 23 different data analytics and bioinformatics tasks, including read normalization, scatter plots, linear/non-linear correlations, PCA, clustering (hierarchical, k-means, t-SNE, SOM), differential expression analyses, pathway enrichment, evolutionary analyses, pathological analyses, and protein-protein interaction (PPI) identification. Furthermore, GeneCloudOmics allows the direct import of gene expression data from the NCBI Gene Expression Omnibus database. The user can perform all tasks rapidly through an intuitive graphical user interface that removes the need for coding, for installing tools, packages and libraries, and for dealing with operating system compatibility and version issues, complications that make data analysis tasks challenging for biologists. Thus, GeneCloudOmics is a one-stop open-source tool for gene expression data analysis and visualization. It is freely available at http://combio-sifbi.org/GeneCloudOmics.

Keywords: OMICS data; RNA-seq; bioinformatics; data analytics; gene expression analysis; microarray; transcriptomics.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
The gene expression profiling workflow. The RNA sequencer produces raw RNA reads that are aligned to the cell’s genome and processed through the quality control (QC) steps. The raw read counts resulting from QC are then normalized and analyzed statistically to infer differential gene expression (DGE) or to perform other analyses such as Shannon entropy, correlation or PCA. Several bioinformatics analyses can also be performed on the list of differentially expressed genes (DEGs) for functional inference.
FIGURE 2
Schematic overview of GeneCloudOmics. (A) RNA-Seq and microarray data uploading, (B) data pre-processing, (C) transcriptome data analysis (e.g., correlation, DGE analysis and heatmap clustering), and (D) gene or protein bioinformatics analysis.
FIGURE 3
Demonstration of key transcriptomic analyses using GeneCloudOmics. (A–F) use bulk RNA-Seq data from human T-regulatory cell differentiation, and (G) uses single-cell RNA-Seq data from mouse distal lung epithelium. (A) RLE plot of raw and normalized data, showing that sample variation is reduced after normalization. (B) Comparison of the transcriptome-wide distribution with six model distributions to select a suitable expression cut-off threshold. (C) Between-replicate transcriptome-wide variation visualized by scatter plot. (D) Pairwise Pearson correlation between all samples. (E) Principal component analysis visualizes all sample data points in 2 dimensions. (F) Hierarchical clustering reveals common expression patterns throughout the T cell differentiation process, visualized by a heat map of expression level. (G) Random Forest clustering divides single cells according to their developmental stages.
FIGURE 4
Demonstration of gene and protein bioinformatics analyses using GeneCloudOmics. (A) Pathway enrichment analysis, (B) gene ontology (GO), (C) protein-protein interaction, (D) protein phylogenetic analysis, (E) protein pathological analysis, and (F) protein physicochemical properties (acidity, charge and hydrophobicity) for a proteomics dataset of Scutellarein-treated AGS gastric cancer cell lines.
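
Three of the analysis steps named in the Figure 1 caption (normalization, correlation and Shannon entropy) can be illustrated with a minimal Python sketch. This is not the GeneCloudOmics implementation. The function names are hypothetical, counts-per-million scaling is assumed here only as one simple normalization choice, and the entropy is computed over a sample's expression values treated as a probability distribution, which is likewise an assumption.

```python
import math


def cpm(counts):
    """Scale one sample's raw read counts to counts per million (assumed method)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return [c * 1_000_000 / total for c in counts]


def pearson(x, y):
    """Pearson correlation between two equally long expression vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def shannon_entropy(values):
    """Shannon entropy (in bits) of non-negative values treated as a distribution."""
    total = sum(values)
    probs = [v / total for v in values if v > 0]
    return -sum(p * math.log2(p) for p in probs)


# Toy data: read counts for four genes in two replicates (made-up numbers).
replicate_1 = cpm([10, 20, 30, 40])
replicate_2 = cpm([12, 18, 33, 37])
print(pearson(replicate_1, replicate_2))
print(shannon_entropy(replicate_1))
```

The sketch uses only the standard library so that each step stays readable; a real analysis would typically rely on an established statistics package and on a normalization method suited to the experiment.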
