Nucleic Acids Res. 2016 Apr 20;44(7):e62.
doi: 10.1093/nar/gkv1478. Epub 2016 Jan 14.

CrossHub: a tool for multi-way analysis of The Cancer Genome Atlas (TCGA) in the context of gene expression regulation mechanisms

George S Krasnov et al. Nucleic Acids Res.

Abstract

The contribution of different mechanisms to the regulation of gene expression varies for different tissues and tumors. Complementation of predicted mRNA-miRNA and gene-transcription factor (TF) relationships with the results of expression correlation analyses derived for specific tumor types outlines the interactions with functional impact in the current biomaterial. We developed CrossHub software, which enables two-way identification of most possible TF-gene interactions: on the basis of ENCODE ChIP-Seq binding evidence or Jaspar prediction and co-expression according to the data of The Cancer Genome Atlas (TCGA) project, the largest cancer omics resource. Similarly, CrossHub identifies mRNA-miRNA pairs with predicted or validated binding sites (TargetScan, mirSVR, PicTar, DIANA microT, miRTarBase) and strong negative expression correlations. We observed partial consistency between ChIP-Seq or miRNA target predictions and gene-TF/miRNA co-expression, demonstrating a link between these indicators. Additionally, CrossHub expression-methylation correlation analysis can be used to identify hypermethylated CpG sites or regions with the greatest potential impact on gene expression. Thus, CrossHub is capable of outlining molecular portraits of a specific gene and determining the three most common sources of expression regulation: promoter/enhancer methylation, miRNA interference and TF-mediated activation or repression. CrossHub generates formatted Excel workbooks with the detailed results. CrossHub is freely available at https://sourceforge.net/projects/crosshub/.
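The idea of pairing a binding-site prediction with a strong negative expression correlation can be illustrated with a minimal Python sketch. It is not CrossHub's code: the function names, the toy expression values, the placeholder identifiers (`miR-A`, `GENE1`, `GENE2`) and the cutoff `max_rs=-0.3` are all assumptions chosen for illustration.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman coefficient, computed as the Pearson correlation of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no rank information
    return cov / sqrt(var_x * var_y)

def supported_pairs(mirna_expr, gene_expr, predicted_pairs, max_rs=-0.3):
    """Keep predicted miRNA-gene pairs whose expression is anti-correlated.

    max_rs is a hypothetical cutoff, not a value taken from the paper.
    """
    hits = []
    for mirna, gene in sorted(predicted_pairs):
        rs = spearman(mirna_expr[mirna], gene_expr[gene])
        if rs <= max_rs:
            hits.append((mirna, gene, rs))
    return hits

# Toy per-sample expression values (invented for illustration).
mirna_expr = {"miR-A": [1.0, 2.0, 3.0, 4.0, 5.0]}
gene_expr = {
    "GENE1": [9.0, 7.5, 6.0, 4.0, 2.0],  # falls as miR-A rises
    "GENE2": [1.0, 3.0, 2.0, 5.0, 4.0],  # rises with miR-A
}
predicted_pairs = {("miR-A", "GENE1"), ("miR-A", "GENE2")}

hits = supported_pairs(mirna_expr, gene_expr, predicted_pairs)
print(hits)
```

In this toy input both pairs carry a predicted site, but only the anti-correlated one survives the filter, which is the kind of narrowing the abstract describes.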

Figures

Figure 1.
CrossHub workflow. Complementing ENCODE ChIP-Seq data and Jaspar predictions with TCGA expression correlation analysis allows the user to outline interactions with potential functional impact in a specific cancer subtype. Similarly, combining miRNA target predictions with gene–miRNA expression correlation profiling (based on TCGA expression data) highlights gene–miRNA interactions that likely take place in a particular tumor type. Expression-methylation correlation analysis allows identification of hypermethylated CpG sites or regions within promoters or enhancers (annotated with ENCODE) that have the greatest potential impact on gene expression. In addition, CrossHub enables conventional differential expression (DE) and methylation analysis.
Figure 2.
Associations between the promoter hypermethylation score (HMS) and the logarithm of gene expression level changes in tumors (LogFC). Circle colors indicate the gene DE reliability score, which is proportional to the absolute values of LogFC and of the logarithm of the false-discovery rate (FDR). Circle size is proportional to the square root of the total read count for a gene. For all three cancers, a significant increase in the ratio of downregulated genes was observed for genes with positive promoter hypermethylation scores. We selected several HMS thresholds (THMS) to test the statistical significance of differences between the distributions of LogFC for genes with HMS ≥ THMS and HMS < THMS. Vertical dashed lines indicate mean LogFC for these groups. Average LogFC decreases with increasing HMS.
Figure 3.
Distribution of genes for two parameters: ENCODE ChIP-Seq transcription factor (TF) binding score and Spearman gene–TF expression correlation coefficient (rs; colon cancer TCGA dataset). Two gene sets were analyzed: genes participating in glucose transport and metabolism (top) and genes encoding extracellular proteins (bottom). Genes with no ChIP-Seq evidence of TF binding are assigned a score of zero. Circle size is proportional to the square root of the total read count for a gene. Circle color indicates the gene expression level change in tumor. The analysis was performed for two TFs strongly upregulated in colon cancer: the well-known oncogenic protein Myc and CBX3, which is less extensively studied in the context of cancer. We compared the distributions of rs between genes that passed and did not pass score thresholds (TS). Several TS were selected: >0 (any positive score) and the 25th, 50th, 75th and 90th score percentiles. Vertical dashed lines indicate mean values of rs for these groups. For each TS we observed a statistically significant difference between the distributions of rs, indicating a link between these characteristics: ChIP-Seq score and TF–gene co-expression.
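The threshold comparison in this caption can be sketched in Python as follows. The caption does not name the statistical test, so the permutation test on the difference of group means is a stand-in chosen for this sketch, and the (score, rs) values and function names are invented for illustration.

```python
import random

def split_by_score(genes, threshold):
    """Split (score, rs) records into rs values above and not above a threshold."""
    passed = [rs for score, rs in genes if score > threshold]
    failed = [rs for score, rs in genes if score <= threshold]
    return passed, failed

def mean(values):
    return sum(values) / len(values)

def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test on the absolute difference of group means.

    This is an assumed stand-in; the source does not say which test was used.
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_perm + 1)

# Toy (ChIP-Seq score, rs) records; a score of 0 means no binding evidence.
genes = [
    (0, -0.10), (0, 0.05), (0, 0.00), (0, -0.05),
    (120, 0.30), (300, 0.45), (450, 0.40), (800, 0.55),
]

passed, failed = split_by_score(genes, threshold=0)  # "any positive score"
mean_passed, mean_failed = mean(passed), mean(failed)
p_value = permutation_p_value(passed, failed)
print(mean_passed, mean_failed, p_value)
```

The two group means correspond to the vertical dashed lines in the figure; repeating the split at higher thresholds, such as score percentiles, would follow the same pattern.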
Figure 4.
Distribution density of gene–microRNA pairs in expression level correlation coefficients (rs) and miRNA binding site scores according to TargetScan (A and B), DIANA microT (C), miRTarBase (D and E) and the overall score according to several algorithms (F). TargetScan (conservative sites) and DIANA microT showed the greatest mean rs bias among the analyzed prediction algorithms. Distribution density is slightly asymmetrical for these databases, especially for high scores (d = 0.02–0.045). These areas are marked with an arrow. However, miRNA–gene relationships predicted by several algorithms showed a prominent rs bias (d = 0.109; overall score range 100–225; F). This region represents the maximum number of true miRNA–gene relationships with functional impact. Another region with a significant rs bias (d = 0.052, overall score >400) mainly includes miRNA–gene relationships with strong experimental evidence or weak evidence coupled to miRNA binding site prediction by one or more algorithms.
