Genome Res. 2017 Nov;27(11):1843-1858.
doi: 10.1101/gr.216721.116. Epub 2017 Oct 11.

Co-expression networks reveal the tissue-specific regulation of transcription and splicing


Ashis Saha et al. Genome Res. 2017 Nov.

Abstract

Gene co-expression networks capture biologically important patterns in gene expression data, enabling functional analyses of genes, discovery of biomarkers, and interpretation of genetic variants. Most network analyses to date have been limited to assessing correlation between total gene expression levels in a single tissue or small sets of tissues. Here, we built networks that additionally capture the regulation of relative isoform abundance and splicing, along with tissue-specific connections unique to each of a diverse set of tissues. We used the Genotype-Tissue Expression (GTEx) project v6 RNA sequencing data across 50 tissues and 449 individuals. First, we developed a framework called Transcriptome-Wide Networks (TWNs) for combining total expression and relative isoform levels into a single sparse network, capturing the interplay between the regulation of splicing and transcription. We built TWNs for 16 tissues and found that hubs in these networks were strongly enriched for splicing and RNA binding genes, demonstrating their utility in unraveling regulation of splicing in the human transcriptome. Next, we used a Bayesian biclustering model that identifies network edges unique to a single tissue to reconstruct Tissue-Specific Networks (TSNs) for 26 distinct tissues and 10 groups of related tissues. Finally, we found genetic variants associated with pairs of adjacent nodes in our networks, supporting the estimated network structures and identifying 20 genetic variants with distant regulatory impact on transcription and splicing. Our networks provide an improved understanding of the complex relationships of the human transcriptome across tissues.


Figures

Figure 1.
Transcriptome-Wide Network conceptual framework. (A) Schematic of the effect of a splicing regulator on inclusion of a cassette exon and resulting total expression and isoform ratios of the target gene. Splicing factor expression levels can affect splicing of target genes (Sveen et al. 2015). Higher expression of a splicing regulator S (first row) results in relatively more transcripts of isoform-1 and fewer of isoform-2. Total expression level is constant (5), but isoform ratios are different (0.4 and 0.6) as splicing factor S levels change (second row). (B) The (i,j)th cell of the sample covariance matrix contains covariance (Cij) between the ith and jth feature in data. We created a sparse precision matrix Θ (inverse covariance) from the sample covariance matrix using a graphical lasso to estimate the parameters of a Gaussian Markov random field. A nonzero value (Θij) in the precision matrix denotes an edge between the ith feature and jth feature in the network. (C) Edges in a TWN represent diverse relationships between total expression (TE) and isoform ratio (IR) nodes. Dotted rectangles group together isoform ratios for different isoforms of the same gene. Of particular focus are network “hub” nodes; in a TWN, there are four possible hub configurations depending on the node type of the central and neighboring nodes.
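Panels A and B can be illustrated with a minimal Python sketch. It computes isoform ratios from per-isoform counts and reads network edges off a precision matrix. The counts, the matrix values, and the function names `isoform_ratios` and `edges_from_precision` are invented for illustration; the graphical lasso estimation itself is not shown.

```python
def isoform_ratios(isoform_counts):
    """Return each isoform's share of the gene's total expression."""
    total = sum(isoform_counts)
    if total == 0:
        raise ValueError("gene has zero total expression")
    return [count / total for count in isoform_counts]


def edges_from_precision(theta, tol=1e-12):
    """List (i, j) pairs with i < j whose off-diagonal entry is nonzero."""
    n = len(theta)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if abs(theta[i][j]) > tol]


# Illustrative counts: total expression is 5 in both conditions,
# but the split between isoform-1 and isoform-2 changes.
low_s = isoform_ratios([2, 3])   # [0.4, 0.6]
high_s = isoform_ratios([3, 2])  # [0.6, 0.4]

# Illustrative 3x3 sparse precision matrix (values are invented).
theta = [
    [1.0, -0.3, 0.0],
    [-0.3, 1.2, 0.0],
    [0.0, 0.0, 0.9],
]
edges = edges_from_precision(theta)  # [(0, 1)]
```

Only features 0 and 1 are connected here, because theirs is the only nonzero off-diagonal entry of the matrix.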
Figure 2.
GTEx Transcriptome-Wide Networks summary and replication. (A) For each tissue, number of edges and number of hub nodes (≥10 neighbors), segmented by the type of nodes connected by each edge. A “TE-IR” hub is a TE node with multiple IR neighbors, and an “IR-TE” hub is an IR node with multiple TE neighbors. (B) Fraction of whole blood TWN edges replicating in an independent RNA-seq data set (DGN) (Battle et al. 2014; Mostafavi et al. 2014).
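The hub counts in panel A could be tallied roughly as in the sketch below. It assumes an undirected edge list and a mapping from each node to its type ("TE" or "IR"), and it applies the ≥10 threshold to the neighbors of each type separately. That per-type reading of the threshold is an assumption, as are the function name `count_hubs` and the toy data.

```python
from collections import defaultdict

HUB_MIN_NEIGHBORS = 10  # threshold stated in the Figure 2 caption


def count_hubs(edges, node_type, min_neighbors=HUB_MIN_NEIGHBORS):
    """Count hubs per "central type-neighbor type" configuration.

    edges: iterable of (u, v) node pairs, treated as undirected.
    node_type: dict mapping node -> "TE" or "IR".
    """
    # neighbors[node][type] = set of that node's neighbors of the given type
    neighbors = defaultdict(lambda: defaultdict(set))
    for u, v in edges:
        neighbors[u][node_type[v]].add(v)
        neighbors[v][node_type[u]].add(u)

    hubs = defaultdict(int)
    for node, by_type in neighbors.items():
        for nbr_type, nbrs in by_type.items():
            if len(nbrs) >= min_neighbors:
                hubs[f"{node_type[node]}-{nbr_type}"] += 1
    return dict(hubs)


# Toy network: one TE node ("SF") linked to ten IR nodes and one TE node.
node_type = {"SF": "TE", "G": "TE"}
edges = [("SF", "G")]
for k in range(10):
    ir = f"gene{k}.iso1"
    node_type[ir] = "IR"
    edges.append(("SF", ir))

hub_counts = count_hubs(edges, node_type)  # {"TE-IR": 1}
```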
Figure 3.
Enrichment of candidate splicing regulators among TWN hubs. (A) In each TWN, the odds ratio and P-value of enrichment among the top 500 TE-IR hub genes for GO annotations reflecting RNA binding and RNA splicing. (B) Among consensus TE-IR hubs across all tissues, enrichment for GO annotations reflecting RNA binding and RNA splicing functions.
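An odds ratio and P-value of this kind can be computed from a 2x2 table of hub status against GO annotation. The sketch below assumes a one-sided Fisher exact test, which the caption does not specify; the function names and the counts are hypothetical.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 table [[a, b], [c, d]].

    a: hub genes with the annotation      b: hub genes without it
    c: non-hub genes with the annotation  d: non-hub genes without it
    """
    if b == 0 or c == 0:
        return float("inf")
    return (a * d) / (b * c)


def fisher_greater(a, b, c, d):
    """One-sided Fisher exact P-value: P(X >= a) under the hypergeometric null."""
    hubs = a + b            # number of genes drawn (hub genes)
    annotated = a + c       # number of annotated genes overall
    n = a + b + c + d       # all genes tested
    upper = min(hubs, annotated)
    tail = sum(comb(annotated, k) * comb(n - annotated, hubs - k)
               for k in range(a, upper + 1))
    return tail / comb(n, hubs)


# Invented counts: 3 of 4 hubs are annotated versus 1 of 4 non-hubs.
or_value = odds_ratio(3, 1, 1, 3)      # 9.0
p_value = fisher_greater(3, 1, 1, 3)   # 17/70
```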
Figure 4.
Pathway enrichment in TWNs. Tissue colors are matched with tissue names in Figure 2. (A) Per-tissue, the number of Reactome pathways enriched among connected components/total number of tested pathways for that tissue, considering only TE nodes. (B) Enrichment for shared Reactome pathway annotation among gene pairs connected by an edge between two TE nodes. (C) Enrichment for shared Reactome pathway annotation among gene pairs connected by an edge between a TE and an IR node.
Figure 5.
Cross-tissue comparison of TSN results. (A) Jaccard coefficient quantified on shared edges (upper triangular) and shared nodes (lower triangular) across pairs of TSNs. (B) Gene expression levels, removing factors from BicMix not included in the network, for the genes identified in the TSN for artery–aorta. The y-axis is ordered by similarity to artery–aorta, with a star by the samples from artery–aorta. The colors on the y-axis correspond to the GTEx tissue legend above. The x-axis is ordered by expression similarity (i.e., hierarchical clustering), and hub genes are labeled, with the large hub denoted in bold. (C) TSN for artery–coronary. Node size reflects betweenness centrality of the nodes. Orange nodes reflect replication in the BioCarta acute myocardial infarction (AMI) pathway; orange edges show the neighbors of the AMI pathway nodes.
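The Jaccard coefficient in panel A is the size of the intersection of two sets divided by the size of their union. A minimal sketch for a pair of networks follows; the gene names and the helper names are hypothetical, and edges are treated as undirected.

```python
def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B|; defined as 0.0 for two empty sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def undirected(edges):
    """Normalize edges so that (u, v) and (v, u) compare equal."""
    return {frozenset(edge) for edge in edges}


def nodes_of(edges):
    """Collect every node that appears in at least one edge."""
    return {node for edge in edges for node in edge}


# Two toy tissue-specific networks.
tsn_a = [("g1", "g2"), ("g2", "g3")]
tsn_b = [("g2", "g1"), ("g3", "g4")]

edge_jaccard = jaccard(undirected(tsn_a), undirected(tsn_b))  # 1 shared of 3
node_jaccard = jaccard(nodes_of(tsn_a), nodes_of(tsn_b))      # 3 shared of 4
```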
Figure 6.
Association of local genetic variants with distant network neighbors. (A) Enrichment of association between rs113305055, a genetic variant near a cross-tissue TWN hub TMEM160, with all isoform ratios genome-wide in artery–tibial. (B) Enrichment of associations between local genetic variants (either the top cis-eVariant or any variant within 20 kb) of each gene, and network neighbors in the TSNs. (C) Enrichment of association between rs115419420, a genetic variant local to CRELD1, with all isoform ratios in skeletal muscle.


