2018 Jul 20;19(4):575-592.
doi: 10.1093/bib/bbw139.

Gene co-expression analysis for functional classification and gene-disease predictions

Sipko van Dam et al. Brief Bioinform.

Abstract

Gene co-expression networks can be used to associate genes of unknown function with biological processes, to prioritize candidate disease genes or to discern transcriptional regulatory programmes. With recent advances in transcriptomics and next-generation sequencing, co-expression networks constructed from RNA sequencing data also enable the inference of functions and disease associations for non-coding genes and splice variants. Although gene co-expression networks typically do not provide information about causality, emerging methods for differential co-expression analysis are enabling the identification of regulatory genes underlying various phenotypes. Here, we introduce and guide researchers through a (differential) co-expression analysis. We provide an overview of methods and tools used to create and analyse co-expression networks constructed from gene expression data, and we explain how these can be used to identify genes with a regulatory role in disease. Furthermore, we discuss the integration of other data types with co-expression networks and offer future perspectives of co-expression analysis.

Figures

Figure 1
Example of a co-expression network analysis. First, pairwise correlation is determined for each possible gene pair in the expression data. These pairwise correlations can then be represented as a network. Modules within these networks are defined using clustering analysis. The network and modules can be interrogated to identify regulators, functional enrichment and hub genes. Differential co-expression analysis can be used to identify modules that behave differently under different conditions. Potential disease genes can be identified using a guilt-by-association (GBA) approach that highlights genes that are co-expressed with multiple disease genes.
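
The first steps of this workflow can be sketched in a few lines of Python. This is a minimal, dependency-free illustration, not the authors' pipeline: the expression values, the gene names and the 0.8 correlation cutoff are invented for the example, and connected components of the thresholded network stand in for a proper clustering method.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpression_network(expr, threshold=0.8):
    """Connect every gene pair whose absolute correlation reaches the cutoff."""
    adj = {gene: set() for gene in expr}
    for g1, g2 in combinations(expr, 2):
        if abs(pearson(expr[g1], expr[g2])) >= threshold:
            adj[g1].add(g2)
            adj[g2].add(g1)
    return adj


def modules(adj):
    """Connected components, used here as a stand-in for clustering."""
    seen, found = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adj[node] - component)
        seen |= component
        found.append(component)
    return found


# Hypothetical expression matrix: gene -> values across five samples.
expr = {
    "A": [1, 2, 3, 4, 5],
    "B": [2, 4, 6, 8, 10],
    "C": [1.1, 1.9, 3.2, 3.9, 5.1],
    "D": [5, 1, 4, 2, 3],
    "E": [4.8, 1.2, 4.1, 2.2, 2.9],
}

network = coexpression_network(expr)
found = modules(network)
# Node degree is the simplest way to rank candidate hub genes.
degree = {gene: len(neighbours) for gene, neighbours in network.items()}
print(found, degree)
```

On this toy data, genes A, B and C form one module and D and E another. With real data, the cutoff and the clustering method are analysis choices that need to be justified for the data set at hand.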
Figure 2
Hypothetical network explaining inter- and intra-modular hubs and network centrality. The inter-modular hub has a high network centrality, as it is required for the largest number of shortest paths between all possible node pairs. The red line indicates an example of a shortest path through the network between a pair of nodes. Intra-modular hubs (marked in orange) are central to individual modules and usually have high biological relevance.
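
The centrality described here, the number of shortest paths that pass through a node, is commonly called betweenness centrality. The sketch below computes it with Brandes' algorithm for an unweighted, undirected network. The graph is a made-up example (two triangular modules joined by a single node `H`), not the network drawn in the figure.

```python
from collections import deque


def betweenness(adj):
    """Betweenness centrality (Brandes' algorithm), unweighted and undirected."""
    score = {v: 0.0 for v in adj}
    for source in adj:
        order = []                       # nodes in order of increasing distance
        preds = {v: [] for v in adj}     # predecessors on shortest paths
        paths = {v: 0 for v in adj}      # number of shortest paths from source
        dist = {v: -1 for v in adj}
        paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:                     # breadth-first search from source
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        dependency = {v: 0.0 for v in adj}
        for w in reversed(order):        # accumulate from the farthest nodes back
            for v in preds[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each unordered node pair was counted from both ends.
    return {v: s / 2 for v, s in score.items()}


def build_graph(edges):
    """Adjacency sets from a list of undirected edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


# Hypothetical network: modules a1-a3 and b1-b3, bridged by node H.
graph = build_graph([
    ("a1", "a2"), ("a2", "a3"), ("a1", "a3"),
    ("b1", "b2"), ("b2", "b3"), ("b1", "b3"),
    ("a1", "H"), ("H", "b1"),
])
centrality = betweenness(graph)
print(max(centrality, key=centrality.get))
```

In this toy graph every path between the two modules runs through `H`, so it receives the highest score, while `a1` and `b1` rank next because they connect their own module to the bridge.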
Figure 3
Changes in gene co-expression patterns that can occur between samples. Differential co-expression can occur as the presence of a module in only one of the sample groups (A), as differences in the structure of the module (B) or as differences in the correlation strength between members of the modules (C). Additionally, differential co-expression can be detected if one larger interconnected module splits into several smaller ones (D) or if a group of genes changes its correlation partners [‘gene hopping’ (E)]. If sample groups are not defined before the differential co-expression analysis, or are unknown, biclustering methods can identify modules unique to a subpopulation of samples by simultaneously classifying the samples into groups in which these modules exist (F).
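
Case (C), a change in correlation strength between two predefined sample groups, is the simplest of these patterns to illustrate. The sketch below compares pairwise Pearson correlations in two groups and reports gene pairs whose correlation shifts by at least a chosen amount. The expression values, the gene names and the `min_change` cutoff are invented for the example; a real analysis would need many more samples and a significance test rather than a fixed cutoff.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def differential_pairs(group_a, group_b, min_change=1.0):
    """Gene pairs whose correlation differs between groups by >= min_change."""
    changed = {}
    for g1, g2 in combinations(sorted(group_a), 2):
        r_a = pearson(group_a[g1], group_a[g2])
        r_b = pearson(group_b[g1], group_b[g2])
        if abs(r_a - r_b) >= min_change:
            changed[(g1, g2)] = (round(r_a, 2), round(r_b, 2))
    return changed


# Hypothetical data: only G2 behaves differently in the second group.
controls = {
    "G1": [1, 2, 3, 4, 5],
    "G2": [2, 4, 5, 8, 10],
    "G3": [5, 3, 4, 1, 2],
}
cases = {
    "G1": [1, 2, 3, 4, 5],
    "G2": [9, 2, 7, 1, 5],
    "G3": [5, 3, 4, 1, 2],
}

changed = differential_pairs(controls, cases)
for pair, (r_controls, r_cases) in changed.items():
    print(pair, r_controls, r_cases)
```

Both reported pairs involve G2, which tracks G1 in the control group but G3 in the case group, while the G1 and G3 relationship is unchanged.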
Figure 4
Strategies for integrating multi-omics data with co-expression analyses. Networks are more informative if they are constructed using expression data specific to the tissue of interest. Genomic variation can be mapped to a co-expression network either by linking suggestive GWAS hits to the genes in the network or by first identifying genetic variants with an effect on gene expression levels (cis- and trans-eQTLs) and then mapping those to the co-expression network. Additional data layers may include TFBSs (based on binding motifs or ChIP-seq/ChIP-chip experiments), miRNA target binding sites (based on in silico predictions or experimental techniques) and established protein–protein interactions. A co-expression network can be used to identify modules and hub genes and to predict the function of unknown trait-associated genes. Identified modules can be analysed by enrichment analyses to identify overlaying features. Additionally, the research hypothesis can be supported by additional differential expression, co-expression and methylation analyses that can be performed if respective omics data are available for cases and controls for a corresponding trait. eQTL: expression quantitative trait loci; GWAS: genome-wide association study; OMIM: online Mendelian inheritance in man; miRNA: microRNA; PPI: protein–protein interaction; TF: transcription factor; TFBS: TF binding site.
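
One common form of enrichment analysis asks whether a module contains more genes carrying some feature (for example, genes linked to a GWAS hit) than expected by chance. The sketch below uses a one-sided hypergeometric test for this. The gene identifiers and set sizes are hypothetical, and the choice of test is an assumption for illustration, not a method prescribed by the source.

```python
from math import comb


def enrichment_pvalue(module, feature_genes, background):
    """One-sided hypergeometric test: P(overlap >= observed) by chance."""
    module = module & background
    feature_genes = feature_genes & background
    observed = len(module & feature_genes)
    n_background = len(background)
    n_feature = len(feature_genes)
    n_module = len(module)
    total = comb(n_background, n_module)
    upper = min(n_module, n_feature)
    tail = sum(
        comb(n_feature, i) * comb(n_background - n_feature, n_module - i)
        for i in range(observed, upper + 1)
    )
    return observed, tail / total


# Hypothetical example: 20 background genes, 5 of them linked to a GWAS hit.
background = {f"gene_{i:02d}" for i in range(20)}
gwas_genes = {f"gene_{i:02d}" for i in range(5)}
module = {"gene_00", "gene_01", "gene_02", "gene_10"}

overlap, p_value = enrichment_pvalue(module, gwas_genes, background)
print(overlap, p_value)
```

When many modules and feature sets are tested, the resulting p-values should be corrected for multiple testing before drawing conclusions.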
