Gene co-expression analysis for functional classification and gene-disease predictions

Sipko van Dam¹, Urmo Võsa¹, Adriaan van der Graaf¹, Lude Franke¹, João Pedro de Magalhães²

Affiliations

¹ Department of Genetics, UMCG HPC CB50, RB Groningen, Netherlands.
² Institute of Ageing and Chronic Disease, University of Liverpool, Liverpool, UK.

PMID: 28077403
PMCID: PMC6054162
DOI: 10.1093/bib/bbw139

Sipko van Dam et al. Brief Bioinform. 2018 Jul 20;19(4):575-592.

Abstract

Gene co-expression networks can be used to associate genes of unknown function with biological processes, to prioritize candidate disease genes or to discern transcriptional regulatory programmes. With recent advances in transcriptomics and next-generation sequencing, co-expression networks constructed from RNA sequencing data also enable the inference of functions and disease associations for non-coding genes and splice variants. Although gene co-expression networks typically do not provide information about causality, emerging methods for differential co-expression analysis are enabling the identification of regulatory genes underlying various phenotypes. Here, we introduce and guide researchers through a (differential) co-expression analysis. We provide an overview of methods and tools used to create and analyse co-expression networks constructed from gene expression data, and we explain how these can be used to identify genes with a regulatory role in disease. Furthermore, we discuss the integration of other data types with co-expression networks and offer future perspectives of co-expression analysis.

Figures

**Figure 1**
Example of a co-expression network analysis. First, pairwise correlation is determined for each possible gene pair in the expression data. These pairwise correlations can then be represented as a network. Modules within these networks are defined using clustering analysis. The network and modules can be interrogated to identify regulators, functional enrichment and hub genes. Differential co-expression analysis can be used to identify modules that behave differently under different conditions. Potential disease genes can be identified using a guilt-by-association (GBA) approach that highlights genes that are co-expressed with multiple disease genes.
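
The workflow in this caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the expression values, the gene names, the 0.9 correlation threshold and the set of "known disease genes" are all hypothetical, and connected components stand in for the clustering step that dedicated tools perform with more refined methods.

```python
from itertools import combinations
from math import sqrt

# Hypothetical expression values (one list of six samples per gene).
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "geneB": [1.1, 2.3, 2.9, 4.2, 5.1, 5.8],
    "geneC": [2.0, 4.1, 6.2, 7.9, 10.1, 12.0],
    "geneD": [5.0, 1.0, 4.0, 2.0, 6.0, 3.0],
    "geneE": [5.2, 1.1, 3.8, 2.1, 6.3, 2.9],
}
known_disease_genes = {"geneA", "geneB"}  # assumed labels, for illustration only


def pearson(x, y):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def build_network(expr, threshold=0.9):
    """Connect every gene pair whose absolute correlation reaches the threshold."""
    network = {gene: set() for gene in expr}
    for g1, g2 in combinations(expr, 2):
        if abs(pearson(expr[g1], expr[g2])) >= threshold:
            network[g1].add(g2)
            network[g2].add(g1)
    return network


def find_modules(network):
    """Return connected components as a simple stand-in for module detection."""
    seen, modules = set(), []
    for start in network:
        if start in seen:
            continue
        module, stack = set(), [start]
        while stack:
            gene = stack.pop()
            if gene not in module:
                module.add(gene)
                stack.extend(network[gene] - module)
        seen |= module
        modules.append(module)
    return modules


def gba_scores(network, disease_genes):
    """Guilt by association: count co-expressed known disease genes per candidate."""
    return {
        gene: len(neighbours & disease_genes)
        for gene, neighbours in network.items()
        if gene not in disease_genes
    }


network = build_network(expression)
modules = find_modules(network)
scores = gba_scores(network, known_disease_genes)
```

With these toy values, `geneC` is co-expressed with both assumed disease genes and therefore ranks highest among the candidates, while `geneD` and `geneE` form a separate module with no disease-gene neighbours.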

**Figure 2**
Hypothetical network explaining inter- and intra-modular hubs and network centrality. The inter-modular hub has a high network centrality, as it is required for the largest number of shortest paths between all possible node pairs. The red line indicates an example of a shortest path through the network between a pair of nodes. Intra-modular hubs (marked with orange) are central to individual modules and usually have high biological relevance.
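
The notion of centrality used here, the number of shortest paths that pass through a node, corresponds to betweenness centrality. The sketch below computes it with Brandes' algorithm on a hypothetical unweighted network of three fully connected modules joined by a single node named `hub`. The node names and the topology are assumptions made for illustration and do not reproduce the figure.

```python
from collections import deque

# Hypothetical network: three triangles (modules) joined through one node.
edges = [
    ("a1", "a2"), ("a1", "a3"), ("a2", "a3"),
    ("b1", "b2"), ("b1", "b3"), ("b2", "b3"),
    ("c1", "c2"), ("c1", "c3"), ("c2", "c3"),
    ("hub", "a1"), ("hub", "b1"), ("hub", "c1"),
]

graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)


def betweenness_centrality(graph):
    """Unnormalised betweenness for an undirected, unweighted graph (Brandes)."""
    centrality = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search that counts shortest paths from the source.
        order = []
        predecessors = {node: [] for node in graph}
        n_paths = {node: 0 for node in graph}
        distance = {node: -1 for node in graph}
        n_paths[source], distance[source] = 1, 0
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for nbr in graph[node]:
                if distance[nbr] < 0:
                    distance[nbr] = distance[node] + 1
                    queue.append(nbr)
                if distance[nbr] == distance[node] + 1:
                    n_paths[nbr] += n_paths[node]
                    predecessors[nbr].append(node)
        # Accumulate path dependencies from the farthest nodes back to the source.
        dependency = {node: 0.0 for node in graph}
        for node in reversed(order):
            for pred in predecessors[node]:
                share = n_paths[pred] / n_paths[node]
                dependency[pred] += share * (1 + dependency[node])
            if node != source:
                centrality[node] += dependency[node]
    # Each unordered node pair was visited from both ends, so halve the totals.
    return {node: value / 2 for node, value in centrality.items()}


centrality = betweenness_centrality(graph)
most_central = max(centrality, key=centrality.get)
```

In this toy network `hub` lies on every shortest path between modules and receives the highest score. The nodes that attach each module to it (`a1`, `b1`, `c1`) come second, and the remaining module members score zero.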

**Figure 3**
Changes in gene co-expression patterns that can occur between samples. Differential co-expression can occur as the presence of a module in only one of the sample groups (A), as differences in the structure of the module (B) or as differences in the correlation strength between members of the modules (C). Additionally, differential co-expression can be detected if one larger interconnected module splits into several smaller ones (D) or if a group of genes changes its correlation partners [‘gene hopping’ (E)]. If sample groups are not defined before the differential co-expression analysis, or are unknown, biclustering methods can identify modules unique to a subpopulation of samples by simultaneously classifying the samples into groups in which these modules exist (F).
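
Scenario (C), a change in correlation strength between two predefined sample groups, can be illustrated for a single gene pair. The sketch below compares the two correlations with a Fisher z-transformation, which is one common choice and not necessarily the method used by any specific tool covered in the review. The expression values and the group labels are hypothetical.

```python
from math import atanh, erf, sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def differential_correlation(x1, y1, x2, y2):
    """Compare one gene pair's correlation between two sample groups.

    Returns (r1, r2, z, p), where z is the difference of the Fisher-transformed
    correlations in standard-error units and p is the two-sided normal p-value.
    Each group needs more than three samples, and a correlation of exactly
    +1 or -1 cannot be transformed (atanh is undefined there).
    """
    r1, r2 = pearson(x1, y1), pearson(x2, y2)
    standard_error = sqrt(1 / (len(x1) - 3) + 1 / (len(x2) - 3))
    z = (atanh(r1) - atanh(r2)) / standard_error
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return r1, r2, z, p


# Hypothetical expression of gene X and gene Y in eight controls and eight cases.
control_x = [1, 2, 3, 4, 5, 6, 7, 8]
control_y = [1.2, 1.9, 3.4, 3.8, 5.1, 6.3, 6.8, 8.1]
case_x = [1, 2, 3, 4, 5, 6, 7, 8]
case_y = [4, 7, 2, 8, 1, 6, 3, 5]

r_control, r_case, z_score, p_value = differential_correlation(
    control_x, control_y, case_x, case_y
)
```

In this toy data the pair is tightly co-expressed in controls and essentially uncorrelated in cases. A genome-wide analysis would repeat the test for every gene pair and would then need a multiple-testing correction.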

**Figure 4**
Strategies for integrating multi-omics data with co-expression analyses. Networks are more informative if they are constructed using expression data specific to the tissue of interest. Genomic variation can be mapped to a co-expression network either by linking suggestive GWAS hits to the genes in the network or by first identifying genetic variants with an effect on gene expression levels (*cis*- and *trans*-eQTLs) and then mapping those to the co-expression network. Additional data layers may include TFBSs (based on binding motifs or ChIP-seq/ChIP-chip experiments), miRNA target binding sites (based on *in silico* predictions or experimental techniques) and established protein–protein interactions. A co-expression network can be used to identify modules and hub genes and to predict the function of unknown trait-associated genes. Identified modules can be analysed by enrichment analyses to identify overlaying features. Additionally, the research hypothesis can be supported by further differential expression, co-expression and methylation analyses, which can be performed if the respective omics data are available for cases and controls for a corresponding trait. eQTL: expression quantitative trait loci; GWAS: genome-wide association study; OMIM: online Mendelian inheritance in man; miRNA: microRNA; PPI: protein–protein interaction; TF: transcription factor; TFBS: TF binding site.
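
The enrichment step mentioned in this caption is often implemented as a one-sided hypergeometric test, which asks whether a module contains more genes carrying a given annotation (for example predicted targets of one TF) than expected by chance. The sketch below is one such test written with the Python standard library. The gene identifiers, the module and the annotation set are hypothetical, and the review does not prescribe this particular statistic.

```python
from math import comb


def enrichment_p_value(n_universe, n_annotated, n_module, n_overlap):
    """One-sided hypergeometric p-value.

    Probability of drawing at least n_overlap annotated genes when n_module
    genes are sampled without replacement from a universe of n_universe genes,
    of which n_annotated carry the annotation.
    """
    max_overlap = min(n_annotated, n_module)
    tail = sum(
        comb(n_annotated, k) * comb(n_universe - n_annotated, n_module - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / comb(n_universe, n_module)


# Hypothetical data: 20 genes in the network, one module of 5 genes,
# and 6 genes annotated as targets of the same transcription factor.
universe = {f"g{i}" for i in range(20)}
module = {"g0", "g1", "g2", "g3", "g4"}
tf_targets = {"g0", "g1", "g2", "g3", "g10", "g11"}

overlap = module & tf_targets
p_value = enrichment_p_value(
    len(universe), len(tf_targets), len(module), len(overlap)
)
```

Four of the five module genes carry the annotation in this example, which gives a p-value of about 0.014. When many modules and annotation sets are tested, the resulting p-values need a multiple-testing correction.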

Grants and funding

BB/K016741/1/Biotechnology and Biological Sciences Research Council/United Kingdom
