Genes (Basel). 2020 Oct 20;11(10):1231.
doi: 10.3390/genes11101231.

Genome-Wide Co-Expression Distributions as a Metric to Prioritize Genes of Functional Importance

Pâmela A Alexandre et al. Genes (Basel).

Abstract

Genome-wide gene expression analyses are routinely used to gain a systems-level understanding of complex processes, including network connectivity. Network connectivity tends to be built on a small subset of extremely high co-expression signals that are deemed significant, but this overlooks the vast majority of pairwise signals. Here, we developed a computational pipeline that assigns each gene's pairwise genome-wide co-expression distribution to one of eight template distribution shapes. The shapes vary between unimodal, bimodal, skewed, or symmetrical and represent different proportions of positive and negative correlations. We then used a hypergeometric test to determine whether specific gene types (regulators versus non-regulators) and properties (differentially expressed or not) are associated with a particular distribution shape. We applied our methodology to five publicly available RNA sequencing (RNA-seq) datasets from four organisms in different physiological conditions and tissues. Our results suggest that genes can be assigned consistently to pre-defined distribution shapes, with respect to the enrichment of differentially expressed and regulatory genes, in situations involving contrasting phenotypes, time-series, or physiological baseline data. There is indeed a striking additional biological signal present in the genome-wide distribution of co-expression values, which would be overlooked by currently adopted approaches. Our method can be applied to extract further information from transcriptomic data and help uncover the molecular mechanisms involved in the regulation of complex biological processes and phenotypes.

Keywords: correlated gene expression; gene regulation; transcriptome analysis.
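The enrichment step described in the abstract can be illustrated with a one-sided hypergeometric test. The following is a minimal sketch, not the authors' implementation: the function name, argument names, and the choice of an upper-tail (over-representation) p-value are assumptions made for illustration, and the gene counts in the usage lines are invented toy numbers.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, category_genes, shape_genes, overlap):
    """Upper-tail hypergeometric p-value, P(X >= overlap).

    total_genes    -- all expressed genes considered (population size)
    category_genes -- genes in the category of interest, e.g. regulators
    shape_genes    -- genes assigned to one distribution shape (draws)
    overlap        -- genes that are both in the category and in the shape

    All names are hypothetical; the source only states that a
    hypergeometric test was used.
    """
    max_overlap = min(category_genes, shape_genes)
    denominator = comb(total_genes, shape_genes)
    numerator = sum(
        comb(category_genes, k)
        * comb(total_genes - category_genes, shape_genes - k)
        for k in range(overlap, max_overlap + 1)
    )
    return numerator / denominator


# Toy example: 10 genes, 4 regulators, 3 genes in a shape, 2 of them regulators.
p_value = hypergeom_enrichment_p(10, 4, 3, 2)
print(f"enrichment p-value: {p_value:.4f}")
```

A small p-value would suggest that the category is over-represented in that shape. With many shapes and categories tested, a multiple-testing correction would normally be applied, which this sketch omits.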

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Figure 1
Distribution shape templates. Individual genes were assigned to one of these eight distribution shapes based on their correlation coefficients with all other expressed genes. Distributions were based on the proportion of correlations falling in each of the eight bins of width 0.25 spanning the −1 to +1 range.
Figure 2
Proportion of genes falling in each template shape in datasets with contrasting phenotypes: cattle feed efficiency (A) and cattle puberty (B). The proportions of genes classified as differentially expressed (DE), regulator (REG), or both (DE-REG) are compared to the overall (All) proportion of genes within each shape.
Figure 3
Proportion of genes falling in each template shape in contrasting phenotypic conditions within cattle datasets: low feed efficiency (A), high feed efficiency (B), pre-puberty (C), and post-puberty (D). The proportions of genes classified as DE, REG, or DE-REG are compared to the overall (All) proportion of genes within each shape.
Figure 4
Proportion of genes falling in each template shape in time-series datasets: Drosophila embryogenesis (A) and duck preadipocyte (B). The proportions of genes classified as DE, REG, or DE-REG are compared to the overall (All) proportion of genes within each shape. In the Drosophila dataset, DE genes were clustered into four groups (down/down, down/up, up/down, and up/up); refer to the Methods for more information.
Figure 5
Proportion of genes falling in each template shape in the human dataset. The proportions of genes classified as DE, REG, or DE-REG are compared to the overall (All) proportion of genes within each shape.
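The binning described in the Figure 1 caption can be sketched as follows. This is an illustrative sketch under stated assumptions, not the published pipeline. The bin edges are half-open intervals with +1 folded into the last bin. The nearest-template rule uses squared Euclidean distance. The two toy templates are hypothetical: the source does not state how a gene's binned proportions are matched to a shape, nor does it list the template proportions.

```python
def bin_proportions(correlations, n_bins=8):
    """Proportion of correlation coefficients in each equal-width bin of [-1, +1].

    With n_bins=8, each bin is 0.25 wide, as in the Figure 1 caption.
    Bins are [lo, hi), except the last one, which also includes +1
    (an assumption; the source does not specify edge handling).
    """
    if not correlations:
        raise ValueError("at least one correlation coefficient is required")
    width = 2.0 / n_bins
    counts = [0] * n_bins
    for r in correlations:
        if not -1.0 <= r <= 1.0:
            raise ValueError(f"correlation out of range: {r}")
        index = min(int((r + 1.0) / width), n_bins - 1)
        counts[index] += 1
    return [count / len(correlations) for count in counts]


def nearest_template(proportions, templates):
    """Name of the template closest to `proportions` (squared Euclidean distance).

    The distance measure is an assumed matching rule for illustration only.
    """
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(templates, key=lambda name: squared_distance(proportions, templates[name]))


# Hypothetical toy templates; the real eight shapes are shown in Figure 1.
toy_templates = {
    "mostly_negative": [0.3, 0.3, 0.2, 0.1, 0.05, 0.05, 0.0, 0.0],
    "mostly_positive": [0.0, 0.0, 0.05, 0.05, 0.1, 0.2, 0.3, 0.3],
}

gene_correlations = [0.9, 0.8, 0.6, 0.55, 0.3, 0.1, -0.2]  # invented values
gene_profile = bin_proportions(gene_correlations)
print(nearest_template(gene_profile, toy_templates))
```

In practice the input would be one gene's correlation coefficients with every other expressed gene, so each gene yields one eight-value profile.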
