Bioinformatics. 2009 Nov 1;25(21):2780-6.
doi: 10.1093/bioinformatics/btp502. Epub 2009 Aug 18.

Statistical methods for gene set co-expression analysis

YounJeong Choi et al. Bioinformatics.

Abstract

Motivation: The power of a microarray experiment derives from the identification of genes differentially regulated across biological conditions. To date, differential regulation is most often taken to mean differential expression, and a number of useful methods for identifying differentially expressed (DE) genes or gene sets are available. However, such methods are not able to identify many relevant classes of differentially regulated genes. One important example concerns differentially co-expressed (DC) genes.

Results: We propose an approach, gene set co-expression analysis (GSCA), to identify DC gene sets. The GSCA approach provides a false discovery rate controlled list of interesting gene sets, does not require that genes be highly correlated in at least one biological condition and is readily applied to data from individual or multiple experiments, as we demonstrate using data from studies of lung cancer and diabetes.
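The abstract does not specify the statistic used to score a gene set, so the following is only a minimal sketch of one plausible way to test a single gene set for differential co-expression. It assumes (these are illustrative choices, not details taken from the abstract) that co-expression is measured by pairwise Pearson correlation, that the set is scored by the root-mean-square difference between the two conditions' correlations, and that significance comes from permuting array labels. The function names `pairwise_correlations`, `dispersion` and `permutation_p_value` are hypothetical.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def pairwise_correlations(expr, columns):
    """Correlations for every gene pair, using only the arrays listed in `columns`.

    `expr` is a genes-by-arrays matrix given as a list of rows (one row per gene).
    """
    sub = [[row[j] for j in columns] for row in expr]
    return [pearson(sub[a], sub[b]) for a, b in combinations(range(len(sub)), 2)]


def dispersion(expr, cols1, cols2):
    """Assumed score: RMS difference between the two conditions' pairwise correlations."""
    r1 = pairwise_correlations(expr, cols1)
    r2 = pairwise_correlations(expr, cols2)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(r1, r2)) / len(r1))


def permutation_p_value(expr, cols1, cols2, n_perm=200, seed=0):
    """Share of array-label permutations whose score is at least the observed one."""
    rng = random.Random(seed)
    observed = dispersion(expr, cols1, cols2)
    all_cols = list(cols1) + list(cols2)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(all_cols)
        perm1, perm2 = all_cols[:len(cols1)], all_cols[len(cols1):]
        if dispersion(expr, perm1, perm2) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)
```

Running `permutation_p_value` once per gene set would yield one p-value per set. Converting those p-values into a false discovery rate controlled list needs a separate multiple-testing step, which this sketch does not include.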

Availability: The GSCA approach is implemented in R and available at www.biostat.wisc.edu/~kendzior/GSCA/.

Contact: kendzior@biostat.wisc.edu

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1.
Schematic of the GSCA approach. Shown are expression matrices for a single gene set with n_c genes in two biological conditions, T1 and T2; N_k represents the number of arrays in condition k, k = 1, 2.
Fig. 2.
GO:0006955 immune response. The upper (lower) panel shows the 30 (8) genes most DC between cancer and normal. Edges represent co-expressions ranging from −1 (blue) to 1 (red). Nodes identified as DE by Rhodes et al. (2002) or Choi et al. (2003) are shaded.
Fig. 3.
Study-specific GSCA q-values are shown for the 3649 gene sets (upper panels show varying angles of the 3D plot). Red, blue and light blue values correspond to gene sets for which the meta-GSCA q-values are q < 0.1, 0.1 ≤ q < 0.3 and q ≥ 0.3, respectively.
Fig. 4.
The average DC between two biological conditions is shown for 50 of the 211 genes in GO:0006955. The genes are ordered by P-values obtained from a hub-gene test (see Section 2.3). Red bars highlight genes with Bonferroni corrected P < 0.05.
Fig. 5.
GO:0007169 transmembrane receptor protein tyrosine kinase signaling. The six genes most DC between diabetic and normal are shown. Each pair of nodes is connected by eight edges, one for each of the eight studies. Edges represent co-expressions ranging from −1 (blue) to 1 (red).

