Bioinform Biol Insights. 2009 Nov 24:1:49-62.

GSMA: Gene Set Matrix Analysis, An Automated Method for Rapid Hypothesis Testing of Gene Expression Data

Chris Cheadle et al. Bioinform Biol Insights.

Abstract

Background: Microarray technology has become highly valuable for identifying complex global changes in gene expression patterns. Assigning functional information to these complex patterns remains a challenge for interpreting data effectively and for correlating results across experiments, projects and laboratories. Methods that allow the rapid and robust evaluation of multiple functional hypotheses enable individual researchers to mine gene expression data more efficiently.

Results: We have developed gene set matrix analysis (GSMA) as a method for rapidly testing group-wise up- or down-regulation of gene expression, simultaneously for multiple lists of genes (gene sets), against entire distributions of gene expression changes (datasets) from single or multiple experiments. The utility of GSMA lies in its flexibility to rapidly poll gene sets, whether related by known biological function or designated solely by the end-user, against large numbers of datasets simultaneously.

Conclusions: GSMA provides a simple and straightforward method for hypothesis testing in which genes are tested by groups across multiple datasets for patterns of expression enrichment.
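
The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea: each gene set is scored against the whole distribution of expression changes in each dataset, yielding a gene set by dataset matrix of z scores. It assumes a simple parametric z score (the gene set mean compared with the dataset mean, scaled by the dataset standard deviation and the square root of the set size). That choice, and the names `gene_set_z` and `gsma_matrix`, are illustrative assumptions and not the authors' JMP implementation.

```python
import math
from statistics import mean, pstdev


def gene_set_z(dataset, gene_set):
    """Score one gene set against one dataset.

    dataset  : dict mapping gene id -> expression change (e.g. a log ratio)
    gene_set : iterable of gene ids
    Assumed formula (not taken from the abstract):
        z = (mean of set members - mean of all genes) * sqrt(n) / sd of all genes
    Returns None when no member is present or the dataset has zero spread.
    """
    values = list(dataset.values())
    mu = mean(values)
    sigma = pstdev(values)
    members = [dataset[g] for g in gene_set if g in dataset]
    if not members or sigma == 0:
        return None
    return (mean(members) - mu) * math.sqrt(len(members)) / sigma


def gsma_matrix(datasets, gene_sets):
    """Build a nested dict: gene set name -> dataset name -> z score."""
    return {
        set_name: {
            ds_name: gene_set_z(ds, genes) for ds_name, ds in datasets.items()
        }
        for set_name, genes in gene_sets.items()
    }


# Toy data, invented purely for illustration.
toy_datasets = {"exp1": {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0, "e": 5.0}}
toy_gene_sets = {"high": ["d", "e"], "low": ["a", "b"], "absent": ["x"]}
toy_matrix = gsma_matrix(toy_datasets, toy_gene_sets)
```

Under this assumed formula, a positive score means the set's members sit above the dataset average as a group, and a negative score means they sit below it.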

Figures

Figure 1.
Screen shots of GSMA scripts implemented in JMP. On start-up of one of the GSMA scripts, the user is prompted to A. upload a tab-delimited file of gene expression differences, B. upload a tab-delimited gene set query (list of lists), and C. name the dataset and query lists. D. After the script finishes running, a matrix of z scores is returned for all lists in the gene set that exceed a pre-set significance threshold for any given dataset.
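
The workflow in this caption (two tab-delimited inputs, then a thresholded matrix of z scores) can be sketched roughly as follows. The file layouts are assumptions: the dataset is taken to be one `gene<TAB>value` pair per line with no header row, and the gene set query one list per line with the list name first. The function names and the threshold passed in are hypothetical and do not reproduce the JMP scripts.

```python
import csv
import io


def parse_dataset(text):
    """Parse 'gene<TAB>value' lines (assumed layout, no header row)."""
    rows = csv.reader(io.StringIO(text), delimiter="\t")
    return {row[0]: float(row[1]) for row in rows if len(row) >= 2}


def parse_gene_sets(text):
    """Parse a 'list of lists': set name, then its gene ids, tab-separated."""
    rows = csv.reader(io.StringIO(text), delimiter="\t")
    return {row[0]: [g for g in row[1:] if g] for row in rows if row}


def filter_significant(z_matrix, threshold):
    """Keep gene sets whose |z| reaches the threshold in at least one dataset.

    z_matrix : dict of gene set name -> {dataset name -> z score or None}
    threshold: caller-supplied cutoff; the caption does not state a value.
    """
    kept = {}
    for set_name, row in z_matrix.items():
        scores = [z for z in row.values() if z is not None]
        if any(abs(z) >= threshold for z in scores):
            kept[set_name] = row
    return kept


# Toy inputs, invented purely for illustration.
example_dataset = parse_dataset("g1\t0.5\ng2\t-1.2\n")
example_sets = parse_gene_sets("setA\tg1\tg2\nsetB\tg3\n")
example_z = {"setA": {"d1": 2.5, "d2": 0.1}, "setB": {"d1": 0.3, "d2": None}}
example_kept = filter_significant(example_z, threshold=2.0)
```

A whole row is kept when any single dataset passes, which matches the caption's "for any given dataset" wording and preserves the matrix shape for later clustering.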
Figure 2.
Graphical representation of GSMA significance scores for A. 114/445 gene lists derived from the TransFac database [13], rank-ordered from high to low by GSMA z score, and B. the 30 most significant TransFac gene lists, corresponding to gene groups whose overall expression was increased either during myotube formation (positive z scores) or in myoblasts (negative z scores).
Figure 3.
2D GSMA results corresponding to 529 pathway-related gene lists tested independently on 12 separate datasets. A. Thumbnail image—heatmap of hierarchical clustering of GSMA z scores. B. Zoom image—highlighting an area of extensive pathway co-regulation shared by many but not all samples. C. GSMA z score data from 3B displayed in column format. D. Example of gene expression differences from a single enriched pathway (human asthma genes) for all samples. Samples were derived from multiple patients and stimulated in culture with lipopolysaccharide (LPS) or ambient particulate matter (APM) for 6 and 20 hours. BAL = bronchial lavage macrophages, MON = monocytes, EPI = airway epithelial cells.
Figure 4.
Application of GSMA directly to gene expression intensities. A. Bar graph illustrating the distribution of the numbers of specific immuno-dominant genes compiled from a comprehensive compendium of microarray human gene expression data from six key immune cell types [25]. B. Hierarchical clustering of GSMA results in which the average gene expression intensities of the various cell types and treatments tested (vertical axis) were evaluated using the IRIS gene set and assigned a z score value on the basis of enrichment for the immune cell types as shown (horizontal axis). GSMA results for this visualization were clustered by IRIS immune cell type while the experimental cell type (BAL, MON, & EPI) and the treatment (LPS & APM) order was held constant.
Figure 5.
Characterization of contrasting nuclear run-on (NRO) and total RNA gene expression during a one hour time course (in minutes as indicated) of T cell activation. A. Heat map of hierarchical clustering of differential gene expression of NRO and total RNA up- (red) or down- (green) regulated from their respective baselines. B. 2D GSMA pathway results using the same dataset as in 5A for analysis. C. 2D GSMA transcription factor binding site (TFBS - TransFac) results using the same dataset as in 5A for analysis.
Figure 6.
Characterization of pathway and TFBS gene set enrichment between nuclear run-on (NRO) and total RNA during a time course of T cell activation. A. Subset of stability regulated genes. B. Comparison of cumulative pathway enrichment between NRO and total RNA during T cell activation. C. Comparison of cumulative transcription factor binding site (TransFac) enrichment between NRO and total RNA during T cell activation. In both 6B and 6C, -U = the sum of all positively enriched gene lists, -D = the sum of all negatively enriched gene lists, -M = the sum of all enriched pathways. D. Specific TFBS (TransFac) gene set enrichment in total but not NRO RNA for a selected set of stability regulated genes.

References

    1. Mootha VK, Lindgren CM, Eriksson KF, Subramanian A, Sihag S, Lehar J, Puigserver P, Carlsson E, Ridderstrale M, Laurila E, et al. PGC-1alpha-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes. Nat Genet. 2003;34(3):267–73.
    2. Sweet-Cordero A, Mukherjee S, Subramanian A, You H, Roix JJ, Ladd-Acosta C, Mesirov J, Golub TR, Jacks T. An oncogenic KRAS2 expression signature identified by cross-species gene-expression analysis. Nat Genet. 2005;37(1):48–55.
    3. Cheadle C, Becker KG, Cho-Chung YS, Nesterova M, Watkins T, Wood W 3rd, Prabhu V, Barnes KC. A rapid method for microarray cross platform comparisons using gene expression signatures. Mol Cell Probes. 2006.
    4. Cho RJ, Huang M, Campbell MJ, Dong H, Steinmetz L, Sapinoso L, Hampton G, Elledge SJ, Davis RW, Lockhart DJ. Transcriptional regulation and function during the human cell cycle. Nat Genet. 2001;27(1):48–54.
    5. Falcon S, Gentleman R. Using GOstats to test gene lists for GO term association. Bioinformatics. 2007;23(2):257–8.
