Proc Natl Acad Sci U S A. 2004 Mar 2;101(9):2981-6.
doi: 10.1073/pnas.0308661100. Epub 2004 Feb 18.

Revealing modularity and organization in the yeast molecular network by integrated analysis of highly heterogeneous genomewide data

Amos Tanay et al. Proc Natl Acad Sci U S A. 2004.

Abstract

The dissection of complex biological systems is a challenging task, made difficult by the size of the underlying molecular network and the heterogeneous nature of the control mechanisms involved. Novel high-throughput techniques are generating massive data sets on various aspects of such systems. Here, we perform analysis of a highly diverse collection of genomewide data sets, including gene expression, protein interactions, growth phenotype data, and transcription factor binding, to reveal the modular organization of the yeast system. By integrating experimental data of heterogeneous sources and types, we are able to perform analysis on a much broader scope than previous studies. At the core of our methodology is the ability to identify modules, namely, groups of genes with statistically significant correlated behavior across diverse data sources. Numerous biological processes are revealed through these modules, which also obey global hierarchical organization. We use the identified modules to study the yeast transcriptional network and predict the function of >800 uncharacterized genes. Our analysis framework, SAMBA (Statistical-Algorithmic Method for Bicluster Analysis), enables the processing of current and future sources of biological information and is readily extendable to experimental techniques and higher organisms.

Figures

Fig. 1.
Integrated analysis of genomic data. Data items (A) from diverse sources of biological information are transformed to properties of genes and relations among genes/proteins, together generating a genes-properties bipartite graph (B). The graph is represented here schematically as a collection of gene nodes (lower, gray) and property nodes (upper) of different types (yellow, TF binding; blue, knockout phenotypes; green/red, expression profiles; gray, protein interactions). The graph edges represent the (probabilistic) assignment of a property to a gene, with edge weights (not shown here) representing the statistical strength of the assignment. A module (marked by the red oval) corresponds to a set of genes and a set of properties with higher-than-expected internal degrees. As an example, a real module (C) is represented by a matrix of genes by properties. Different types of properties are color-coded differently (using the same color coding as in B) with shading to indicate the strength of property assignment (weak-strong binding, low-high phenotype sensitivity, down-up expression regulation). Modules are annotated by testing the enrichment of their genes' GO annotations. The module shown here is strongly enriched with amino acid metabolism genes and more specifically with arginine-related genes. The integrative power of SAMBA is exemplified by the inclusion of YOR302W in the module, based on the phenotype and the TF binding properties in the module, even though its expression profile is not sufficiently correlated to the module profile. Indeed, YOR302W is CPA1's upstream ORF and is known to function in its translation regulation (16).
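The caption's definition of a module can be illustrated with a small sketch. This is not the SAMBA scoring scheme: it only compares the fraction of gene-property pairs that are connected inside a candidate module with the fraction connected in the whole graph, and sums the edge weights inside the candidate. The gene names, property names, weights, and function names are all hypothetical.

```python
# Toy genes-properties bipartite graph: (gene, property) -> edge weight.
# Every name and weight here is made up for illustration only.
EDGES = {
    ("g1", "tf_bind"): 0.9,
    ("g1", "expr_up"): 0.8,
    ("g2", "tf_bind"): 0.7,
    ("g2", "expr_up"): 0.9,
    ("g3", "ko_sens"): 0.6,
    ("g4", "expr_up"): 0.5,
}
ALL_GENES = {"g1", "g2", "g3", "g4"}
ALL_PROPS = {"tf_bind", "expr_up", "ko_sens"}


def density(edges, genes, props):
    """Fraction of possible gene-property pairs in the block that are edges."""
    genes, props = set(genes), set(props)
    present = sum(1 for (g, p) in edges if g in genes and p in props)
    return present / (len(genes) * len(props))


def module_weight(edges, genes, props):
    """Sum of edge weights between the chosen genes and properties."""
    genes, props = set(genes), set(props)
    return sum(w for (g, p), w in edges.items() if g in genes and p in props)


# Candidate module: two genes sharing a binding and an expression property.
module_genes = {"g1", "g2"}
module_props = {"tf_bind", "expr_up"}

background = density(EDGES, ALL_GENES, ALL_PROPS)
inside = density(EDGES, module_genes, module_props)
print(f"background density: {background:.2f}")
print(f"module density:     {inside:.2f}")
print(f"module weight:      {module_weight(EDGES, module_genes, module_props):.2f}")
```

A candidate whose internal density clearly exceeds the background is the kind of gene set and property set the caption calls a module; a real analysis would replace the raw density comparison with a statistical score.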
Fig. 2.
Functional modules and their TFs in the yeast system. Modules with significant functional enrichment for a particular process (P < 0.01) are grouped and plotted as an oval with the process name. TFs with binding profiles associated with any of these modules are marked as gray circles and connected to the associated process. Modules may be enriched in more than one process and thus contribute to several regions in the map. The thickness of the connecting lines is inversely proportional to the P value of the functional enrichment in the associated module. The map was automatically generated by SAMBA using no prior biological knowledge. Met, metabolism; Tran, transport. An interactive version of this figure is available at www.cs.tau.ac.il/~rshamir/samba.
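The caption does not say which statistical test produces the enrichment P value. A common choice for this kind of annotation enrichment is the upper tail of the hypergeometric distribution, and the sketch below assumes that choice. The function name and the example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, module_size, hits):
    """Hypergeometric upper tail P(X >= hits).

    population  : number of genes considered overall
    annotated   : genes in the population carrying the annotation
    module_size : number of genes in the module
    hits        : module genes carrying the annotation
    """
    total = comb(population, module_size)
    upper = min(module_size, annotated)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, module_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts: 10 genes overall, 3 annotated, a 3-gene module
# in which all 3 genes carry the annotation.
p = enrichment_p_value(population=10, annotated=3, module_size=3, hits=3)
print(p, p < 0.01)
```

Under this assumption a module would be assigned to a process when the tail probability falls below the 0.01 cutoff given in the caption; correction for testing many processes is not shown.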
Fig. 3.
Hierarchical organization of the yeast molecular network. The module graph was generated by connecting two modules (small ovals) if more than one-third of the genes in one (the smaller) are present in the other. We used our module annotations to manually classify regions in the graph (shown as shaded large ovals). The graph reflects a hierarchical organization that arranges modules in clusters. Some of the clusters (e.g., protein biosynthesis) are organized in more than two hierarchical levels: large modules are composed of several smaller modules, giving a star-like topology.
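The rule used to build the module graph (connect two modules when more than one-third of the smaller module's genes are present in the other) can be sketched as follows. The module names, gene identifiers, and function name are hypothetical, and the comparison uses integers to avoid floating-point issues at the one-third boundary.

```python
from itertools import combinations


def module_graph(modules, num=1, den=3):
    """Return edges between modules that share enough genes.

    modules : dict mapping module name -> set of gene identifiers
    An edge (a, b) is added when more than num/den of the genes in the
    smaller of the two modules are also present in the other module.
    """
    edges = []
    for (a, genes_a), (b, genes_b) in combinations(modules.items(), 2):
        smaller = min(len(genes_a), len(genes_b))
        if smaller == 0:
            continue  # an empty module cannot overlap with anything
        shared = len(genes_a & genes_b)
        # shared / smaller > num / den, written without division
        if shared * den > num * smaller:
            edges.append((a, b))
    return edges


# Hypothetical modules for illustration only.
toy_modules = {
    "A": {"g1", "g2", "g3", "g4", "g5", "g6"},
    "B": {"g1", "g2", "g7"},   # shares 2 of its 3 genes with A
    "C": {"g6", "g8", "g9"},   # shares exactly 1 of its 3 genes with A
}
print(module_graph(toy_modules))
```

In the toy data, B is linked to A because two of its three genes lie in A, while C is not, since exactly one-third does not satisfy "more than one-third". A large module linked to several smaller ones in this way gives the star-like topology described in the caption.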

References

    1. Ren, B., Robert, F., Wyrick, J. J., Aparicio, O., Jennings, E. G., Simon, I., Zeitlinger, J., Schreiber, J., Hannett, N., Kanin, E., et al. (2000) Science 290, 2306-2309.
    2. Iyer, V., Horak, C., Scafe, C., Botstein, D., Snyder, M. & Brown, P. (2001) Nature 409, 533-538.
    3. Lee, T. I., Rinaldi, N. J., Robert, F., Odom, D. T., Bar-Joseph, Z., Gerber, G. K., Hannett, N. M., Harbison, C. T., Thompson, C. M., Simon, I., et al. (2002) Science 298, 799-804.
    4. Giaever, G., Chu, A. M., Ni, L., Connelly, C., Riles, L., Veronneau, S., Dow, S., Lucau-Danila, A., Anderson, K., Andre, B., et al. (2002) Nature 418, 387-391.
    5. Ho, Y., Gruhler, A., Heilbut, A., Bader, G. D., Moore, L., Adams, S. L., Millar, A., Taylor, P., Bennett, K., Boutilier, K., et al. (2002) Nature 415, 180-183.
