Review
Methods Enzymol. 2006;411:352-69.
doi: 10.1016/S0076-6879(06)11019-8.

Gene expression omnibus: microarray data storage, submission, retrieval, and analysis

Tanya Barrett et al. Methods Enzymol. 2006.

Abstract

The Gene Expression Omnibus (GEO) repository at the National Center for Biotechnology Information archives and freely distributes high-throughput molecular abundance data, predominantly gene expression data generated by DNA microarray technology. The database has a flexible design that can handle diverse styles of both unprocessed and processed data in a Minimum Information About a Microarray Experiment-supportive infrastructure that promotes fully annotated submissions. GEO currently stores about a billion individual gene expression measurements, derived from over 100 organisms, submitted by over 1500 laboratories, addressing a wide range of biological phenomena. To maximize the utility of these data, several user-friendly web-based interfaces and applications have been implemented that enable effective exploration, query, and visualization of these data at the level of individual genes or entire studies. This chapter describes how data are stored, submission procedures, and mechanisms for data retrieval and query. GEO is publicly accessible at http://www.ncbi.nlm.nih.gov/projects/geo/.

Figures

Figure 1
Schematic diagram of the relations between GEO Platform, Sample, DataSet, and Profiles. For each gene on a Platform, multiple Sample measurement values are generated. Related Samples constitute a DataSet, from which multiple gene expression profile entities are generated.
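The relations in Figure 1 can be illustrated with a minimal Python sketch. It is not GEO's actual schema: the class names mirror the caption, while every field name, identifier, and measurement value below is hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Platform:
    """The list of genes (features) that an array measures."""
    name: str
    gene_ids: list[str]


@dataclass
class Sample:
    """One hybridization: a measurement value for each gene on its Platform."""
    name: str
    platform: Platform
    values: dict[str, float]  # gene id -> measured abundance


@dataclass
class DataSet:
    """A collection of related Samples that share a Platform."""
    name: str
    samples: list[Sample]

    def profile(self, gene_id: str) -> list[tuple[str, float]]:
        """One gene's values across every Sample in the DataSet."""
        return [(s.name, s.values[gene_id]) for s in self.samples]

    def profiles(self) -> dict[str, list[tuple[str, float]]]:
        """One profile per gene on the shared Platform."""
        platform = self.samples[0].platform
        return {gene: self.profile(gene) for gene in platform.gene_ids}


# Hypothetical identifiers and values, for illustration only.
platform = Platform("platform_demo", ["geneA", "geneB"])
dataset = DataSet(
    "dataset_demo",
    [
        Sample("sample1", platform, {"geneA": 1.0, "geneB": 10.0}),
        Sample("sample2", platform, {"geneA": 2.0, "geneB": 20.0}),
    ],
)
profiles = dataset.profiles()
```

In this sketch a profile is simply a column-to-row pivot: Samples store values per gene, and a profile collects one gene's values across all Samples of a DataSet.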
Figure 2
Screenshot of a typical DataSet record, GDS877 (Gonzalez et al., 2005). The record includes a summary of the experiment, links to related records and publications, subset designations and classifications, download options, and access to mining features such as cluster heat maps and the ‘Query group A vs B’ tool.
Figure 3
Screenshot of Entrez GEO Profiles retrieval results; each entity includes sequence identifier and DataSet information, and a thumbnail profile image. Links to other Entrez databases or related profiles are provided above the thumbnail image. The expanded profile chart depicts value (bars) and rank (squares) information for the crystallin gene across each Sample in GEO DataSet GDS877 (Gonzalez et al., 2005). Experimental subset groupings are reflected in labels at the foot of the chart.
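The pairing of a value and a rank for each Sample can be sketched as follows. The caption does not say how GEO computes rank, so this example assumes a simple percentile rank of a gene's value among all values in the same Sample; the function names and the data are hypothetical.

```python
from bisect import bisect_right


def percentile_ranks(sample_values: dict[str, float]) -> dict[str, float]:
    """Percentage of genes in one Sample whose value is <= each gene's value.

    Assumption: a plain percentile rank; the ranking GEO applies may differ.
    """
    ordered = sorted(sample_values.values())
    n = len(ordered)
    return {
        gene: 100.0 * bisect_right(ordered, value) / n
        for gene, value in sample_values.items()
    }


def rank_profile(
    samples: dict[str, dict[str, float]], gene: str
) -> list[tuple[str, float, float]]:
    """(sample name, value, percentile rank) for one gene across all Samples."""
    return [
        (name, values[gene], percentile_ranks(values)[gene])
        for name, values in samples.items()
    ]


# Hypothetical measurements for four genes in two Samples.
samples = {
    "sample1": {"g1": 5.0, "g2": 1.0, "g3": 3.0, "g4": 9.0},
    "sample2": {"g1": 2.0, "g2": 8.0, "g3": 4.0, "g4": 6.0},
}
g1_profile = rank_profile(samples, "g1")
```

Because the rank is computed within each Sample, it stays comparable across Samples even when their raw value scales differ, which is one plausible reason to chart both quantities together.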
Figure 4
Schematic overview of the query workflow, and how the various features and tools are interlinked.

References

    1. Altschul SF, et al. Basic local alignment search tool. J Mol Biol. 1990;215:403–410.
    2. Ball C, et al. Microarray Data Standards: An Open Letter. PLoS Biol. 2004;2:23–24.
    3. Barrett T, et al. NCBI GEO: mining millions of expression profiles - database and tools. Nucleic Acids Res. 2005;33:D562–D566.
    4. Brazma A, et al. Minimum information about a microarray experiment (MIAME)-toward standards for microarray data. Nat Genet. 2001;29:365–371.
    5. Brockington M, et al. Localization and functional analysis of the LARGE family of glycosyltransferases: significance for muscular dystrophy. Hum Mol Genet. 2005;14(5):657–665.