Methods Mol Biol. 2012:802:41-53.
doi: 10.1007/978-1-61779-400-1_3.

Strategies to explore functional genomics data sets in NCBI's GEO database

Stephen E Wilhite et al. Methods Mol Biol. 2012.

Abstract

The Gene Expression Omnibus (GEO) database is a major repository that stores high-throughput functional genomics data sets that are generated using both microarray-based and sequence-based technologies. Data sets are submitted to GEO primarily by researchers who are publishing their results in journals that require original data to be made freely available for review and analysis. In addition to serving as a public archive for these data, GEO has a suite of tools that allow users to identify, analyze, and visualize data relevant to their specific interests. These tools include sample comparison applications, gene expression profile charts, data set clusters, genome browser tracks, and a powerful search engine that enables users to construct complex queries.

Figures

Figure 1
Screenshot of a GEO DataSet record, data analysis tools, and corresponding GEO Profiles. (A) DataSet Browser search box. (B) Area containing descriptive information about that DataSet, including the title, summary, organism and citation ((27) for this example). (C) Thumbnail image of cluster heatmap. Click the image to open the full interactive cluster, from which regions may be selected and exported. (D) Download section containing various file format options; mouse over each option for a description of its content. (E) Data Analysis Tools options. Select from ‘Find genes’, ‘Compare 2 sets of Samples’, ‘Cluster heatmaps’ and ‘Experiment design and value distribution’. (F) ‘Compare 2 sets of Samples’ analysis. In this example, the user has opted to perform a one-tailed t-test in order to find genes more highly expressed in mouse lung Samples exposed to cigarette smoke, compared to controls. (G) Results of the previous t-test; 98 genes were retrieved in this case. (H) Gene annotation area. (I) ‘Neighbors’ links that connect the targeted profile to genes related by expression pattern (Profile neighbors), sequence similarity (Sequence neighbors) or physical proximity (Chromosome neighbors). (J) Thumbnail image of gene expression profile. (K) Full profile image that in this example depicts how gene Nqo1 is more highly expressed in smoke-exposed Samples compared to controls. Each bar in the chart represents the expression level of Nqo1 in a Sample. The bars at the foot of the chart represent the experimental variables, in this case ‘control’ or ‘cigarette smoke’.
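The one-tailed comparison described in panel (F) can be illustrated with a minimal sketch. This is not GEO's implementation: it assumes Welch's unequal-variance t statistic, a caller-supplied cutoff `t_crit` in place of a p-value, and made-up gene names, Sample labels and expression values. The function names `welch_t` and `higher_in_first_set` are hypothetical.

```python
from math import copysign, inf, sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for mean(group_a) - mean(group_b)."""
    diff = mean(group_a) - mean(group_b)
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    if se == 0.0:
        # Both groups are constant: no spread to scale the difference by.
        return 0.0 if diff == 0 else copysign(inf, diff)
    return diff / se


def higher_in_first_set(profiles, set_a, set_b, t_crit):
    """One-tailed test: keep genes whose t statistic exceeds t_crit (A > B)."""
    hits = []
    for gene, values in profiles.items():
        t = welch_t([values[s] for s in set_a], [values[s] for s in set_b])
        if t > t_crit:
            hits.append((gene, t))
    # Strongest differences first.
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Made-up values for illustration only; not taken from any GEO DataSet.
profiles = {
    "geneA": {"s1": 9.0, "s2": 9.4, "s3": 9.2, "c1": 7.0, "c2": 7.3, "c3": 7.1},
    "geneB": {"s1": 5.0, "s2": 5.2, "s3": 4.9, "c1": 5.1, "c2": 5.0, "c3": 5.2},
}
smoke = ["s1", "s2", "s3"]
control = ["c1", "c2", "c3"]

# t_crit is an arbitrary cutoff chosen for this example.
hits = higher_in_first_set(profiles, smoke, control, t_crit=3.0)
```

Because the test is one-tailed, a gene that is lower in the first set (negative t) is never reported, however large the difference.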
Figure 2
Chromatin immunoprecipitation sequence (ChIP-seq) tracks displayed in NCBI’s Sequence Viewer. Histone H3 lysine 4 trimethylation (H3K4me3) peaks are typically observed at the 5′ end of transcriptionally active genes. In this example, there is a clear peak next to MASP2 in the adult liver cells (top track, GEO Sample GSM537697) but not in the IMR90 cells (lower track, GEO Sample GSM469970).
Figure 3
Screenshot of Search Builder results, demonstrating fixed list terms for the ‘Entry type’ field.
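A fielded query of the kind the Search Builder assembles can also be composed programmatically. The sketch below only builds a query string and a request URL for NCBI's E-utilities `esearch` endpoint; it sends nothing. The field tags (`Title`, `Organism`, `Entry Type`), the value `gds`, and the helper names `fielded` and `build_query` are assumptions for illustration and should be checked against the lists the Search Builder offers.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint; "gds" is the Entrez database name
# assumed here for GEO DataSets.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def fielded(term, field):
    """Quote a term and attach an Entrez-style field tag."""
    return f'"{term}"[{field}]'


def build_query(*clauses, op="AND"):
    """Join fielded clauses with a single Boolean operator."""
    return f" {op} ".join(clauses)


query = build_query(
    fielded("cigarette smoke", "Title"),   # assumed field tag
    fielded("Mus musculus", "Organism"),   # assumed field tag
    fielded("gds", "Entry Type"),          # assumed fixed-list value
)

# urlencode percent-escapes quotes and brackets and turns spaces into '+'.
url = ESEARCH + "?" + urlencode({"db": "gds", "term": query, "retmax": 20})
```

Keeping each clause as a separate `fielded(...)` call makes it easy to swap a fixed-list value, such as the entry type, without touching the rest of the query.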

References

    1. http://www.ncbi.nlm.nih.gov/geo/
    2. Edgar R, Domrachev M, Lash AE. Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res. 2002;30:207–210.
    3. Barrett T, Troup DB, Wilhite SE, et al. NCBI GEO: archive for high-throughput functional genomic data. Nucleic Acids Res. 2009;37:D885–D890.
    4. Sayers EW, Barrett T, Benson DA, et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2009;37:D5–D15.
    5. http://www.ncbi.nlm.nih.gov/gquery/
