Review
Mol Med. 2003 Sep-Dec;9(9-12):185-92.

Using genomic databases for sequence-based biological discovery

Andreas D Baxevanis. Mol Med. 2003 Sep-Dec.

Abstract

The sequence data produced by the International Human Genome Sequencing Consortium and other systematic sequencing projects hold tremendous potential. It is therefore increasingly important that all biologists be able to navigate key publicly available databases and cull important information from them. The continued rapid rise in available sequence information, particularly as model organism data are generated at breakneck speed, underscores the need for all biologists to learn how to make their way effectively through the expanding "sequence information space." This review discusses some of the more commonly used tools that have been developed for the effective and efficient mining of sequence information. These include LocusLink, which provides a gene-centric view of sequence-based information, as well as the 3 major genome browsers: the National Center for Biotechnology Information Map Viewer, the University of California Santa Cruz Genome Browser, and the European Bioinformatics Institute's Ensembl system. An overview of the types of information available through each of these front-ends is given, along with pointers to tutorials and other documentation intended to increase the reader's familiarity with these tools.

Figures

Figure 1
Results of a LocusLink query, using “MLH1” as the search term. The report returns information on the MLH1 gene, as well as on related genes. A brief description is given for each locus found, along with its chromosomal location. The colored alphabet blocks found to the right of each entry are explained in detail in the text.
Figure 2
A PubMed window showing papers on MLH1. Each entry gives the names of the authors, the title of the paper, and the citation information. The abstract of each paper can be found by clicking on the hyperlinked list of authors.
Figure 3
Information from Online Mendelian Inheritance in Man (OMIM) for the MLH1 gene. (A) OMIM entries begin with general information on the gene; the section marked Text provides an “executive summary” of relevant information on the gene and is curated on a regular basis by experts in the field. (B) For each gene where information is available, a list of allelic variants can be obtained; clicking on any of the entries provides more detailed information on that particular variant. See text for details.
Figure 4
A UniGene display showing information on the MLH1 cluster. Each UniGene entry contains information on protein similarities, mapping data, expression information, and links to the mRNA and EST sequences comprising the cluster. See text for details.
Figure 5
The HomoloGene entry for MLH1. Here, both calculated and curated orthologs to the MLH1 gene in a number of organisms are shown, as well as the percent identity between human MLH1 and its counterpart in other organisms.
Figure 6
Single nucleotide polymorphism (SNP) data for human MLH1. The upper portion of the figure presents the gene model (position of introns and exons) and a graphical overview of where the various known SNPs for this gene are located. The table provides more detailed information about each characterized SNP. See text for details.
Figure 7
The NCBI Map Viewer home page. From this page, users can select any organism for which map information is available and perform targeted queries (by gene, location, or any of a number of other criteria). Information on constructing queries can be found by following the Help hyperlink in the upper right.
Figure 8
The results of an NCBI Map Viewer search, using “human” as the organism and “MLH1” as the query. The query returned 3 matches, one of which is to the MLH1 locus. See text for details.
Figure 9
The default map view. Three maps are displayed: the cytogenetic gene map, the UniGene cluster map, and the “Genes_seq” map (known and putative genes that have been placed as a result of alignments of mRNAs to individual contigs). See text for details.
Figure 10
The Maps & Options window. This control panel is used to select from the available maps and change the order in which the maps are displayed.
Figure 11
A new map view for the MLH1 gene, using the options shown in Figure 10. Notice that the Variation map showing all known SNPs is now the “master map.” See text for details.
Figure 12
The UCSC Genome Browser. The region shown is for the human MLH1 gene. The display is organized as a series of “tracks” that run from left to right, whereas the NCBI maps run from top to bottom. The appearance of each track can be customized so that information appears at various densities; individual tracks can be selected or de-selected using toggles that appear below the graphic. Controls at the top of the window can be used to either zoom in or out or navigate 5′ or 3′ of the featured area.
Figure 13
The Ensembl browser. The region shown is for the human MLH1 gene. The page begins with a chromosomal view, and then moves through various levels of detail. Controls in the section marked Detailed View can be used to either zoom in or out or navigate 5′ or 3′ of the featured area.

