BMC Genomics. 2005 Feb 7;6:14.
doi: 10.1186/1471-2164-6-14.

Specialized microbial databases for inductive exploration of microbial genome sequences

Gang Fang et al. BMC Genomics. 2005.

Abstract

Background: The enormous amount of genome sequence data calls for user-oriented databases to manage sequences and annotations. Queries must include search tools that permit function identification through exploration of related objects.

Methods: The GenoList package for collecting and mining microbial genome databases has been rewritten using MySQL as the database management system. Functions that were not available in MySQL, such as nested subqueries, have been implemented.
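
The abstract does not say how nested subqueries were implemented. One common workaround on a database engine without subquery support is to run the inner query first and pass its results to the outer query. The sketch below illustrates that idea only; it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server, and the table names, column names, and the helpers `genes_in_class` and `demo_connection` are hypothetical, not taken from GenoList.

```python
import sqlite3


def genes_in_class(conn, func_class):
    """Emulate
        SELECT name FROM gene
        WHERE id IN (SELECT gene_id FROM annotation WHERE func_class = ?)
    with two flat queries (hypothetical schema, for illustration only)."""
    # Step 1: run the inner query on its own and collect the ids.
    inner = conn.execute(
        "SELECT gene_id FROM annotation WHERE func_class = ?", (func_class,)
    ).fetchall()
    ids = [row[0] for row in inner]
    if not ids:
        # An empty IN () list is not valid SQL, so return early.
        return []
    # Step 2: feed the ids to the outer query through placeholders.
    placeholders = ",".join("?" for _ in ids)
    outer = conn.execute(
        f"SELECT name FROM gene WHERE id IN ({placeholders}) ORDER BY name", ids
    ).fetchall()
    return [row[0] for row in outer]


def demo_connection():
    """Build a small in-memory database with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE gene (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE annotation (gene_id INTEGER, func_class TEXT);
        INSERT INTO gene VALUES (1, 'dnaA'), (2, 'recA'), (3, 'gyrB');
        INSERT INTO annotation VALUES
            (1, 'replication'), (3, 'replication'), (2, 'repair');
        """
    )
    return conn
```

Calling `genes_in_class(demo_connection(), "replication")` returns the names of the genes annotated with that class. The two-step form trades one round trip for two, which is usually acceptable when the inner result set is small.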

Results: Inductive reasoning in the study of genomes starts from "islands of knowledge", centered around genes with some known background. With this concept of "neighborhood" in mind, a modified version of the GenoList structure has been used to organize sequence data from prokaryotic genomes of particular interest in China. GenoChore (http://bioinfo.hku.hk/genochore.html), a set of 17 specialized end-user-oriented microbial databases (including one instance of Microsporidia, Encephalitozoon cuniculi, a member of Eukarya), has been made publicly available. These databases allow the user to browse genome sequence and annotation data using standard queries. In addition, they provide a weekly update of searches against the worldwide protein sequence data libraries, allowing one to monitor annotation updates on genes of interest. Finally, they allow users to search for patterns in DNA or protein sequences, taking into account a clustering of genes into formal operons, and they provide extra facilities to query sequences using predefined sequence patterns.
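
As an illustration of what a pattern search in a DNA sequence can look like, the following minimal sketch translates a pattern written with IUPAC nucleotide codes into a regular expression and reports every match position. It is not the GenoChore implementation, which the abstract does not describe; the function names `iupac_to_regex` and `find_pattern` are hypothetical.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(pattern):
    """Translate an IUPAC DNA pattern into a regular expression string."""
    return "".join(IUPAC[base] for base in pattern.upper())


def find_pattern(sequence, pattern):
    """Return the 0-based start position of every match, overlaps included."""
    # A zero-width lookahead lets the scan report overlapping matches.
    regex = re.compile("(?=" + iupac_to_regex(pattern) + ")")
    return [m.start() for m in regex.finditer(sequence.upper())]
```

For example, `find_pattern(genome, "TATAAT")` would list candidate positions of that hexamer in a string `genome`. A predefined pattern library could then be a plain dictionary from pattern names to IUPAC strings.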

Conclusion: This growing set of specialized microbial databases organizes data created by the first Chinese bacterial genome programs (ThermaList, Thermoanaerobacter tengcongensis; LeptoList, with two different genomes of Leptospira interrogans; and SepiList, Staphylococcus epidermidis), together with related organisms for comparison.

Figures

Figure 1
Data Schema of the Databases. The core object of the schema is the Genomic_object, as in GenoList. It uses pointers into the sequence that delimit several categories of objects, including protein Coding DNA Sequences (CDSs), RNAs and other objects such as transcription terminators or riboswitches.
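
The idea of a core object that points into the genome sequence can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual GenoList schema: the field names, the 1-based inclusive coordinates, and the helpers `extract` and `neighbors` are all hypothetical. The `neighbors` helper shows one possible reading of the "neighborhood" concept, namely objects located within a fixed distance of a gene of interest.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenomicObject:
    """A feature delimited by pointers into the genome (assumed fields)."""
    name: str
    kind: str    # e.g. "CDS", "RNA", "terminator", "riboswitch"
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: int  # +1 for the forward strand, -1 for the reverse strand


_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def extract(genome, obj):
    """Return the sequence of obj, reverse-complemented on the minus strand."""
    segment = genome[obj.start - 1 : obj.end]
    if obj.strand == -1:
        segment = segment.translate(_COMPLEMENT)[::-1]
    return segment


def neighbors(objects, target, flank):
    """Return objects overlapping a window of `flank` bases around target."""
    lo, hi = target.start - flank, target.end + flank
    return [o for o in objects if o != target and o.start <= hi and o.end >= lo]
```

Storing only coordinates, rather than a copy of each feature's sequence, keeps the genome as the single source of truth, so a corrected sequence is reflected in every object that points into it.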
Figure 2
Implementation of a Database Curator Page. To help users who wish to participate in improving the database annotation, a Curator Page is provided that permits input of updated information. It is available through password-protected access to users whose collaboration has been accepted. Once data consistency has been verified, the new annotations are incorporated into the current database.
