Nucleic Acids Res. 2015 Jan;43(Database issue):D270-6.
doi: 10.1093/nar/gku1152. Epub 2014 Nov 14.

MBGD update 2015: microbial genome database for flexible ortholog analysis utilizing a diverse set of genomic data

Ikuo Uchiyama et al. Nucleic Acids Res. 2015 Jan.

Abstract

The microbial genome database for comparative analysis (MBGD) (available at http://mbgd.genome.ad.jp/) is a comprehensive ortholog database for flexible comparative analysis of microbial genomes, where the users are allowed to create an ortholog table among any specified set of organisms. Because of the rapid increase in microbial genome data owing to the next-generation sequencing technology, it becomes increasingly challenging to maintain high-quality orthology relationships while allowing the users to incorporate the latest genomic data available into an analysis. Because many of the recently accumulating genomic data are draft genome sequences for which some complete genome sequences of the same or closely related species are available, MBGD now stores draft genome data and allows the users to incorporate them into a user-specific ortholog database using the MyMBGD functionality. In this function, draft genome data are incorporated into an existing ortholog table created only from the complete genome data in an incremental manner to prevent low-quality draft data from affecting clustering results. In addition, to provide high-quality orthology relationships, the standard ortholog table containing all the representative genomes, which is first created by the rapid classification program DomClust, is now refined using DomRefine, a recently developed program for improving domain-level clustering using multiple sequence alignment information.
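The incremental incorporation of draft genomes described above can be illustrated with a minimal sketch. This is not the MergeTree implementation used by MBGD; it only shows the general idea that genes from an added genome are compared against the members of an existing table built from complete genomes, so that the added data never changes how the base clusters were formed. The function name, the `similarity` callback, and the `threshold` value are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, Set, Tuple


def add_genome_incrementally(
    base_clusters: Dict[str, Set[str]],
    new_genes: Iterable[str],
    similarity: Callable[[str, str], float],  # hypothetical scoring function
    threshold: float = 0.5,                   # hypothetical cutoff
) -> Tuple[Dict[str, Set[str]], Set[str]]:
    """Attach each new gene to its best-matching base cluster.

    Scores are computed only against the original (complete-genome)
    members, so previously added draft genes cannot influence later
    assignments and the base clustering itself is never recomputed.
    """
    # Work on a copy so the base table stays untouched.
    result = {cid: set(members) for cid, members in base_clusters.items()}
    unassigned: Set[str] = set()

    for gene in new_genes:
        best_id, best_score = None, float("-inf")
        # Sorted iteration makes tie-breaking deterministic.
        for cid in sorted(base_clusters):
            members = base_clusters[cid]
            score = max((similarity(gene, m) for m in members), default=0.0)
            if score > best_score:
                best_id, best_score = cid, score
        if best_id is not None and best_score >= threshold:
            result[best_id].add(gene)
        else:
            unassigned.add(gene)

    return result, unassigned
```

In this sketch, genes that match no base cluster above the cutoff are returned separately rather than forced into a group; how such genes are actually handled in MBGD is not stated in the abstract.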


Figures

Figure 1.
Overview of the data construction procedure in MBGD. Precomputed ortholog tables are colored in light yellow and user-generated ortholog tables are colored in light pink. Three methods (DomClust followed by DomRefine, DomClust only and MergeTree) to create these ortholog tables are shown with different arrows. MergeTree is a program for adding genomes incrementally to an existing ortholog table (base cluster), and thus is represented by two arrows: a base cluster is shown by a solid arrow and an added genome is shown by a broken arrow.
Figure 2.
Modification of the domain-level classification in the standard ortholog table by DomRefine. (A) An example of modification by DomRefine. Here, two clusters A and B are merged into a new cluster AB. In this case, the number of clusters is reduced from two to one (cluster-level modification) and the numbers of domain-reorganized genes and of classification-changed genes are two and four, respectively (gene-level modification). (B) The effect of cluster-level modification by DomRefine. (C) The effect of gene-level modification by DomRefine.
Figure 3.
An example of domain reorganization by DomRefine. Shown are the domain organizations of the gene entry hdn:HDEN_1124 in the MBGD gene information pages in the version 2014-01 (A; without refinement) and 2014-02 (B; after refinement). In each figure, the first line indicates the domain organization in MBGD and the subsequent lines indicate the domains identified by HMMER search against the databases included in InterPro (19).
Figure 4.
An example session of the MyMBGD analysis, where comparison of Staphylococcus aureus genomes was performed focusing on two strains, 930918-3 and D30, which are indicated with ‘a’ and ‘b’, respectively. (A) The MyMBGD interface for specifying a set of genomes in taxon-specific comparison mode. Complete genomes are shown in light yellow and draft genomes are shown in light blue. (B) Occurrence-pattern display in which ortholog groups that are present in strain D30 and absent in strain 930918-3 are extracted and summarized according to occurrence pattern. The occurrence pattern corresponding to the transposon-like cluster containing the FtsK/SpoIIIE family protein is indicated with ‘x’. (C) Genome region map comparison viewer showing gene order conservation around the ortholog group of the FtsK/SpoIIIE family protein. Here, orthologous genes are drawn in the same color and pattern.
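The occurrence-pattern query in Figure 4B (ortholog groups present in strain D30 and absent in strain 930918-3, summarized by pattern) can be sketched as follows. This is an illustrative sketch only, not MBGD code: the table layout (a mapping from ortholog group ID to the set of genomes containing a member), the function names, and the group IDs and the genome label "other" in the usage example are assumptions.

```python
from collections import Counter
from typing import Dict, List, Set


def present_absent(
    ortholog_table: Dict[str, Set[str]], present_in: str, absent_in: str
) -> List[str]:
    """Return IDs of ortholog groups found in one genome but not the other."""
    return [
        group_id
        for group_id, genomes in ortholog_table.items()
        if present_in in genomes and absent_in not in genomes
    ]


def summarize_patterns(
    ortholog_table: Dict[str, Set[str]],
    group_ids: List[str],
    genome_order: List[str],
) -> Counter:
    """Count groups per occurrence pattern ('1' = present, '0' = absent)."""
    patterns: Counter = Counter()
    for group_id in group_ids:
        genomes = ortholog_table[group_id]
        pattern = "".join("1" if g in genomes else "0" for g in genome_order)
        patterns[pattern] += 1
    return patterns


# Hypothetical toy table; real group IDs and genome sets would come from MBGD.
table = {
    "g1": {"D30", "other"},
    "g2": {"D30", "930918-3"},
    "g3": {"D30"},
    "g4": {"930918-3"},
}
hits = present_absent(table, present_in="D30", absent_in="930918-3")
summary = summarize_patterns(table, hits, ["930918-3", "D30", "other"])
```

Grouping the hits by pattern, rather than listing them one by one, mirrors the summarized display described in the caption.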

References

    1. Loman N.J., Constantinidou C., Chan J.Z., Halachev M., Sergeant M., Penn C.W., Robinson E.R., Pallen M.J. High-throughput bacterial genome sequencing: an embarrassment of choice, a world of opportunity. Nat. Rev. Microbiol. 2012;10:599–606.
    2. Chan J.Z., Pallen M.J., Oppenheim B., Constantinidou C. Genome sequencing in clinical microbiology. Nat. Biotechnol. 2012;30:1068–1071.
    3. Tatusova T., Ciufo S., Fedorov B., O'Neill K., Tolstoy I. RefSeq microbial genomes database: new representation and annotation strategy. Nucleic Acids Res. 2014;42:D553–D559.
    4. Uchiyama I. MBGD: microbial genome database for comparative analysis. Nucleic Acids Res. 2003;31:58–62.
    5. Uchiyama I. Hierarchical clustering algorithm for comprehensive orthologous-domain classification in multiple genomes. Nucleic Acids Res. 2006;34:647–658.
