Nucleic Acids Res. 2000 Jan 1;28(1):33-6.
doi: 10.1093/nar/28.1.33.

The COG database: a tool for genome-scale analysis of protein functions and evolution

R L Tatusov et al. Nucleic Acids Res. 2000.

Abstract

Rational classification of proteins encoded in sequenced genomes is critical for making the genome sequences maximally useful for functional and evolutionary studies. The database of Clusters of Orthologous Groups of proteins (COGs) is an attempt at a phylogenetic classification of the proteins encoded in 21 complete genomes of bacteria, archaea and eukaryotes (http://www.ncbi.nlm.nih.gov/COG). The COGs were constructed by applying the criterion of consistency of genome-specific best hits to the results of an exhaustive comparison of all protein sequences from these genomes. The database comprises 2091 COGs that include 56-83% of the gene products from each of the complete bacterial and archaeal genomes and approximately 35% of those from the yeast Saccharomyces cerevisiae genome. The COG database is accompanied by the COGNITOR program, which is used to fit new proteins into the COGs and can be applied to functional and phylogenetic annotation of newly sequenced genomes.
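The abstract names the criterion (consistency of genome-specific best hits) but does not spell out the procedure. The following Python sketch illustrates one simplified reading of it: three proteins from three different genomes form a "triangle" when each is the best hit of the others in its genome, and triangles that share a side are merged into a cluster. The reciprocity requirement, the merge rule, the function names and the toy best-hit table are all assumptions made for illustration, not the authors' implementation.

```python
from itertools import combinations, permutations

def find_triangles(best_hit):
    """Find triples of proteins from three different genomes that are
    mutually consistent best hits.

    best_hit maps (protein, target_genome) -> best-hit protein in that genome,
    where a protein is a (genome, name) tuple. Hypothetical data layout.
    """
    proteins = sorted({protein for protein, _genome in best_hit})
    triangles = set()
    for trio in combinations(proteins, 3):
        # All three proteins must come from different genomes.
        if len({genome for genome, _name in trio}) < 3:
            continue
        # Simplifying assumption: every side must be a reciprocal best hit.
        if all(best_hit.get((x, y[0])) == y for x, y in permutations(trio, 2)):
            triangles.add(frozenset(trio))
    return triangles

def merge_triangles(triangles):
    """Merge triangles that share a side (two proteins) into clusters."""
    clusters = [set(t) for t in sorted(triangles, key=sorted)]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if len(clusters[i] & clusters[j]) >= 2:
                    clusters[i] |= clusters[j]
                    del clusters[j]
                    merged = True
                    break
            if merged:
                break
    return clusters

# Toy data (invented): four proteins that are mutual best hits across
# genomes A-D, plus a paralog a2 whose best hit b1 does not point back to it.
core = [("A", "a1"), ("B", "b1"), ("C", "c1"), ("D", "d1")]
toy_best_hit = {(x, y[0]): y for x, y in permutations(core, 2)}
toy_best_hit[(("A", "a2"), "B")] = ("B", "b1")

clusters = merge_triangles(find_triangles(toy_best_hit))
print(clusters)
```

In this toy case the four mutually consistent proteins end up in a single cluster, while a2 is left out because its best hit is not consistent in the reverse direction.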

Figures

Figure 1
Representation of the protein sets from complete genomes in the COGs. The data are sorted by the decreasing fraction of proteins included in the COGs (from left to right). Species names: Aae, Aquifex aeolicus; Tma, Thermotoga maritima; Mge, Mycoplasma genitalium; Rpr, Rickettsia prowazekii; Hin, Haemophilus influenzae; Mth, Methanobacterium thermoautotrophicum; Afu, Archaeoglobus fulgidus; Ctr, Chlamydia trachomatis; Mja, Methanococcus jannaschii; Jhp, Helicobacter pylori J strain; Tpa, Treponema pallidum; Bsu, Bacillus subtilis; Eco, Escherichia coli; Hpy, Helicobacter pylori; Pho, Pyrococcus horikoshii; Mpn, Mycoplasma pneumoniae; Cpn, Chlamydia pneumoniae; Ssp, Synechocystis sp.; Mtu, Mycobacterium tuberculosis; Bbu, Borrelia burgdorferi; Sce, Saccharomyces cerevisiae.
Figure 2
Classification of the COGs by functional categories. One-letter abbreviations for the functional categories: J, translation, including ribosome structure and biogenesis; L, replication, recombination and repair; K, transcription; O, molecular chaperones and related functions; M, cell wall structure and biogenesis and outer membrane; N, secretion, motility and chemotaxis; T, signal transduction; P, inorganic ion transport and metabolism; C, energy production and conversion; G, carbohydrate metabolism and transport; E, amino acid metabolism and transport; F, nucleotide metabolism and transport; H, coenzyme metabolism; I, lipid metabolism; D, cell division and chromosome partitioning; R, general functional prediction only; S, no functional prediction.
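For readers working with COG annotations programmatically, the one-letter codes in the Figure 2 caption can be held in a simple lookup table. The sketch below transcribes the codes from the caption; the `tally_by_category` helper and the sample COG identifiers are hypothetical and only show how such a table might be used.

```python
from collections import Counter

# One-letter functional categories, transcribed from the Figure 2 caption.
FUNCTIONAL_CATEGORIES = {
    "J": "translation, including ribosome structure and biogenesis",
    "L": "replication, recombination and repair",
    "K": "transcription",
    "O": "molecular chaperones and related functions",
    "M": "cell wall structure and biogenesis and outer membrane",
    "N": "secretion, motility and chemotaxis",
    "T": "signal transduction",
    "P": "inorganic ion transport and metabolism",
    "C": "energy production and conversion",
    "G": "carbohydrate metabolism and transport",
    "E": "amino acid metabolism and transport",
    "F": "nucleotide metabolism and transport",
    "H": "coenzyme metabolism",
    "I": "lipid metabolism",
    "D": "cell division and chromosome partitioning",
    "R": "general functional prediction only",
    "S": "no functional prediction",
}

def tally_by_category(cog_to_code):
    """Count COGs per category name, given a mapping of COG id -> one-letter code.

    Hypothetical helper: unknown codes raise KeyError rather than being guessed.
    """
    return Counter(FUNCTIONAL_CATEGORIES[code] for code in cog_to_code.values())

# Invented identifiers, for illustration only.
sample = {"cog_x": "J", "cog_y": "J", "cog_z": "S"}
for name, count in tally_by_category(sample).items():
    print(f"{count}\t{name}")
```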
Figure 3
Distribution of the COGs by the number of phylogenetic lineages. Typically, a lineage is represented by only one species. However, the following pairs of (relatively) close bacterial species were merged and treated as a single entity prior to the COG construction: Mycoplasma genitalium and Mycoplasma pneumoniae, Chlamydia trachomatis and Chlamydia pneumoniae, Escherichia coli and Haemophilus influenzae, and two strains of Helicobacter pylori.
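The lineage merging described in the Figure 3 caption can be sketched as a mapping from species to lineages, using the three-letter species codes given in the Figure 1 caption. The lineage labels for the merged pairs, the function names and the sample COG memberships below are assumptions for illustration; only the species list and the four merged pairs come from the captions.

```python
from collections import Counter

# The 21 species codes from the Figure 1 caption.
SPECIES = [
    "Aae", "Tma", "Mge", "Rpr", "Hin", "Mth", "Afu", "Ctr", "Mja", "Jhp", "Tpa",
    "Bsu", "Eco", "Hpy", "Pho", "Mpn", "Cpn", "Ssp", "Mtu", "Bbu", "Sce",
]

# Pairs merged into a single lineage per the Figure 3 caption.
# The combined labels (e.g. "Mge+Mpn") are made up for this sketch.
MERGED_PAIRS = [("Mge", "Mpn"), ("Ctr", "Cpn"), ("Eco", "Hin"), ("Hpy", "Jhp")]

LINEAGE_OF = {code: code for code in SPECIES}
for first, second in MERGED_PAIRS:
    LINEAGE_OF[first] = LINEAGE_OF[second] = f"{first}+{second}"

def lineage_count(species_in_cog):
    """Number of distinct lineages represented among a COG's member species."""
    return len({LINEAGE_OF[code] for code in species_in_cog})

def lineage_histogram(cogs):
    """Map 'number of lineages' -> 'number of COGs with that many lineages'."""
    return Counter(lineage_count(members) for members in cogs.values())

# Invented memberships, for illustration only.
sample_cogs = {
    "cog_x": {"Eco", "Hin", "Bsu"},         # Eco and Hin count once
    "cog_y": {"Mge", "Mpn", "Ctr", "Cpn"},  # two merged pairs
    "cog_z": {"Sce", "Mja", "Aae"},
}
print(len(set(LINEAGE_OF.values())), "lineages in total")
print(lineage_histogram(sample_cogs))
```

With 21 genomes and four merged pairs, this mapping yields 17 lineages.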
