Review
Genome Biol. 2003;4(8):115.
doi: 10.1186/gb-2003-4-8-115. Epub 2003 Jul 16.

Comparative genomics of Archaea: how much have we learned in six years, and what's next?

Kira S Makarova et al. Genome Biol. 2003.

Abstract

Archaea comprise one of the three distinct domains of life (with bacteria and eukaryotes). With 16 complete archaeal genomes sequenced to date, comparative genomics has revealed a conserved core of 313 genes that are represented in all sequenced archaeal genomes, plus a variable 'shell' that is prone to lineage-specific gene loss and horizontal gene exchange. The majority of archaeal genes have not been experimentally characterized, but novel functional pathways have been predicted.
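The core/shell distinction in the abstract can be illustrated as a set operation: the core is the set of COGs present in every genome, and the shell is everything else that appears in at least one genome. The following is a minimal sketch under stated assumptions, not the authors' pipeline. The genome names, the COG memberships, and the helper names `core_and_shell` and `running_core_sizes` are hypothetical and chosen only for illustration.

```python
# Hypothetical toy data: genome label -> set of COG identifiers found in it.
# Names and memberships are invented; they are not taken from the paper.
genomes = {
    "GenomeA": {"COG0001", "COG0002", "COG0003", "COG0004"},
    "GenomeB": {"COG0001", "COG0002", "COG0003"},
    "GenomeC": {"COG0001", "COG0002", "COG0005"},
}


def core_and_shell(cog_sets):
    """Return (core, shell): COGs present in all genomes, and the remainder."""
    sets = list(cog_sets.values())
    core = set.intersection(*sets)
    shell = set.union(*sets) - core
    return core, shell


def running_core_sizes(cog_sets):
    """Size of the shared COG set after each genome is added, in dict order."""
    sizes = []
    running = None
    for cogs in cog_sets.values():
        running = set(cogs) if running is None else running & cogs
        sizes.append(len(running))
    return sizes


core, shell = core_and_shell(genomes)
sizes = running_core_sizes(genomes)
print(sorted(core), sorted(shell), sizes)
```

Because the core is an intersection, it can only stay the same size or shrink as genomes are added, which is the behavior the running sizes are meant to show.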

Figures

Figure 1
The archaeal gene core: changes resulting from the appearance of new genome sequences. Black bars indicate the current set of pan-archaeal genes (313 COGs); gray indicates COGs that are not part of the current pan-archaeal core but are seen to be conserved after the addition of the given genome sequence. The genomes are listed from left to right in chronological order of release of the complete sequence; species name abbreviations are as in Table 1.
Figure 2
Functional breakdown of genes within the conserved archaeal core. 'Universal' indicates genes with orthologs in both bacteria and eukaryotes; 'eukaryotic', genes with orthologs only in eukaryotes; 'bacterial', genes with orthologs only in bacteria; 'archaeal', genes without non-archaeal orthologs. The data on orthology and functional classification are derived from the COGs.
Figure 3
The most parsimonious scenario for the evolution of the main lineages of life. The red numbers in ovals near the internal nodes show the size of the reconstructed gene sets of the respective ancestral forms. Green numbers show gene gains and brown numbers gene losses assigned to each of the branches in the tree. LUCA, last universal common ancestor.
Figure 4
Functional breakdown of genes in each of the sequenced archaeal genomes. The data are from COGs; species name abbreviations are as in Table 1.
Figure 5
Prediction of gene functions in archaea by genomic context analysis. (a) The superoperon coding for the predicted archaeal exosome (see [88]). (b) The partially conserved gene neighborhood coding for the predicted repair system found in archaeal and bacterial thermophiles (see [59] for details). (c-e) Predicted operons containing uncharacterized genes in the neighborhood of genes from the following COGs: COG1594, DNA-directed RNA polymerase, subunit M, and transcription elongation factor TFIIS (RPB9); COG0592, encoding a DNA polymerase sliding clamp subunit (PCNA ortholog); COG1631, ribosomal protein L44E; COG1095, DNA-directed RNA polymerase, subunit E' (RPB7); COG2093, DNA-directed RNA polymerase, subunit E" (RPE2); COG2004, ribosomal protein S24E; COG1709, transcriptional regulator; COG3425, 3-hydroxy-3-methylglutaryl CoA synthase (PksG); COG0183, acetyl-CoA acetyltransferase (Fad A/PaaJ orthologs). UC, uncharacterized, shown by white arrows. Species abbreviations are as in Table 1. Genes are not shown to scale and are denoted by their respective gene names (some are discussed further in the text); arrows indicate the direction of transcription. A solid line connects genes in a predicted operon. Species that have the same operon organization as the listed species are indicated in parentheses. Orthologous genes are aligned. Genes with similar general functions are shown by the same shading. Broken lines show that genes are in the same predicted operon but are not adjacent. Small arrows indicate the presence of additional functionally related genes in the same predicted operon; these genes are not shown for lack of space.
Figure 6
Lineage-specific expansions of paralogous gene families in archaea. The vertical axis shows the number of members of the indicated COGs. (a) COG0477, permeases of the major facilitator superfamily; COG0531, amino-acid transporters. (b) COG1145, ferredoxin. (c) COG2101, TATA-box binding protein (TBP), a component of transcription initiation factors TFIID and TFIIIB; COG1405, Brf1 subunit of transcription-initiation factor TFIIIB and transcription-initiation factor TFIIB. (d) COG1708, 'minimal' nucleotidyltransferase catalytic subunit; COG2250, 'minimal' nucleotidyltransferase accessory subunit. Species abbreviations are as in Table 1.
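The quantity plotted in Figure 6, the number of members of a given COG in each genome, can be tallied with a simple counter. The sketch below is illustrative only: the genome labels and gene assignments are invented, and the fold-over-median rule used to flag an expansion (`fold=2.0`) is an assumption made for this example, not a criterion stated in the source.

```python
import statistics
from collections import Counter

# Hypothetical toy data: genome label -> COG assignment of each gene.
# The counts are invented and do not reproduce Figure 6.
assignments = {
    "GenomeA": ["COG0477", "COG0477", "COG0477", "COG0477", "COG1145"],
    "GenomeB": ["COG0477", "COG1145", "COG1145"],
    "GenomeC": ["COG0477", "COG1145"],
}


def cog_counts(genome_assignments):
    """Count the members of each COG per genome."""
    return {g: Counter(cogs) for g, cogs in genome_assignments.items()}


def expanded_genomes(counts, cog, fold=2.0):
    """Genomes whose member count for `cog` is at least `fold` times the
    median count across genomes (an assumed, illustrative threshold)."""
    values = [c[cog] for c in counts.values()]
    median = statistics.median(values)
    if median == 0:
        return []
    return [g for g, c in counts.items() if c[cog] >= fold * median]


counts = cog_counts(assignments)
print(expanded_genomes(counts, "COG0477"))
print(expanded_genomes(counts, "COG1145"))
```

A median baseline is used here so that a single expanded genome does not inflate the reference value; other baselines would be equally defensible.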

References

    1. Woese CR, Fox GE. Phylogenetic structure of the prokaryotic domain: the primary kingdoms. Proc Natl Acad Sci USA. 1977;74:5088–5090.
    2. Fox GE, Stackebrandt E, Hespell RB, Gibson J, Maniloff J, Dyer TA, Wolfe RS, Balch WE, Tanner RS, Magrum LJ, et al. The phylogeny of prokaryotes. Science. 1980;209:457–463.
    3. Woese CR, Kandler O, Wheelis ML. Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proc Natl Acad Sci USA. 1990;87:4576–4579.
    4. Woese CR, Gupta R. Are archaebacteria merely derived 'prokaryotes'? Nature. 1981;289:95–96.
    5. Mayr E. Two empires or three? Proc Natl Acad Sci USA. 1998;95:9720–9723.