Mol Biol Evol. 2010 Jun;27(6):1449-59.
doi: 10.1093/molbev/msq033. Epub 2010 Feb 1.

Identification and genomic analysis of transcription factors in archaeal genomes exemplifies their functional architecture and evolutionary origin

Ernesto Pérez-Rueda et al. Mol Biol Evol. 2010 Jun.

Abstract

Archaea, which represent a large fraction of the phylogenetic diversity of organisms, are prokaryotes with eukaryote-like basal transcriptional machinery. This organization makes the study of their DNA-binding transcription factors (TFs) and their transcriptional regulatory networks particularly interesting. In addition, experimental data on their TFs are limited. In this work, 3,918 TFs were identified and exhaustively analyzed in 52 archaeal genomes. TFs represented less than 5% of the gene products in all the studied species, a proportion comparable with that identified in parasites or intracellular pathogenic bacteria, suggesting a deficit in this class of proteins. A total of 75 families were identified, of which the HTH_3, AsnC, TrmB, and ArsR families were universally and abundantly identified in all the archaeal genomes. We found that archaeal TFs are significantly smaller than other protein-coding genes in archaea and than bacterial TFs, suggesting that a large fraction of these small-sized TFs could compensate for the probable deficit of TFs in archaea, possibly by forming different combinations of monomers, similar to what is observed in eukaryotic transcriptional machinery. Our results show that although the DNA-binding domains of archaeal TFs are similar to those of bacteria, ligand-binding domains are underrepresented in smaller TFs. This suggests that protein-protein interactions may act as mediators of regulatory feedback, indicating that the functionality of archaeal TFs is a chimera of bacterial and eukaryotic TF functionality. The analysis presented here contributes to the understanding of the details of the transcriptional apparatus in archaea and provides a framework for the analysis of regulatory networks in these organisms.

Figures

FIG. 1.
Flowchart showing the different steps involved in the identification of a high-confidence set of archaeal TFs. Branch points on the vertical line from top to bottom correspond to the stage at which a particular step was taken in the process of obtaining a cleaner dataset.
FIG. 2.
a) Distribution of TFs identified in 52 archaeal genomes. Nanoarchaeum equitans (Neq), Haloarcula marismortui (Hma), Methanospirillum hungatei (Mhu), and Methanosarcina acetivorans C2A (Mac) are indicated as a reference. On the x axis, genomes are sorted from smallest to largest size, and on the y axis, the number of TFs is plotted. A linear regression was fitted, and the squared Pearson correlation (r²) was calculated between the number of genes and the total number of TFs. b) Proportion of TFs in all the archaeal genomes. The proportion of TFs was calculated as the fraction of ORFs encoding TFs and plotted against the total number of ORFs for each genome. Pyrococcus horikoshii (pho) and Pyrococcus abyssi (pab) are indicated as a reference. On the x axis, genomes are sorted from smallest to largest size, and on the y axis, the fraction of TFs is plotted.
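The two quantities described in this caption, the squared Pearson correlation between gene count and TF count and the per-genome TF fraction, can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the genome values are hypothetical placeholders and are not taken from the paper.

```python
from math import sqrt


def pearson_r2(xs, ys):
    """Return the squared Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    r = cov / sqrt(var_x * var_y)
    return r * r


def tf_fraction(n_tfs, n_orfs):
    """Fraction of ORFs in a genome that encode TFs."""
    return n_tfs / n_orfs


# Hypothetical (ORF count, TF count) pairs; NOT values from the paper.
genomes = {
    "genome_A": (600, 12),
    "genome_B": (2000, 70),
    "genome_C": (4500, 160),
}

orfs = [n_orfs for n_orfs, _ in genomes.values()]
tfs = [n_tfs for _, n_tfs in genomes.values()]

r2 = pearson_r2(orfs, tfs)
fractions = {name: tf_fraction(t, o) for name, (o, t) in genomes.items()}
```

Panel a corresponds to `r2` computed over all genomes, and panel b to plotting each entry of `fractions` against that genome's ORF count.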
FIG. 3.
Distribution of amino acid sequence lengths for TFs. On the x axis, the intervals of protein size are shown, and on the y axis, the normalized frequency of TFs per interval is shown. One thousand groups of 3,918 protein sequences were randomly retrieved from archaeal genome sequences to compare the length distribution of TFs against other protein-coding genes. In each length interval, bars marked as random represent the proportion of proteins in that interval ± the standard deviation from the average across the random samples.
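The random background described in this caption can be sketched as a resampling procedure: draw many groups of proteins, bin their lengths, and report the mean and standard deviation of each bin across groups. The sketch below makes several assumptions not stated in the caption: sampling within a group is without replacement, bins have a fixed width with the last bin collecting overflow, and the proteome is a synthetic list of lengths. All names are hypothetical.

```python
import random
from statistics import mean, stdev


def length_histogram(lengths, bin_width, n_bins):
    """Normalized frequency of lengths per interval; the last bin collects overflow."""
    counts = [0] * n_bins
    for length in lengths:
        idx = min(length // bin_width, n_bins - 1)
        counts[idx] += 1
    total = len(lengths)
    return [c / total for c in counts]


def random_background(all_lengths, sample_size, n_samples, bin_width, n_bins, seed=0):
    """Per-bin mean and standard deviation over repeated random samples."""
    rng = random.Random(seed)
    hists = [
        # Assumption: each group is drawn without replacement.
        length_histogram(rng.sample(all_lengths, sample_size), bin_width, n_bins)
        for _ in range(n_samples)
    ]
    means = [mean(col) for col in zip(*hists)]
    sds = [stdev(col) for col in zip(*hists)]
    return means, sds


# Synthetic proteome lengths for illustration only; NOT data from the paper.
proteome_rng = random.Random(1)
proteome = [proteome_rng.randint(50, 1200) for _ in range(5000)]

# The caption's setting would be sample_size=3918 and n_samples=1000;
# smaller values are used here to keep the example fast.
means, sds = random_background(
    proteome, sample_size=500, n_samples=100, bin_width=100, n_bins=10
)
```

The TF length histogram computed with the same `length_histogram` call can then be compared bin by bin against `means` ± `sds`.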
FIG. 4.
Abundance of TF families in archaeal genomes. The proportion of TFs in each family was calculated as the fraction of the total TFs identified that belonged to that family. The families are displayed from largest to smallest size. Families with fewer than 20 members are not displayed, as together they corresponded to less than 6% of the total dataset.
FIG. 5.
Clustering of TF families and archaeal genomes. A hierarchical centroid linkage-clustering algorithm was applied with uncentered correlation as the similarity measure and complete linkage (Eisen et al. 1998). Brackets indicate the clusters identified by using a correlation value ≥0.6. Nomenclature is as follows: Crenarchaea (C), Euryarchaea (E), Korarchaeota (K), and Nanoarchaeum (N).
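The similarity measure named in this caption, uncentered correlation, is the Pearson formula with both means fixed at zero, which makes it equal to the cosine of the angle between two profiles. The sketch below shows only that measure and the ≥0.6 cutoff applied to pairs of genomes; it does not implement the hierarchical clustering itself, and the family-count profiles are hypothetical.

```python
from math import sqrt


def uncentered_correlation(x, y):
    """Pearson-style correlation with both means assumed to be zero (cosine similarity)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = sqrt(sum(a * a for a in x))
    norm_y = sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # convention chosen here for all-zero profiles
    return dot / (norm_x * norm_y)


def pairs_above_threshold(profiles, threshold=0.6):
    """List genome pairs whose profiles correlate at or above the threshold."""
    names = sorted(profiles)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if uncentered_correlation(profiles[a], profiles[b]) >= threshold
    ]


# Hypothetical TF counts per family for three genomes; NOT data from the paper.
profiles = {
    "genome_A": [10, 5, 0, 1],
    "genome_B": [20, 10, 0, 2],
    "genome_C": [0, 1, 9, 0],
}

similar_pairs = pairs_above_threshold(profiles)
```

Because the measure ignores overall magnitude, `genome_A` and `genome_B` are treated as near identical even though one has twice as many TFs in every family.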
FIG. 6.
Distribution of archaeal TFs shared by the three cellular domains: archaea, bacteria, and eukarya. Pie chart showing the distribution of archaeal TF homologues identified in different domains of life. Blast searches were performed with all previously identified TFs against the total sequences of bacterial and eukaryotic genomes. A protein was considered a homologue if the alignment covered ≥60% of the query sequence, with an E value ≤10⁻⁶.
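The homologue criterion in this caption (query coverage of at least 60% and an E value of at most 10⁻⁶) can be sketched as a simple filter over search hits. The `Hit` record and its field names are hypothetical and do not correspond to any particular Blast output format; coverage is assumed here to be the aligned span on the query divided by the query length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One search hit; the fields are illustrative, not a real Blast format."""
    query_id: str
    subject_id: str
    query_length: int
    query_start: int  # 1-based, inclusive
    query_end: int    # 1-based, inclusive
    evalue: float


def query_coverage(hit):
    """Fraction of the query sequence spanned by the alignment."""
    return (hit.query_end - hit.query_start + 1) / hit.query_length


def is_homologue(hit, min_coverage=0.60, max_evalue=1e-6):
    """Apply the coverage and E-value thresholds stated in the caption."""
    return query_coverage(hit) >= min_coverage and hit.evalue <= max_evalue


# Hypothetical hits for illustration only.
hits = [
    Hit("tf_1", "bact_9", 100, 1, 80, 1e-20),    # 80% coverage, strong E value
    Hit("tf_2", "bact_3", 200, 50, 120, 1e-30),  # 35.5% coverage, too short
    Hit("tf_3", "euk_7", 150, 1, 150, 1e-3),     # full coverage, weak E value
]

homologues = [h.query_id for h in hits if is_homologue(h)]
```

Both conditions must hold, so a short but highly significant alignment (`tf_2`) and a full-length but weak one (`tf_3`) are both rejected.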
