PLoS One. 2018 Apr 3;13(4):e0195332.
doi: 10.1371/journal.pone.0195332. eCollection 2018.

Abundance, diversity and domain architecture variability in prokaryotic DNA-binding transcription factors

Ernesto Perez-Rueda et al. PLoS One. 2018.

Abstract

Gene regulation at the transcriptional level is a central process in all organisms, and DNA-binding transcription factors (TFs) play a fundamental role in it. These proteins usually bind specific DNA sequences, activating or repressing gene expression. In general, TFs are composed of two domains: the DNA-binding domain (DBD) and an extra domain, which in this work we have named the "companion domain" (CD). The CD can be involved in one or more functions, such as ligand binding, protein-protein interactions or even enzymatic activity. In contrast to DBDs, which have been widely characterized both experimentally and bioinformatically, information on the abundance, distribution, variability and possible role of CDs is scarce. Here, we investigated these issues in relation to the domain architectures of TFs in prokaryotic genomes. To this end, 19 families of TFs in 761 non-redundant bacterial and archaeal genomes were evaluated. We found four main groups based on abundance and distribution in the analyzed genomes: i) LysR and TetR/AcrR; ii) AraC/XylS, SinR, and others; iii) Lrp, Fis, ArsR, and others; and iv) a group that included only two families, ArgR and BirA. When the organisms were classified according to their lifestyles, regulatory families were found to be more abundant in free-living organisms than in pathogenic, extremophilic or intracellular organisms. Finally, an analysis of protein architecture diversity in the 19 families, using a weight score for domain promiscuity, showed which regulatory families are characterized by a large diversity of CDs (here named "promiscuous" families, given the elevated number of variable domains found in those TFs) and which by a low diversity of CDs. Altogether, this information helped us to understand the diversity and distribution of the 19 prokaryotic TF families. Moreover, initial steps were taken to understand the variability of the extra domain in those TFs, which might eventually assist evolutionary and functional studies.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Fig 1. Proportion of TFs in organisms depending on their lifestyles.
Organisms were classified into one of four categories: intracellular, pathogens, extremophiles, and free-living, following [18]. TF proportions were calculated as the ratio between the total number of TFs and the genome size (in ORFs). The line shown inside the box is the median value. The whisker caps represent the minimum and maximum values. Points outside the bars represent outlier genomes.
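The ratio described in this caption can be sketched as follows. This is only an illustration, not the authors' code: the function name `tf_proportion` and the example counts are hypothetical.

```python
def tf_proportion(n_tfs: int, n_orfs: int) -> float:
    """Return the number of TFs divided by the genome size in ORFs."""
    if n_orfs <= 0:
        raise ValueError("genome size (in ORFs) must be positive")
    return n_tfs / n_orfs


# Hypothetical genome with 300 TFs among 4000 ORFs (made-up numbers).
example = tf_proportion(300, 4000)
```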
Fig 2. Distribution of TF families per lifestyle group.
Each column denotes a lifestyle as described in Fig 1, whereas rows denote the 19 TF families analyzed in this work. The heat map bar at the top of the figure indicates the relative abundance of each family per lifestyle. Four groups of TF families were identified by hierarchical clustering using a Manhattan distance and a supporting tree built with an average linkage algorithm, with uncentered correlation as a similarity measure. Numbers on top of the TF families denote the proportion of these TFs, and the numbers in the upper left section show the weight scores (WS).
Fig 3. Coefficient of Variation per family.
To assess how complex the TFs in each family are, TF families were grouped into three classes (indicated by circles) depending on their coefficient of variation (CV), as follows: 0.9–1.36, highly promiscuous; 1.74–2.5, intermediate promiscuity; 3.5–4.5, not promiscuous or monolithic. The X-axis indicates the CV per family. The Y-axis indicates the total number of companion domains. CV was determined as the ratio of the standard deviation to the mean for all the WS in the corresponding TF family (for more details, please refer to the Methods section).
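As a minimal sketch of the CV definition and the three classes in this caption, consider the following. It is not the authors' implementation: the function names are hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, since the caption does not say whether the population or the sample form was used.

```python
from statistics import mean, pstdev

# CV ranges as stated in the caption; values between the ranges are unlabeled.
CLASSES = [
    (0.9, 1.36, "highly promiscuous"),
    (1.74, 2.5, "intermediate promiscuity"),
    (3.5, 4.5, "not promiscuous (monolithic)"),
]


def coefficient_of_variation(weight_scores):
    """Standard deviation divided by the mean of a family's weight scores."""
    m = mean(weight_scores)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    # Assumption: population standard deviation.
    return pstdev(weight_scores) / m


def promiscuity_class(cv):
    """Map a CV value to the caption's class label, if it falls in a range."""
    for low, high, label in CLASSES:
        if low <= cv <= high:
            return label
    return "unclassified"
```

Swapping `pstdev` for `statistics.stdev` gives the sample form; for small families the two can differ noticeably.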
Fig 4. Architecture of families as a function of the genome size.
The TF families were classified into three groups, as shown in S1 Table and Fig 3. Examples of these three groups are shown as follows: a) highly promiscuous, the AraC/XylS family; b) intermediate promiscuity, the BirA family; and c) not promiscuous or monolithic, the Fur family. On the X-axis of each graph, the genome sizes are displayed in eleven windows with a length of 836 ORFs each (see Methods). On the Y-axis, the weight score (WS) is represented (see the Methods section for details). The line shown in the box is the median value. The whisker caps represent the minimum and maximum values.
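The genome-size windows on the X-axis can be illustrated with a small binning sketch. The caption gives only the number of windows (eleven) and their width (836 ORFs); the starting point of the first window is not stated, so the `start` parameter, its default of 0, the half-open intervals and the function name `window_index` are all assumptions made for illustration.

```python
def window_index(n_orfs, start=0, width=836, n_windows=11):
    """Return the 0-based window that a genome of n_orfs ORFs falls into.

    Windows are half-open intervals [start + i*width, start + (i+1)*width).
    """
    offset = n_orfs - start
    if offset < 0 or offset >= width * n_windows:
        raise ValueError("genome size falls outside the defined windows")
    return offset // width
```

With the default (assumed) start of 0, a genome of 4000 ORFs falls in window 4, which covers 3344 to 4179 ORFs.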
Fig 5. Proportion of CDs shared between different families.
To identify companion domains (CDs) that are common to more than one family of TFs, each of the CDs associated with a TF family was compared against the other families. Only domains that were identified as enriched were plotted as a heat map (upper section), in which 0 represents absence and 1.0 represents 100% of CDs in common with two or more families.

References

    1. Engstrom MD, Pfleger BF. Transcription control engineering and applications in synthetic biology. Synth Syst Biotechnol. 2017;2(3):176–91. doi: 10.1016/j.synbio.2017.09.003
    2. Perez-Rueda E, Martinez-Nunez MA. The repertoire of DNA-binding transcription factors in prokaryotes: functional and evolutionary lessons. Sci Prog. 2012;95(Pt 3):315–29.
    3. Fernandez-Lopez R, Ruiz R, de la Cruz F, Moncalian G. Transcription factor-based biosensors enlightened by the analyte. Front Microbiol. 2015;6:648. doi: 10.3389/fmicb.2015.00648
    4. Balderas-Martinez YI, Savageau M, Salgado H, Perez-Rueda E, Morett E, Collado-Vides J. Transcription factors in Escherichia coli prefer the holo conformation. PLoS One. 2013;8(6):e65723. doi: 10.1371/journal.pone.0065723
    5. Martinez-Nunez MA, Poot-Hernandez AC, Rodriguez-Vazquez K, Perez-Rueda E. Increments and duplication events of enzymes and transcription factors influence metabolic and regulatory diversity in prokaryotes. PLoS One. 2013;8(7):e69707. doi: 10.1371/journal.pone.0069707
