PLoS Comput Biol. 2010 Nov 11;6(11):e1000989.
doi: 10.1371/journal.pcbi.1000989.

Local gene regulation details a recognition code within the LacI transcriptional factor family

Francisco M Camas et al. PLoS Comput Biol.

Abstract

The specific binding of regulatory proteins to DNA sequences exhibits no clear patterns of association between amino acids (AAs) and nucleotides (NTs). This complexity of protein-DNA interactions raises the question of whether a simple set of wide-coverage recognition rules can ever be identified. Here, we analyzed this issue using the extensive LacI family of transcriptional factors (TFs). We searched for recognition patterns by introducing a new approach to phylogenetic footprinting, based on the pervasive presence of local regulation in prokaryotic transcriptional networks. We identified a set of specificity correlations, determined by two AAs of the TFs and two NTs in the binding sites, that is conserved throughout a dominant subgroup within the family regardless of evolutionary distance and that acts as a relatively consistent recognition code. The proposed rules are confirmed by data from previous experimental studies and by events of convergent evolution in the phylogenetic tree. The presence of a code emphasizes the stable structural context of the LacI family and defines a precise blueprint for reprogramming TF specificity, with many practical applications.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. HTH binding mode.
A) X-ray model for a LacI dimer bound to a palindromic BS (plotted with Jmol from the PDB structure 1lbg4). Only the binding domain of each monomer is shown (in light and dark purple, respectively). The hinge helix and the recognition helix of each monomer are colored in yellow and red, respectively. B) Logo for the alignment of 2639 non-redundant HTH-LacI domains. The AA coordinates of any particular domain are referred to by their position in this alignment; they match the numbering of the first 71 AAs of Escherichia coli's GalR and GalS regulators. Helix-1, helix-2 (the recognition helix) and the intermediate residues constitute the HTH motif itself. C) Logo for the alignment of the set of BSs associated with 370 LacI family members (BS sequences from RegTransBase [46]). In BS logos, we omitted subscripts for the left and right half-site coordinates.
Figure 2. Autoregulation and the search for conserved binding sequences.
A) Local regulation at the core of phylogenetic footprinting includes both autoregulation, which can be linked to the regulation of an upstream divergent operon, and downstream unidirectional adjacent regulation (BSs, white boxes). Red and green lines mark the strict and extended regions of the BS search, respectively. B–D) Examples of BS logos. The remaining cases are in the Appendix of Text S1. Above each logo: the recognition sequence (AA-15, AA-16) and a triad of numbers (i/ii/iii) corresponding to i) the total number of TFs exhibiting the recognition sequence, ii) the number of TFs for which at least one BS was found, and iii) the total number of BSs found. E) Consensus logo for the BSs associated with the TVSR group. The inserted position NT-2bis for the YQ logo in C) was not considered when building the consensus.
Figure 3. Degeneracies in TF binding.
A) Palindromic (P1 and P2) and non-palindromic (M1 and M2) BSs. Only the nucleotides (NTs) at positions 4 and 5, in both half sites and both strands, were considered. Colors distinguish different NT pairs. Only the sense strand (black line) is included in the alignment of BSs. Half sites are separated by dots. B) Scenarios for degeneracy. Spheres represent two different TFs sharing the same recognition amino acids. Arrows indicate which BSs they can bind. We considered both mixtures to have the same binding energy and termed them simply M. See main text for details. C) The notation for degeneracies uses different arrows between the corresponding left semisequences [formula image] in the sense strand of the palindromic combinations.
Figure 4. Recognition code and experimental confirmations.
A) Sequence correlations between (AA-15, AA-16) and (NT-5, NT-4), extracted from the correlations in Table S1. AA sequences recognizing the same NT sequence were grouped. Here, we considered only significant palindromic NT sequences; for example, (NT-5, NT-4) = TG means [formula image]. We included the case (AA-15, AA-16) = YQ, corresponding to the synthetic SymL site in C). Recognition degeneracies are represented as unidirectional arrows (asymmetrical intrinsic), bidirectional divergent arrows (symmetrical intrinsic), and bidirectional convergent arrows (extrinsic). Colors denote polar (green), basic (blue), acidic (red) and hydrophobic (black) amino acids. B) Agreement between synthetic and natural data. Recognition of (NT-5, NT-4)-palindromes by different (AA-15, AA-16)-LacI mutants (YQ is the wild type). Data from [ref], from which we considered only those (AA-15, AA-16) sequences with a natural correspondence in Table S1. The remaining BS positions are as in SymL. The larger the TF/BS affinity, the stronger the repression of the β-galactosidase activity. Experimental conditions limited repression to a factor of 200. Arrows again indicate degeneracy classes. Predictions for wild type YQ correspond to asymmetric natural BSs (see text). The (NT-5, NT-4)-palindromes involved in the predicted correlations for PM ([formula image], see Table S1) lack an experimental test. Accordingly, PM does not exhibit a strong affinity for any of the tested palindromes (see Fig. S3). C) Natural and synthetic operators. A dot separates the half sites. Flanking nucleotides are set off by a space to aid visualization of the highly conserved central region of the BSs. Colors identify different palindromic or mixed combinations in the specificity nucleotides (see Table S2 for more details).
Figure 5. Convergence of binding modes in the gene tree.
Gene tree involving all TFs with BSs in Table S1 (623 TFs) plus the 3 TFs with [formula image] binding to natural SymL-like BSs (Fig. 4.C and Table S2). Only one BS per TF is shown. The external color code displays the specificity-associated positions; to aid visualization of palindromic combinations, right positions are read in the complementary (c) strand: [formula image]. The background color in several branches corresponds to different recognition AAs (only a few recognition classes are highlighted). The external color code in these branches uses darker colors to aid visualization. Dots on branches denote bootstrap values larger than 80 (out of 100 trees in total; see Fig. S4 for more details).
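
The captions for Figures 3 and 5 distinguish palindromic from mixed combinations by reading the right half site on the complementary strand. A minimal Python sketch of that comparison follows. The function names, the "palindromic"/"mixed" labels as return values, and the two-nucleotide example inputs are assumptions made for illustration; they are not taken from the article's data or methods.

```python
# Illustrative sketch only: names and example sequences are hypothetical,
# not data or code from the article.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[nt] for nt in reversed(seq.upper()))


def is_palindromic(left_half: str, right_half: str) -> bool:
    """True when the right half site, read on the complementary strand,
    equals the left half site read on the sense strand."""
    return left_half.upper() == reverse_complement(right_half)


def classify_site(left_half: str, right_half: str) -> str:
    """Label a pair of half-site specificity nucleotides (assumed labels)."""
    return "palindromic" if is_palindromic(left_half, right_half) else "mixed"
```

For example, under these assumptions `classify_site("TG", "CA")` reports a palindromic combination, because `CA` read on the complementary strand is `TG`, whereas `classify_site("TG", "CG")` reports a mixed one.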

References

    1. Seeman NC, Rosenberg JM, Rich A. Sequence-specific recognition of double helical nucleic acids by proteins. Proc Natl Acad Sci U S A. 1976;73:804–808.
    2. Pabo CO, Sauer RT. Protein-DNA recognition. Annu Rev Biochem. 1984;53:293–321.
    3. Desjarlais JR, Berg JM. Toward rules relating zinc finger protein sequences and DNA binding site preferences. Proc Natl Acad Sci U S A. 1992;89:7345–7349.
    4. Suzuki M, Brenner SE, Gerstein M, Yagi N. DNA recognition code of transcription factors. Protein Eng Des Sel. 1995;8:319–328.
    5. Choo Y, Klug A. Physical basis of a protein-DNA recognition code. Curr Opin Struct Biol. 1997;7:117–125.