Review
Microbiol Mol Biol Rev. 2006 Jun;70(2):472-509.
doi: 10.1128/MMBR.00046-05.

Cyanobacterial two-component proteins: structure, diversity, distribution, and evolution

Free PMC article
Mark K Ashby et al. Microbiol Mol Biol Rev. 2006 Jun.

Abstract

A survey of the already characterized and potential two-component protein sequences that exist in the nine complete and seven partially annotated cyanobacterial genome sequences available (as of May 2005) showed that the cyanobacteria possess a much larger repertoire of such proteins than most other bacteria. By analysis of the domain structure of the 1,171 potential histidine kinases, response regulators, and hybrid kinases, many various arrangements of about thirty different modules could be distinguished. The number of two-component proteins is related in part to genome size but also to the variety of physiological properties and ecophysiologies of the different strains. Groups of orthologues were defined, only a few of which have representatives with known physiological functions. Based on comparisons with the proposed phylogenetic relationships between the strains, the orthology groups show that (i) a few genes, some of them clustered on the genome, have been conserved by all species, suggesting their very ancient origin and an essential role for the corresponding proteins, and (ii) duplications, fusions, gene losses, insertions, and deletions, as well as domain shuffling, occurred during evolution, leading to the extant repertoire. These mechanisms are put in perspective with the different genetic properties that cyanobacteria have to achieve genome plasticity. This review is designed to serve as a basis for orienting further research aimed at defining the most ancient regulatory mechanisms and understanding how evolution worked to select and keep the most appropriate systems for cyanobacteria to develop in the quite different environments that they have successfully colonized.


Figures

FIG. 1.
Phylogenetic tree of the cyanobacterial strains whose genome sequences are available. The tree is adapted from previously published trees based on 16S rRNA sequences (J. Elhai, personal communication). Names of the marine strains are in blue. Strains able to fix dinitrogen are boxed in red, and a yellow-green motif inside a box indicates that diazotrophy is linked to heterocyst differentiation.
FIG. 2.
Cyanobacterial two-component ORF repertoire. (A) HKs. (B) RRs. (C) HYs. (D) Others. Subscript numbers following parentheses are the numbers of similar domains that may be found in the proteins listed in a given subclass. Boldface type indicates that orthologues exist in the 16 genomes, and the corresponding phylogenetic trees are shown in Fig. 5. *, proteins with different domain structures belong to this group of orthologues; (formula image), protein that has no bacterial orthologue; (+), putative sequencing error (see below). Putative proteins that have no cyanobacterial orthologue are shown in blue and are identified by their locus tag or GOI. Putative sequencing errors: 7120all1640-1639 (7A instead of 8A would create a frameshift, giving a protein with 95% identity to Avar_400225220); the A of the 7421glr4211 stop codon can serve in the ATG initiation codon for 7421glr4212; Cwat_400841320 could form an HY with Cwat_400841330; and the HATP-RR Cwat_400841330 could form an HYI with Cwat_400841320.
FIG. 3.
Cyanobacterial genes encoding two-component system proteins (see Fig. 1 and 2 for acronyms and abbreviations).
FIG. 4.
Relative abundance of the two-component system proteins in cyanobacterial genomes. (A) Percentage of total putative two-component system proteins (HK+HY+RR) and transcriptional factors (RRII) as a function of the total coding capacity. (B) Total number of response regulators (RR+HY) versus that of histidine kinases (HK+HY). (C) Number of hybrid kinases as a function of the total coding capacity. For each genome, the total coding capacity is given in Fig. 3.
FIG. 5.
Unrooted trees for orthologous proteins present in the 16 cyanobacterial genomes constructed from the amino acid sequences by the neighbor-joining method with the Phylo_Win program (41). The numbers of sites kept by the program for analysis, after global gap removal, are given in parentheses: PsaA (730), Chk33 (580), Crr1 (212), Crr31 (238), Chk34 (362), Chk2 (187), Crr26 (191), and Crr37 (219). Figures at the nodes correspond to the values produced by bootstrap analysis (1,000 replicates). Abbreviations for species and gene names are given in Fig. 1 and in Table S1s in the supplemental material.
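The "global gap removal" step mentioned in this legend can be illustrated with a minimal Python sketch. It assumes that global gap removal means discarding every alignment column in which at least one sequence carries a gap, so that only sites present in all sequences are kept. The function name, the toy sequences, and the "-" gap character are hypothetical illustrations and are not taken from Phylo_Win.

```python
def remove_gapped_columns(alignment, gap="-"):
    """Return (trimmed alignment, number of sites kept).

    `alignment` maps sequence names to aligned strings of equal length.
    A column is kept only if no sequence has a gap at that position
    (assumed meaning of "global gap removal").
    """
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    if len({len(s) for s in seqs}) > 1:
        raise ValueError("all aligned sequences must have the same length")
    # zip(*seqs) iterates over the alignment column by column.
    kept = [col for col in zip(*seqs) if gap not in col]
    trimmed = {
        name: "".join(col[i] for col in kept)
        for i, name in enumerate(names)
    }
    return trimmed, len(kept)


# Toy alignment (hypothetical sequences, not from the paper).
toy = {"s1": "MK-LV", "s2": "MKALV", "s3": "M-ALV"}
trimmed, n_sites = remove_gapped_columns(toy)
print(n_sites, trimmed)
```

Under this assumption, the site counts quoted in parentheses in the legend would correspond to the second value returned for each protein alignment.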
FIG. 6.
Parsimonial hypothesis for the origin of the extant cyanobacterial HKs possessing a CheW domain. (A) The consensus phylogenetic tree based on 16S rRNA sequences (Fig. 1) was used as a scaffold. Three gene copies, coded as yellow, dark blue, and white boxes, should have already existed in the ancestor common to the strains that possess such proteins. The actual repertoire is shown to the right, with the domain structure represented by the letters (A to E′) within the boxes and the color code reflecting the present similarities between the proteins (phylogenetic tree presented in Fig. 7). The letters correspond to the different domains present in the proteins, as shown in the diagram in panel B. Gene duplications that occurred during evolution are represented by blue letters within the boxes, and gene losses are indicated by red letters in dotted boxes. Gains of domains within a gene are shown as green letters, and losses are shown as red letters. Gene names corresponding to the boxes on the right are as follows for each strain, from left to right: Avar_400182220, Avar_400177770, and Avar_400215350; 7120all0926, 7120all2161, and 7120all1068; NpunNpF5964, NpunNpF5640, NpunNpF2165, NpunNpR6010, and NpunNpR0245; Tery_403243570, Tery_403235320, and Tery_403260730; 6803slr0322, 6803sll0043, and 6803sll1296; Cwat_400838100, Cwat_400850380, Cwat_400866330, and Cwat_400882520; TBP1tlr0349, TBP1tll0568, and TBP1tll1021; 7942_403099980 and 7942_403098400. (B) Diagram showing the correspondence between letters and domains (see the legend to Fig. 2 and Table S2s, in the supplemental material, for domain abbreviations).
FIG. 7.
Unrooted phylogenetic tree constructed from amino acid sequences by the neighbor-joining method with the Phylo_Win program (41). A total of 406 sites were kept by the program for analysis, after global gap removal. Figures at the nodes correspond to the values produced by bootstrap analysis (1,000 replicates). Abbreviations for species and gene names are given in Fig. 1 and in Table S1s in the supplemental material. The proteins belong to the HYVI+CheW subclass and to HKV+CheW for Tery_403260730. Color coding corresponds to that used in Fig. 6.

References

    1. Aiba, H., M. Nagaya, and T. Mizuno. 1993. Sensor and regulator proteins from the cyanobacterium Synechococcus sp. PCC7942 that belong to the bacterial signal-transduction protein families: implication in the adaptive response to phosphate limitation. Mol. Microbiol. 8:81-91.
    2. Altschul, S. F., T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389-3402.
    3. Alves, R., and M. A. Savageau. 2003. Comparative analysis of prototype two-component systems with either bifunctional or monofunctional sensors: differences in molecular structure and physiological function. Mol. Microbiol. 48:25-51.
    4. Anantharaman, V., and L. Aravind. 2000. Cache—a signaling domain common to animal Ca(2+)-channel subunits and a class of prokaryotic chemotaxis receptors. Trends Biochem. Sci. 25:535-537.
    5. Anantharaman, V., E. V. Koonin, and L. Aravind. 2001. Regulatory potential, phyletic distribution and evolution of ancient, intracellular small-molecule-binding domains. J. Mol. Biol. 307:1271-1292.
