[Preprint]. 2024 Nov 10:2024.11.10.622515.
doi: 10.1101/2024.11.10.622515.

ComFB, a new widespread family of c-di-NMP receptor proteins

Sherihan Samir et al. bioRxiv.

Update in

  • ComFB, a widespread family of c-di-NMP receptor proteins.
    Samir S, Elshereef AA, Alva V, Hahn J, Eck F, Celma L, Lopes ES, Thormann K, Dubnau D, Galperin MY, Selim KA. Proc Natl Acad Sci U S A. 2025 Sep 23;122(38):e2513041122. doi: 10.1073/pnas.2513041122. Epub 2025 Sep 18. PMID: 40966295.

Abstract

Cyclic dimeric GMP (c-di-GMP) is a widespread bacterial second messenger that controls a variety of cellular functions, including protein and polysaccharide secretion, motility, cell division, cell development, and biofilm formation, and contributes to the virulence of some important bacterial pathogens. While the genes for diguanylate cyclases and c-di-GMP hydrolases (active or mutated) can be easily identified in microbial genomes, the list of c-di-GMP receptor domains is quite limited, and only two of them, PilZ and MshEN, are found across multiple bacterial phyla. Recently, a new c-di-GMP receptor protein, named CdgR or ComFB, has been identified in cyanobacteria and shown to regulate their cell size and, more recently, natural competence. Sequence and structural analysis indicated that CdgR is part of a widespread ComFB protein family, named after the "late competence development protein ComFB" from Bacillus subtilis. This prompted the suggestion that ComFB and ComFB-like proteins could also be c-di-GMP receptors. Indeed, we found that ComFB proteins from the Gram-positive bacteria B. subtilis and Thermoanaerobacter brockii bind c-di-GMP with high affinity. The ability to bind c-di-GMP was also demonstrated for the ComFB proteins from the clinically relevant Gram-negative bacteria Vibrio cholerae and Treponema denticola. These observations indicate that the ComFB family serves as yet another widespread family of bacterial c-di-GMP receptors. Incidentally, some ComFB proteins were also capable of binding c-di-AMP, identifying them as a unique family of c-di-NMP receptor proteins. The overexpression of comFB in B. subtilis, combined with an elevated concentration of c-di-GMP, suppressed motility, attesting to the biological relevance of ComFB as a c-di-GMP binding protein.

Figures

Figure 1. Sequence and structural conservation within the ComFB superfamily.
(A) Structural alignment of the dimeric forms of Bacillus subtilis ComFB (BsComFB, PDB: 4WAI, yellow and teal) and CdgR from Synechocystis sp. PCC 6803 (PDB: 8HJA, orange and red). The CdgR-bound c-di-GMP molecules are shown in stick mode with carbon atoms in blue. The c-di-GMP binding residues D53, N100, R101 and Y115 of CdgR are shown in stick mode with carbon atoms in green.
(B) Sequence alignment of representative members of the ComFB superfamily. Proteins are shown under their UniProt identifiers, and secondary structure assignments (H, α-helix; E, β-strand) of ComFB and CdgR are shown with their PDB codes. The numbers indicate the positions of the start and end of the alignment and the lengths of the gaps between the aligned blocks. Conserved negatively (D, E) and positively (K, R) charged residues are shown in red and blue, respectively; uncharged hydrophilic residues (N, Q, S, T) are in purple. Conserved hydrophobic residues are indicated with yellow shading, and conserved turn residues (G, P, S, A) are shaded green. Zinc-binding Cys residues of ComFB and the conserved Cys residues in other proteins are shown on a light blue background. The last sequence in the upper block represents the Pfam entry PF10719. The symbols in the “Function” line indicate (as specified by Zeng et al., 2023): d, residues responsible for protein dimerization; asterisks, residues involved in binding c-di-GMP; h and π, residues involved in hydrophobic interactions with the c-di-GMP ligand. The lower block represents ComFB-related sequences that are not recognized by the PF10719 sequence model; its top two lines display secondary structure predictions by AlphaFold (Varadi et al., 2022) and HHpred (Zimmermann et al., 2018). The last three lines show sequences of the N-terminal ComFB domains of four-domain diguanylate cyclases. The sequences in the upper block are from the following organisms: ComFB, Bacillus subtilis; Q8YS15, Nostoc sp. PCC 7120; P74113, Synechocystis sp. PCC 6803 (both, cyanobacteria); E4S4A5, Caldicellulosiruptor acetigenus; Q0AV46, Syntrophomonas wolfei; E8USF0, Thermoanaerobacter brockii (all three, Clostridia); D3P9N1, Deferribacter desulfuricans (Deferribacterota); H2J3K4, Marinitoga piezophila (Thermotogota); Q73MV1, Treponema denticola; H9UJ27, Spirochaeta africana; D5U3J8, Brachyspira murdochii (all three, Spirochaetota); Q21X05, Albidiferax ferrireducens; A2SG25, Methylibium petroleiphilum (both, Betaproteobacteria); Q9KM28, Vibrio cholerae; Q8EG00, Shewanella oneidensis (both, Gammaproteobacteria); Q728W3, Desulfovibrio vulgaris; Q313N7, Oleidesulfovibrio alaskensis (both, Thermodesulfobacteriota). All sequences in the lower block are from cyanobacteria.
(C) Cluster map of ComFB homologs. A set of 1,626 representative ComFB sequences (≤ 70% pairwise identity and ≥ 70% length coverage) was clustered using the CLANS tool (Frickey & Lupas, 2004) based on pairwise BLAST P-values. Dots represent individual sequences, colored according to their group. Line color intensity reflects sequence similarity, with darker lines indicating higher similarity. The analysis revealed four clusters: two within Cyanobacteriota, one comprising diverse phyla (e.g., Actinomycetota, Bacillota), and a distinct Pseudomonadota cluster, highlighting conserved c-di-GMP-binding residues across these diverse groups.
Figure 2. Genomic neighborhoods of selected ComFB/CdgR family proteins.
Genomic fragments are listed with the organism names, GenBank accession numbers, and genomic coordinates. Gene sizes are drawn approximately to scale, and gene names are from GenBank, RefSeq, and/or the COG database. ComFB genes are in red, other competence-related genes are in pink, flagella-related genes are in orange, pili-related genes are in green, signal transduction genes are in yellow, metabolic genes are in various shades of blue, and poorly characterized genes are in grey or white. The graph displays fragments of the following genomes: A, Bacillus subtilis 168, GenBank accession AL009126: 3,643,558..3,637,338; B, Synechocystis sp. PCC 6803, BA000022: 1,776,983..1,783,355; C, Thermoanaerobacter tengcongensis MB4, AE008691: 1,261,742..1,271,495; D, Allochromatium vinosum DSM 180, CP001896: 3,319,557..3,327,133; E, Desulfohalobium retbaense DSM 5692, CP001734: 792,659..799,553; F, Treponema denticola ATCC 35405, AE017226: 1,452,441..1,444,590; G, Vibrio cholerae O1 biovar El Tor str. N16961 chromosome II, AE003853: 495,055..503,153. The genomic fragments for B. subtilis and T. denticola are shown in reverse complement.
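The coordinate strings in this legend can be read programmatically. The following is a minimal Python sketch, not part of the original analysis: it assumes that a range written with the larger coordinate first marks a fragment drawn in reverse complement, which matches the two fragments (B. subtilis and T. denticola) that the legend describes that way. The names `Fragment` and `parse_fragment` are hypothetical helpers introduced only for illustration.

```python
from typing import NamedTuple


class Fragment(NamedTuple):
    start: int      # first coordinate as written in the legend
    end: int        # second coordinate as written in the legend
    reverse: bool   # True when the range is written high..low (assumed reverse complement)
    length: int     # fragment length in base pairs, inclusive of both ends


def parse_fragment(coords: str) -> Fragment:
    """Parse a range such as '3,643,558..3,637,338' from the figure legend."""
    left, right = (int(part.replace(",", "")) for part in coords.split(".."))
    return Fragment(left, right, left > right, abs(left - right) + 1)


if __name__ == "__main__":
    # Coordinates copied from the legend of Figure 2.
    legend = {
        "Bacillus subtilis 168": "3,643,558..3,637,338",
        "Synechocystis sp. PCC 6803": "1,776,983..1,783,355",
        "Treponema denticola ATCC 35405": "1,452,441..1,444,590",
    }
    for organism, coords in legend.items():
        frag = parse_fragment(coords)
        strand = "reverse complement" if frag.reverse else "forward"
        print(f"{organism}: {frag.length} bp, {strand}")
```

Under this reading, all seven fragments span roughly 6 to 10 kb, consistent with the statement that gene sizes are drawn approximately to scale.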
Figure 3. Structural gallery of representative ComFB domain-containing proteins from various species.
α-helices in the ComFB domain are colored red, β-strands are in yellow, and the remainder of the protein in grey. For proteins with two ComFB domains, one domain is shown in lighter shades. The structures are AlphaFold2 (Jumper et al., 2021) predictions from UniProt/AlphaFold DB (Varadi et al., 2024), except for Bacillus subtilis (PDB 4WAI). The species represented include Synechocystis sp. PCC 6803 (Slr1970, Sll1170, Slr1505, and Sll1739, UniProt accessions P74113, P74197, P73943, and P73385, respectively), Vibrio cholerae (Q9KM28), Fischerella muscicola (A0A2N6JYB2), Trichormus variabilis (Q3M730), Synechococcus sp. PCC 7502 (K9SSQ8), and Pseudanabaena sp. PCC 7367 (K9SGA4). Several of these proteins also contain additional features, such as N- or C-terminal extensions (e.g., Slr1505), coiled-coil segments (e.g., PspA), or other domains: Sll1739 and Slr1970 have uncharacterized α-helical bundle domains, and Sll1170 contains DUF1816, PAS, and GGDEF domains.
Figure 4. Isothermal titration calorimetry (ITC) analysis of c-di-GMP binding to phylogenetically different ComFB proteins.
Upper panels show the raw ITC data in the form of heat produced during the titration of c-di-GMP on different ComFB proteins; lower panels show the binding isotherms and the best-fit curves according to the one binding site model. (A-D) ITC analysis of c-di-GMP binding to B. subtilis or T. brockii ComFB proteins in the absence (A,C) or presence of 150 μM c-di-AMP (B,D). (E,F) ITC analysis of c-di-GMP binding to V. cholerae (E) or T. denticola (F) ComFB proteins.
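For readers unfamiliar with the one binding site model mentioned in this legend, the Python sketch below shows the standard mass-action relationship that such a fit rests on: for a 1:1 complex, the bound concentration follows from the total protein concentration, the total ligand concentration, and the dissociation constant by solving a quadratic. This is a generic textbook formulation, not the authors' fitting code, and the concentrations and K_d used in the demo are hypothetical values chosen only for illustration.

```python
import math


def bound_complex(p_total: float, l_total: float, kd: float) -> float:
    """Concentration of the 1:1 protein-ligand complex.

    Solves Kd = (P_total - PL) * (L_total - PL) / PL for PL.
    All arguments must be in the same concentration unit.
    """
    s = p_total + l_total + kd
    # The smaller root of the quadratic is the physically meaningful one.
    return (s - math.sqrt(s * s - 4.0 * p_total * l_total)) / 2.0


def fraction_bound(p_total: float, l_total: float, kd: float) -> float:
    """Fraction of protein binding sites occupied by ligand."""
    return bound_complex(p_total, l_total, kd) / p_total


if __name__ == "__main__":
    # Hypothetical values, NOT taken from the paper.
    protein_um = 20.0   # total protein in the cell, in µM
    kd_um = 1.0         # assumed dissociation constant, in µM
    for ligand_um in (5.0, 10.0, 20.0, 40.0, 80.0):
        occupancy = fraction_bound(protein_um, ligand_um, kd_um)
        print(f"ligand {ligand_um:5.1f} µM -> fraction bound {occupancy:.3f}")
```

In an ITC fit, the heat of each injection is taken as proportional to the change in the bound concentration, so the shape of the isotherm in the lower panels reflects how sharply this occupancy rises as ligand is added.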
Figure 5. Isothermal titration calorimetry (ITC) analysis of c-di-AMP binding to phylogenetically different ComFB proteins.
Upper panels show the raw ITC data in the form of heat produced during the titration of c-di-AMP on different ComFB proteins; lower panels show the binding isotherms and the best-fit curves according to the one binding site model. (A-D) ITC analysis of c-di-AMP binding to B. subtilis or T. brockii ComFB proteins in the absence (A,C) or presence of 150 μM c-di-GMP (B,D). (E,F) ITC analysis of c-di-AMP binding to V. cholerae (E) or T. denticola (F) ComFB proteins in the presence of 150 μM c-di-GMP.
Figure 6. ComFB inhibits swimming.
The swimming assay was conducted as described in Methods by inoculating cells into 0.3% agar LB plates. Plasmid vectors carrying comFB under the control of a constitutive promoter or the same vector without comFB (empty vector) were inserted separately at amyE in wild type and ΔpdeH backgrounds. The image was acquired after 20 hours of growth at 30 °C in a humidified chamber, followed by a further 5 hours at 37 °C.
