Appl Environ Microbiol. 2025 Jun 18;91(6):e0055725.
doi: 10.1128/aem.00557-25. Epub 2025 May 30.

Genome-wide investigation of outer membrane protein families under mosaic evolution in Escherichia coli

Xin Cao et al. Appl Environ Microbiol. 2025.

Abstract

Several genes in Gram-negative bacteria that encode outer membrane proteins (OMPs) have been reported to show patterns of mosaic evolution, characterized by a mixture of negative selection and local recombination. Here, we improved a screening strategy and applied it to identify OMPs under mosaic evolution in Escherichia coli at the genome level. In total, 21 OMP families, including 16 new ones, were detected with the typical patterns of mosaic evolution. The vast majority of these protein families are conserved in E. coli in composition, genomic locus, and overall structure. Highly variable regions (HVRs) can be recognized and are frequently located extracellularly within the protruding loops. There are only a limited number of major HVR sequence types, within which positively selected sites can occasionally be detected. Based on predictions from multiple models, the OMPs under mosaic evolution often show good antigenicity, and HVRs of various sequence types coincide with the B-cell epitopes of the strongest immunogenicity. The study broadens our understanding of the characteristics of mosaic evolution and the functions of OMPs in Gram-negative bacteria, laying an important foundation for their potential translational applications.

IMPORTANCE: Understanding the evolutionary mechanisms of bacterial OMP-encoding genes would facilitate the development of antibacterial reagents. This study performed the first genome-wide screening of bacterial OMPs under mosaic evolution and increased the list of candidate OMP families in E. coli by threefold, far more than we expected. The study further confirmed the hypotheses about the evolutionary, micro-evolutionary, and structural features of these OMPs and advanced the functional theory of mosaic evolution. Moreover, the findings of limited HVR sequence types and strong immunogenicity of HVRs lay an important foundation for applying these OMPs and their HVRs in the development of antibodies or other antibacterial treatments.

Keywords: Escherichia coli; highly variable region (HVR); mosaic evolution; outer membrane protein (OMP); positive selection.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig 1
Screening of E. coli OMPs under mosaic evolution. (A) The OMPs and candidates under mosaic evolution detected at each step. (B) Genetic exchange test and phylogenetic clustering for the FadL protein family. The genetic distance among E. coli strains within each lineage was compared to that among strains from different lineages. Mann-Whitney U tests were performed, and P-values larger than 0.05 are highlighted in red. Significant P-values for which the intra-lineage distance was the larger one are shown in blue. A-A, B1-B1, B2-B2, D1-D1, D2-D2, or E-E represent strain pairs from the same lineage: A, B1, B2, D1, D2, or E, respectively. “x-y” represents pairs of strains from different lineages, e.g., A-B1, A-B2, D1-D2, etc. The P-values for the core-proteome (CP_p) are shown at the bottom. FadL proteins were clustered with a neighbor-joining tree. The robustness of the phylogenetic tree was examined by bootstrapping tests with 1,000 replicates, and the scores are indicated for nodes with support of at least 50%. The strains from the same E. coli, Shigella, or E. fergusonii lineage are labeled with a unique colored sign. The diagram of the core-proteome (CP) tree, based on (24), is also shown. (C) Genetic exchange tests and phylogenetic clustering for other representative OMP families. The different phylogenetic clusters are shown in rectangles with a gray background. (D) Distribution of three indices reflecting the local variation property of the FadL protein family in E. coli. HVR analysis was performed with the window width and step size set to 5 and 1, respectively. The significant highly variable sites (HVS) are shown in red circles.
Fig 2
Conservation of E. coli OMPs under mosaic evolution. (A) Distribution of OMPs under mosaic evolution in E. coli strains. The absence of a gene in a strain was shown in white, while its presence was shown in different colors for different lineages. Genes were grouped according to their main functions, including beta-barrel proteins, flagella and fimbriae components, adhesins and membrane-attaching antigens, and other transmembrane functional factors. (B) Collinearity of the fadL gene and the flanking nucleotide sequences (25 kb on each side) between E. coli strains from different lineages. (C) The tertiary structures of FadL proteins of representative E. coli strains from different sequence clusters. N-terminal residues 1–25 of each FadL protein, which were predicted by SignalP 6.0 to be the signal peptide, were removed. The sequence types of the FadL HVRs were indicated under each structure. The detailed sequences of the HVRs were annotated in Supplemental Data set S1.
Fig 3
Transmembrane topology and tertiary structure of the E. coli OMPs under mosaic evolution. (A) Transmembrane topology diagrams and the location of HVRs for the 21 E. coli OMPs under mosaic evolution. The signal peptides were predicted with SignalP 6.0 and shown in green. The transmembrane topology was predicted with DeepTMHMM for the mature proteins with signal peptides removed, and the extracellular segments were shown in blue. The HVRs were identified by HVR analysis and shown in red. The DeepTMHMM prediction results were corrected with the tertiary structure that was experimentally resolved or predicted. Adhesin segments that were annotated manually from the literature but not predicted to be extracellular by DeepTMHMM were also corrected and shown in purple. The start and end positions indicated for each segment were based on the full-length proteins with signal peptides. (B–C) The tertiary structure of FadL (B) and OmpN (C), and the location of HVRs. The signal peptides were removed before the structure modeling and illustration. The E. coli MG1655 proteins were used for all the analyses and structure illustrations.
Fig 4
Distribution of HVR sequence types for each OMP under mosaic evolution in E. coli strains. (A) Distribution of HVR sequence types for the 21 E. coli OMPs under mosaic evolution in 3,104 independent E. coli strains. Proportions of newly defined HVR types were shown in red, while the known ones (from the 40 training strains) were shown in blue. (B) Distribution of LamB HVR sequence types and subtypes in the independent E. coli genome set. (C) Distribution of FadL HVR sequence types and subtypes in the independent E. coli genome set. The sequences for the new HVR types and HVR subtypes were shown in red and blue, respectively. The positions were based on the real coordinates of the HVRs within the full-length proteins with signal peptides of E. coli MG1655.
Fig 5
Selection pressure of FadL HVRs in E. coli. (A–E) Omega distribution along the sites of each major HVR sequence type of FadL in E. coli. The sites of significant positive selection were shown in red. (F) Summary of the positively selected sites detected in E. coli FadL HVRs. PAML was used for the selection analysis.
Fig 6
Immunogenicity of E. coli LamB and OmpA HVRs. (A) Antigenicity of LamB of different sequence clusters. LamB, LamB_noHVR, and LamB_len_ctrl represented the wild-type mature LamB with the signal peptides removed (residues 1–25 of the original full-length proteins), HVR-removed mature LamB, and mature LamB with non-HVR fragments removed whose total length was identical to that of the HVRs (residues 26–44 of the original full-length proteins), respectively. (B) The scores and ranks of ABCpred-predicted B-cell epitopes in LamB HVRs. (C) The JBFB-predicted scores of the linear B-cell epitopes in LamB proteins from different sequence clusters. (D) Structural B-cell epitopes in LamB HVRs predicted by DiscoTope-2.0 and CBTOPE. (E) Antigenicity of OmpA of different sequence clusters. OmpA and OmpA_noHVR represented the wild-type mature OmpA (with the signal peptides removed) and HVR-removed mature OmpA, respectively. (F) The scores and ranks of ABCpred-predicted B-cell epitopes in OmpA HVRs. (G) The JBFB-predicted scores of the linear B-cell epitopes in OmpA proteins from different sequence clusters. (H) Structural B-cell epitopes in OmpA HVRs predicted by DiscoTope-2.0 and CBTOPE. For (B), (C), (D), (F), and (H), the full-length proteins containing signal peptides were used for analysis and for numbering the HVR locations. For JBFB, the window size for the epitopes was set to 20 by default.
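
The Fig 1 caption mentions a sliding-window scan with a window width of 5 and a step size of 1. The three indices used by the authors are not specified on this page, so the following is only a minimal sketch of the general idea, assuming per-column Shannon entropy as a stand-in variability index. The function names and the toy alignment are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def sliding_window_variability(aligned_seqs, width=5, step=1):
    """Return (start, mean entropy) for each window along an alignment.

    Assumption: entropy stands in for the paper's unspecified indices.
    """
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to equal length")
    per_site = [
        column_entropy([seq[i] for seq in aligned_seqs]) for i in range(length)
    ]
    windows = []
    for start in range(0, length - width + 1, step):
        score = sum(per_site[start:start + width]) / width
        windows.append((start, score))
    return windows


# Toy alignment (made up): only columns 3 and 4 vary between sequences.
toy_alignment = [
    "MKTAAGLSVQ",
    "MKTAAGLSVQ",
    "MKTSDGLSVQ",
    "MKTTEGLSVQ",
]
scores = sliding_window_variability(toy_alignment)
for start, score in scores:
    print(start, round(score, 3))
```

Windows that overlap the variable columns score higher than fully conserved windows, which is the kind of signal a scan for highly variable regions looks for. A real analysis would also need a significance threshold, which this sketch omits.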

References

    1. Antimicrobial Resistance Collaborators. 2022. Global burden of bacterial antimicrobial resistance in 2019: a systematic analysis. Lancet 399:629–655.
    2. Zampaloni C, Mattei P, Bleicher K, Winther L, Thäte C, Bucher C, Adam J-M, Alanine A, Amrein KE, Baidin V, et al. 2024. A novel antibiotic class targeting the lipopolysaccharide transporter. Nature 625:566–571. doi: 10.1038/s41586-023-06873-0
    3. Pahil KS, Gilman MSA, Baidin V, Clairfeuille T, Mattei P, Bieniossek C, Dey F, Muri D, Baettig R, Lobritz M, Bradley K, Kruse AC, Kahne D. 2024. A new antibiotic traps lipopolysaccharide in its intermembrane transporter. Nature 625:572–577. doi: 10.1038/s41586-023-06799-7
    4. Wong F, Zheng EJ, Valeri JA, Donghia NM, Anahtar MN, Omori S, Li A, Cubillos-Ruiz A, Krishnan A, Jin W, Manson AL, Friedrichs J, Helbig R, Hajian B, Fiejtek DK, Wagner FF, Soutter HH, Earl AM, Stokes JM, Renner LD, Collins JJ. 2024. Discovery of a structural class of antibiotics with explainable deep learning. Nature 626:177–185. doi: 10.1038/s41586-023-06887-8
    5. Wu KJY, Tresco BIC, Ramkissoon A, Aleksandrova EV, Syroegin EA, See DNY, Liow P, Dittemore GA, Yu M, Testolin G, Mitcheltree MJ, Liu RY, Svetlov MS, Polikanov YS, Myers AG. 2024. An antibiotic preorganized for ribosomal binding overcomes antimicrobial resistance. Science 383:721–726. doi: 10.1126/science.adk8013
