2016 Nov 29;113(48):13588-13593.
doi: 10.1073/pnas.1609893113. Epub 2016 Nov 14.

Mapping membrane activity in undiscovered peptide sequence space using machine learning

Ernest Y Lee et al. Proc Natl Acad Sci U S A.

Abstract

There are some ∼1,100 known antimicrobial peptides (AMPs), which permeabilize microbial membranes but have diverse sequences. Here, we develop a support vector machine (SVM)-based classifier to investigate α-helical AMPs and the interrelated nature of their functional commonality and sequence homology. The SVM is used to search the undiscovered peptide sequence space and identify Pareto-optimal candidates that simultaneously maximize the distance σ from the SVM hyperplane (and thus their "antimicrobialness") and their α-helicity, while minimizing mutational distance to known AMPs. By calibrating SVM machine learning results with killing assays and small-angle X-ray scattering (SAXS), we find that the SVM metric σ correlates not with a peptide's minimum inhibitory concentration (MIC), but rather with its ability to generate negative Gaussian membrane curvature. This surprising result provides a topological basis for membrane activity common to AMPs. Moreover, we highlight an important distinction between the maximal recognizability of a sequence to a trained AMP classifier (its ability to generate membrane curvature) and its maximal antimicrobial efficacy. As mutational distances are increased from known AMPs, we find AMP-like sequences that are increasingly difficult for nature to discover via simple mutation. Using the sequence map as a discovery tool, we find an unexpectedly diverse taxonomy of sequences that are just as membrane-active as known AMPs, but with a broad range of primary functions distinct from AMP functions, including endogenous neuropeptides, viral fusion proteins, topogenic peptides, and amyloids. The SVM classifier is useful as a general detector of membrane activity in peptide sequences.

Keywords: antimicrobial peptides; cell-penetrating peptides; machine learning; membrane curvature; membrane permeation.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
SVM learning and Pareto-optimization select for antimicrobial and membrane curvature-generating peptides. (A) Schematic depicting the use of an SVM binary classifier to partition hypothetical antimicrobial peptide sequences (blue circles) described by the n = 2 descriptors {ϕij} from nonantimicrobial sequences (red circles) using an (n − 1)-dimensional maximum-margin linear hyperplane. The support vectors are the sequences lying on the margins. The separating hyperplane lies midway between the margins. The metric σ (green arrows) indicates the distance to hyperplane for each peptide. Positive distances denote antimicrobial sequences whereas negative distances denote nonantimicrobial sequences. (B) Schematic demonstrating separation of Pareto-optimal sequences (green circles) from dominated sequences (gray circles) in an arbitrary 3D subspace of descriptors. The Pareto frontier is the hypersurface containing the Pareto-optimal sequences. (C) Common biologically relevant manifestations of negative Gaussian curvature generation in cell membranes, including (C, 1) blebbing, (C, 2) pore formation, and (C, 3) scission and budding.
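The metric σ in panel A can be illustrated with a minimal sketch, assuming a linear hyperplane w·x + b = 0 and the normalized signed distance (w·x + b)/‖w‖. The weights, bias, peptide names, and descriptor values below are hypothetical, and the paper's σ may be scaled differently (for example, as the unnormalized decision value).

```python
import math

def signed_distance(w, b, x):
    """Signed distance of descriptor vector x from the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

# Hypothetical n = 2 descriptor hyperplane and peptides (illustrative values only).
w, b = [3.0, 4.0], -5.0
peptides = {"pep_a": [3.0, 4.0], "pep_b": [0.0, 0.0]}

# Positive sigma is read as antimicrobial, negative as nonantimicrobial.
sigma = {name: signed_distance(w, b, x) for name, x in peptides.items()}
labels = {name: ("antimicrobial" if s > 0 else "nonantimicrobial")
          for name, s in sigma.items()}
```

Here `pep_a` falls on the positive side of the hyperplane and `pep_b` on the negative side, mirroring the blue and red circles in the schematic.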
Fig. 2.
Sequence atlas and Pareto frontier constructed by directed sampling of sequence space. Embedding of the 242,110 peptides generated by our directed sequence space search into the 3D space spanned by (i) predicted helical content, (ii) Jukes–Cantor distance to known AMPs, and (iii) distance to hyperplane (σ). Sequences with σ > 0 are predicted by the classifier to be antimicrobial or membrane-active, whereas those with σ < 0 are not. The more positive σ becomes, the higher the probability P(+1) of being antimicrobial. The orange diamonds pick out the 85 peptides lying on the physicochemically unrestricted Pareto frontier, in which we place no restriction on the value of the descriptors generated for these candidates. Green diamonds highlight the 13 peptides on the physicochemically restricted Pareto frontier, in which the descriptors are restricted to lie no more than 10% outside the range observed in the training data. Red stars are the 16 peptides proximate to the frontiers that were selected for testing.
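The notion of a Pareto frontier over these three axes can be sketched as follows, assuming the goal is to maximize σ and helical content while minimizing Jukes–Cantor distance. The function names and candidate values are hypothetical, and this brute-force filter is only an illustration, not the authors' search procedure.

```python
def objectives(sigma, helicity, jc_distance):
    """Express all three goals as 'larger is better' by negating the distance."""
    return (sigma, helicity, -jc_distance)

def dominates(p, q):
    """True if p is at least as good as q everywhere and strictly better somewhere."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))

def pareto_front(points):
    """Keep every point that no other point dominates (O(n^2) comparison)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical candidates: (sigma, predicted helicity, Jukes-Cantor distance).
candidates = [
    objectives(1.2, 0.90, 0.5),
    objectives(0.8, 0.70, 0.6),  # worse than the first candidate on every axis
    objectives(1.5, 0.60, 0.9),
    objectives(0.4, 0.95, 0.2),
]
front = pareto_front(candidates)
```

The second candidate is dominated and drops out, while the other three each win on at least one axis and remain on the frontier. For a set of several hundred thousand sequences, a sort-based or incremental algorithm would be preferable to this quadratic scan.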
Fig. 3.
Synthesized test peptides from directed Monte Carlo search and SVM screening generate negative Gaussian curvature in model membranes. (A) Representative SAXS data of four test peptides indicate ability to generate negative Gaussian curvature in model bacterial membranes. Peaks with cubic symmetry are labeled according to their x coordinates √(h² + k² + l²) in B. Unlabeled peaks correspond to coexisting lamellar and/or hexagonal phases induced by peptides. (Inset) Local topology of saddle-splay curvature. (B) Linear fits indicating the q positions of the Bragg peaks with cubic symmetry, their respective Miller indices (hkl), their respective space groups, and resulting lattice parameter a. (C) Contour surface representation of the Pn3m space group. (D) Contour surface representation of the Im3m space group.
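The linear fits in panel B rest on the standard relation for a cubic lattice, q = (2π/a)·√(h² + k² + l²), so the slope of q against √(h² + k² + l²) gives the lattice parameter a. The sketch below assumes a least-squares fit constrained through the origin, which may differ from the fitting procedure actually used; the peak positions are synthetic and the function name is hypothetical.

```python
import math

def lattice_parameter(q_values, hkl_list):
    """Fit q = (2*pi/a) * sqrt(h^2 + k^2 + l^2) through the origin and return a."""
    x = [math.sqrt(h * h + k * k + l * l) for h, k, l in hkl_list]
    slope = sum(xi * qi for xi, qi in zip(x, q_values)) / sum(xi * xi for xi in x)
    return 2 * math.pi / slope

# First four reflections allowed for Pn3m: ratios sqrt(2):sqrt(3):sqrt(4):sqrt(6).
hkl = [(1, 1, 0), (1, 1, 1), (2, 0, 0), (2, 1, 1)]

# Synthetic peak positions for an assumed lattice parameter (arbitrary length units).
a_true = 20.0
q = [2 * math.pi / a_true * math.sqrt(h * h + k * k + l * l) for h, k, l in hkl]

a_fit = lattice_parameter(q, hkl)
```

With noise-free synthetic peaks the fit recovers the assumed lattice parameter; with measured data, the quality of the linear fit is what supports the assignment of a given space group.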
Fig. 4.
Distance to hyperplane of test peptides correlates with strength of negative Gaussian curvature (NGC). There is no significant correlation between the magnitude of NGC generation and homology of test peptides (n = 16) to known membrane-active peptides (A, Spearman R = 0.155 [−0.425, 0.736], P = 0.155), but there is a statistically significant positive correlation between the magnitude of NGC generation and distance to hyperplane σ (B, Spearman R = 0.653 [0.234, 0.891], P = 0.006), as well as the probability of being antimicrobial (C, Spearman R = 0.653 [0.231, 0.896], P = 0.006). This validates the use of σ as a proxy for optimization of curvature generation as opposed to antimicrobial efficacy (SI Appendix, Fig. S4).
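The Spearman coefficient reported here is the Pearson correlation of the ranks of the two variables. A minimal sketch, assuming average ranks for ties, is shown below; the σ and curvature values are invented for illustration and are not the measured data.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical sigma values and NGC magnitudes for five peptides.
sigma = [0.2, 0.5, 0.9, 1.4, 1.1]
curvature = [0.01, 0.03, 0.02, 0.08, 0.05]
rho = spearman(sigma, curvature)
```

Because only ranks enter the calculation, the coefficient measures whether curvature generation increases monotonically with σ without assuming a linear relationship. Confidence intervals and P values like those in the caption would need an additional step (for example, bootstrapping or a permutation test) that this sketch omits.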
Fig. 5.
Directed search of the sequence space discovers diverse families of membrane curvature-generating peptides. We visualize the 2D projection of the 242,110 candidate peptides generated by directed sampling of sequence space (Fig. 2) into distance-to-hyperplane σ and Jukes–Cantor distance to known AMPs, supplemented by the 31 sequences belonging to diverse peptide families listed in SI Appendix, Table S7. To guide the interpretation of the discovered membrane-active sequences, we highlight the physicochemically restricted (13 peptides, green diamonds) and unrestricted (85 peptides, orange diamonds) Pareto frontiers. For reference, the peptides experimentally tested are also shown (16 peptides, red stars). Screening of a variety of protein families yields sequences with predicted σ > 0 near the physicochemically unrestricted Pareto frontier. These sequences span a variety of protein families, including neuropeptides (purple stars), calcitonin peptides (black stars), viral fusion proteins (yellow stars), membrane anchor proteins (light green stars), membrane-permeating protein fragments (blue stars), and topogenic peptides (pink stars). Some of the proteins have unexpected predicted membrane activity, whereas others have confirmed experimental evidence for membrane permeation. In fact, these other classes of peptides are expected to be just as membrane-active as AMPs. This diversity demonstrates the power of the SVM-directed search framework as a tool for discovery of new membrane-reorganizing protein sequences.
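The Jukes–Cantor distance on the horizontal axis can be sketched with the standard 20-state (amino acid) form, d = −(19/20)·ln(1 − (20/19)·p), where p is the fraction of mismatched positions. This assumes equal-length, pre-aligned sequences and takes the minimum over a reference set; the exact variant and alignment used in the paper are not specified here, and the sequences below are toy examples.

```python
import math

def jukes_cantor_protein(seq_a, seq_b):
    """Jukes-Cantor distance for two equal-length, pre-aligned peptide sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    p = sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)
    if p >= 19 / 20:
        return math.inf  # correction diverges at saturation
    return -(19 / 20) * math.log(1 - (20 / 19) * p)

def distance_to_known(candidate, known):
    """Distance from a candidate to its nearest sequence in a reference set."""
    return min(jukes_cantor_protein(candidate, k) for k in known)

# Toy reference set and candidate (hypothetical sequences, one mismatch in ten).
known = ["KLAKLAKKLA", "GIGKFLHSAK"]
candidate = "KLAKLAKKLG"
d_identical = jukes_cantor_protein("KLAKLAKKLA", "KLAKLAKKLA")
d_candidate = distance_to_known(candidate, known)
```

The correction makes the distance slightly larger than the raw mismatch fraction, reflecting the chance that a site has mutated more than once; this is why large distances correspond to sequences that are hard to reach by simple mutation.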
