Nat Biomed Eng. 2024 Jul;8(7):842-853.
doi: 10.1038/s41551-024-01243-1. Epub 2024 Jul 31.

Deep mutational scanning and machine learning for the analysis of antimicrobial-peptide features driving membrane selectivity

Justin R Randall et al. Nat Biomed Eng. 2024 Jul.

Abstract

Many antimicrobial peptides directly disrupt bacterial membranes yet can also damage mammalian membranes. It is therefore central to their therapeutic use that rules governing the membrane selectivity of antimicrobial peptides be deciphered. However, this is difficult even for short peptides owing to the large combinatorial space of amino acid sequences. Here we describe a method for measuring the loss or maintenance of antimicrobial-peptide activity for thousands of peptide-sequence variants simultaneously, and its application to Protegrin-1, a potent yet toxic antimicrobial peptide, to determine the positional importance and flexibility of residues across its sequence while identifying variants with changes in membrane selectivity. More bacterially selective variants maintained a membrane-bound secondary structure while avoiding aromatic residues and cysteine pairs. A machine-learning model trained with our datasets accurately predicted membrane-specific activities for over 5.7 million Protegrin-1 variants, and identified one variant that showed substantially reduced toxicity and retention of activity in a mouse model of intraperitoneal infection. The high-throughput methodology may help elucidate sequence-structure-function relationships in antimicrobial peptides and inform the design of peptide-based synthetic drugs.

Conflict of interest statement

Competing interests

The authors declare no competing interests.

Figures

Extended Data Fig. 1. Protegrin-1 dmSLAY library diversity and sequencing analysis.
a, An alanine scan performed on the native Protegrin-1 (PG-1.0) amino acid sequence showing antibacterial activity (MIC) in μg/ml. Reported MIC is the median of triplicate reactions. b, Chart of the sequence variance found within the Protegrin-1 dmSLAY library. The native Protegrin-1 sequence is shown at the top with mutations observed at each location within the library below. Amino acids are color coded by side chain similarity. Brackets represent where disulfide bonds are present. c, Principal Component Analysis of the overall induced and uninduced triplicate sample variance. d, ROC curves for different MIC cut-offs separating active from inactive variants, evaluated across a range of log2-fold change (L2FC) cut-offs.
Extended Data Fig. 2. Selectivity of serine and histidine containing PG-1 variants.
a, Scatter plot of the log2-fold change in MIC versus %Hemolysis for dmSLAY active PG-1 variants from Fig. 3. Dotted line represents PG-1.0 selectivity score. b, Table showing the biochemical characteristics of serine and histidine containing Protegrin-1 variants from dmSLAY. MIC is the median of triplicate assays and %Hemolysis is the mean of triplicate assays. c, Bar chart showing the selectivity score of serine and histidine containing variants on a log2 scale. Residue changes are shown below. Brackets show where disulfide bonds are formed in the native structure.
Extended Data Fig. 3. Comparing Protegrin-1 variant activity in mixed cultures.
a–d, Graphs of PG-1 (top left), PG-1.1 (bottom left), PG-1.20 (top right), and PG-1.37 (bottom right) percentage of bacterial killing (green) and % hemolysis (purple) with 1 × 10⁹ red blood cells (RBC), 1 × 10⁶ E. coli W3110 cells (Bacteria) or both at various concentrations shown on a log2 scale. Each data point is the mean of triplicate reactions and error bars are one standard deviation.
Extended Data Fig. 4. Training of machine learning models and specific attribute mutational profiles.
a, Precision and recall for predicting PG-1 variants with an MIC > or < 8 μg/ml. b, Predicted versus true hemolysis for training and test data. c, Predicted versus true log10-selectivity score. All models were trained on 80% of the data and validated with 20%. Bottom panels: Mutational profiles of variants from 5.7 million candidates with a predicted MIC ≤ 8 μg/ml (a), % hemolysis ≤ 2 (b) or log10-selectivity score ≤ 0.5 (c).
Fig. 1 | Protegrin-1 deep mutational SLAY predicts residue importance and flexibility.
a, Surface localized antimicrobial display expresses an OmpA fragment tethering the PG-1 library to the outer membrane (OM). Induction of display results in cell death and lack of propagation for variants maintaining antimicrobial activity (red-active, blue-inactive). The change in reads between induced and uninduced cultures can predict antimicrobial potential. b, Scatter plot of variant mean induced versus uninduced reads (n = 3) with a significant log2-fold change (p < 0.5) for the entire PG-1 library on log scales. The 1:1 and native PG-1 sequence (PG-1.0) read ratios are shown. c, Scatter plot of median MIC (n = 3) for a subset of 52 variants in the PG-1 library versus the log2-fold change in reads. Optimized cut-offs for binning active and inactive variants are shown as dotted lines. d, Table charting dmSLAY predictions for all single PG-1 variants in the library. The native PG-1 sequence is in column one, with the amino acid change at each position going across the top, categorized by side chain. PG-1 secondary structure and disulfide bonds are diagrammed to the left. Position is colour-coded by log2-fold change. Bold boxed cells were evaluated in vitro. An X marks an incorrect dmSLAY prediction.
Fig. 2 | Changes in secondary structure correlate with Protegrin-1 lytic activity.
a–d, Circular dichroism spectra of select PG-1 variants from the dmSLAY scan colour-coded by minimum inhibitory concentration (a and b) or percent haemolysis (c and d). The native PG-1 sequence (PG-1.0) spectrum is shown as a dotted line. Arrows indicate different trends in cell membrane lysis for PG-1.37 (a, c) or tryptophan-containing variants with a maximum near 230 nm. All spectra are the average of technical triplicate.
Fig. 3 | Membrane selectivity is influenced by aromatics and loss of cysteine pairs.
a, Bar graph comparing selectivity score (median minimum inhibitory concentration * average percent haemolysis) for PG-1 variants confirmed to be antibacterial on a log scale. b, Circular dichroism spectra of the same PG-1 variants colour-coded by selectivity score. Spectra are the average of technical triplicate. c, Scatter plot of PG-1 variant aggregation in relative fluorescent units (RFUs) on a log scale versus linear percent haemolysis. Each data point is the mean of triplicate reactions; the trendline represents a linear fit of log-transformed aggregation data versus haemolysis with R² = 0.63. d, Bar chart of selectivity score for a subset of mutations found in the four most selective PG-1 variants on a log scale. Brackets indicate where disulfide bonds are present in the native PG-1 sequence.
Fig. 4 | Protegrin-1 variants demonstrate strong specificity for bacterial membranes.
a, Bar chart of propidium iodide (PI) uptake measured in relative fluorescent units (RFUs) for E. coli cells treated with the indicated concentration of each PG-1 variant. Bars and dots represent the mean of biological triplicate with error bars being one standard deviation. b, Kill curve showing colony forming units (CFUs) present over time for cultures treated with 8 μg/ml peptide. c, Percentage of bacterial killing and haemolysis observed in co-culture of E. coli and human red blood cells treated with increasing concentrations of PG-1 variants after one hour of treatment. Data points in b and c are the means of biological triplicate. Error bars represent one standard deviation. Additional plots from individual and co-culture for each PG-1 variant are included in Supplementary Fig. 4.
Fig. 5 | Machine learning identifies mutational profiles promoting membrane specificity.
a and c, Venn diagram showing the three machine learning models trained on PG-1 variant data and the cut off used to identify the 95,472 most bacterially specific (a) or 51,876 most mammalian specific (c) from over 5.7 million variants encompassing all possible one, two and three residue mutations. Mutational heat maps charting the number of times each mutation is observed within the bacterially selective group (b) or mammalian selective group (d). The native PG-1 sequence is shown going down the far-left column and specific residue change across the top categorized by side chain. Native PG-1 secondary structure is diagrammed to the left.
Fig. 6 | Analysis of machine learning performance.
a, Table comparing characteristics of bacterially selective machine learning (bsML) and dmSLAY active PG-1 variants. b, Bar chart of SYTOX Green uptake (membrane lysis) of kidney cells (HEK293) treated with AMPs at three time points, ON = overnight. Bars and dots represent the mean of biological triplicate and error bars one standard deviation. c, Table showing the activity (MIC) and toxicity (MTD) of naturally occurring Protegrin-1 (PG-1A) and bsPG-1.2 with and without C-terminal amidation. d, CFUs present in the organs of CD-1 female mice six hours post intraperitoneal infection by injection of 10⁷ A. baumannii AB5075 CFUs. Mice were immediately treated post infection by injection with either PBS (untreated, n = 5) or 25 mg/kg bsPG-1.2 (n = 6). The limit of detection is indicated by a dashed line. Significance (*) was determined using a multiple Mann-Whitney test.
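
Two quantities recur in the captions above: the log2-fold change in reads between induced and uninduced cultures (Fig. 1) and the selectivity score, given in the Fig. 3 caption as median MIC multiplied by average percent haemolysis. The Python sketch below illustrates that arithmetic only and is not the authors' analysis pipeline. The function names, the pseudocount and the example numbers are assumptions made for illustration.

```python
import math
from statistics import mean, median


def log2_fold_change(induced_reads, uninduced_reads, pseudocount=1.0):
    """Return log2(mean induced reads / mean uninduced reads).

    The pseudocount is an assumption added here so that a variant with
    zero reads does not cause a division by zero or log of zero.
    """
    induced = mean(induced_reads) + pseudocount
    uninduced = mean(uninduced_reads) + pseudocount
    return math.log2(induced / uninduced)


def selectivity_score(mic_replicates, haemolysis_replicates):
    """Return median MIC times mean percent haemolysis (Fig. 3a caption)."""
    return median(mic_replicates) * mean(haemolysis_replicates)


# Hypothetical triplicate read counts for one variant.
l2fc = log2_fold_change([7, 7, 7], [31, 31, 31])

# Hypothetical triplicate MIC (ug/ml) and percent haemolysis values.
score = selectivity_score([4, 8, 4], [10.0, 20.0, 30.0])

print(f"log2-fold change: {l2fc:.2f}")
print(f"selectivity score: {score:.1f}")
```

A negative log2-fold change means fewer reads after induction, which the Fig. 1 caption associates with variants that keep antimicrobial activity. Because the score is a product, a variant with both a low MIC and low haemolysis gets a low score.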
