Nat Chem Biol. 2017 May;13(5):470-478.
doi: 10.1038/nchembio.2319. Epub 2017 Feb 28.

A new genome-mining tool redefines the lasso peptide biosynthetic landscape

Jonathan I Tietz et al. Nat Chem Biol. 2017 May.

Abstract

Ribosomally synthesized and post-translationally modified peptide (RiPP) natural products are attractive for genome-driven discovery and re-engineering, but limitations in bioinformatic methods and exponentially increasing genomic data make large-scale mining of RiPP data difficult. We report RODEO (Rapid ORF Description and Evaluation Online), which combines hidden-Markov-model-based analysis, heuristic scoring, and machine learning to identify biosynthetic gene clusters and predict RiPP precursor peptides. We initially focused on lasso peptides, which display intriguing physicochemical properties and bioactivities, but their hypervariability renders them challenging prospects for automated mining. Our approach yielded the most comprehensive mapping to date of lasso peptide space, revealing >1,300 compounds. We characterized the structures and bioactivities of six lasso peptides, prioritized based on predicted structural novelty, including one with an unprecedented handcuff-like topology and another with a citrulline modification exceptionally rare among bacteria. These combined insights significantly expand the knowledge of lasso peptides and, more broadly, provide a framework for future genome-mining efforts.

Figures

Figure 1. Ribosomal natural product (RiPP) biosynthesis and overview of RODEO
(a) General overview of RiPP biosynthesis. (b) Overview of lasso peptide biosynthesis by leader peptidase, lasso cyclase, and RRE. (c) RODEO workflow and output. (d) Comparison of scoring accuracy on a randomly selected training set using heuristic scoring only or scoring with combined motif analysis (MEME) and machine learning (SVM). Sensitivity is represented by recall; specificity is represented by precision. (e) Comparative scoring distribution on the final peptide dataset using either heuristics only or scoring with MEME and SVM integrated. Scoring distributions were practically indistinguishable between training and full data sets (Supplementary Fig. 2). Currently, RODEO is optimized to score potential lasso peptides.
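Panel (d) reports scoring accuracy as precision and recall. As a minimal sketch of how those two metrics are computed for a binary "is this a lasso precursor peptide?" call, the snippet below uses made-up labels; the function name and the example data are illustrative assumptions and are not taken from RODEO.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two equal-length lists of booleans."""
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    false_neg = sum(1 for p, a in pairs if not p and a)
    # Guard against division by zero when nothing is predicted or nothing is positive.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


# Hypothetical calls for five candidate peptides (not real data).
predicted = [True, True, False, False, False]
actual = [True, True, True, True, False]
precision, recall = precision_recall(predicted, actual)
print(precision, recall)
```

In this toy case every positive call is correct (precision 1.0), but only half of the true precursors are recovered (recall 0.5).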
Figure 2. Phylogenetic map of all identified lasso peptides
A sequence similarity network (SSN) of precursor peptides is shown with an E-value cutoff of 10^-8. Background shading indicates phylum. Node shading indicates whether the natural product (NP) has been isolated or detected in culture (including this study). Co-occurrence of conserved genes in the local genomic region for the peptides above is indicated in Supplementary Data Sets 1 and 3.
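The role of the E-value cutoff in such a network can be sketched as follows: two sequences are joined by an edge only when their pairwise E-value is at or below the cutoff, and the resulting connected components form the visible clusters. This is a generic illustration, not the authors' pipeline; the function names, the input tuple layout, and the example hits are assumptions.

```python
def build_ssn(pairwise_hits, evalue_cutoff=1e-8):
    """Build an undirected graph from (query, subject, evalue) tuples."""
    graph = {}
    for query, subject, evalue in pairwise_hits:
        # Register both sequences as nodes even if no edge passes the cutoff.
        graph.setdefault(query, set())
        graph.setdefault(subject, set())
        if query != subject and evalue <= evalue_cutoff:
            graph[query].add(subject)
            graph[subject].add(query)
    return graph


def connected_components(graph):
    """Group nodes into clusters with an iterative depth-first search."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical pairwise hits (identifiers and E-values are invented).
hits = [("pepA", "pepB", 1e-20), ("pepB", "pepC", 1e-5), ("pepC", "pepD", 1e-9)]
clusters = connected_components(build_ssn(hits, evalue_cutoff=1e-8))
print(clusters)
```

A stricter cutoff (a smaller E-value) removes weaker edges and splits the network into more, tighter clusters.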
Figure 3. Lasso peptide sequence analysis
(a) Lasso peptide structural features and common residues mined from RODEO analysis. (b) Organization of predicted lasso peptides into families. Roughly half of the identified BGCs had identical or closely related clusters in other organisms. (c) Sequence logos for lasso peptide families with >20 members show wide variance in sequence composition and degree of conservation. Numbers in parentheses refer to number of instances found per family. Residues are color-coded according to basic (green), acidic (purple), or hydrophobic (orange) character.
Figure 4. SSNs of lasso cyclase protein
The network was visualized with an edge E-value cutoff of 10^-80, with nodes colored by phylum. The locations within the network of known lasso peptides (including this study) are indicated.
Figure 5. Lasso peptides discovered via RODEO-based prioritization
(a) Five gene cluster diagrams for the six investigated lasso peptides. (b) Precursor sequences and predicted modified sites from BGCs in (a). (c) NOE-based NMR ensemble structure and schematic diagram of LP2006 showing the looped-handcuff topology. (d) Comparison of LP2006 (class IV) to previous lasso topologies.
Figure 6. Citrulassin, a rare example of bacterial PAD activity
(a) The two-dimensional structure of citrulassin A with Cit9 highlighted. (b) Two-dimensional structure of heterologously expressed des-citrulassin A with Arg9 highlighted. (c) Gene cluster diagrams are shown for two fosmids (3H4 and 1F3) from S. albulus NRRL B-3066 expressed in S. lividans. (d) Production of heterologous des-citrulassin A as indicated by MALDI-TOF MS.
