mSystems. 2020 Sep 1;5(5):e00267-20.
doi: 10.1128/mSystems.00267-20.

RRE-Finder: a Genome-Mining Tool for Class-Independent RiPP Discovery

Alexander M Kloosterman et al. mSystems. 2020.

Abstract

Many ribosomally synthesized and posttranslationally modified peptide classes (RiPPs) are reliant on a domain called the RiPP recognition element (RRE). The RRE binds specifically to a precursor peptide and directs the posttranslational modification enzymes to their substrates. Given its prevalence across various types of RiPP biosynthetic gene clusters (BGCs), the RRE could theoretically be used as a bioinformatic handle to identify novel classes of RiPPs. In addition, due to the high affinity and specificity of most RRE-precursor peptide complexes, a thorough understanding of the RRE domain could be exploited for biotechnological applications. However, sequence divergence of RREs across RiPP classes has precluded automated identification based solely on sequence similarity. Here, we introduce RRE-Finder, a new tool for identifying RRE domains with high sensitivity. RRE-Finder can be used in precision mode to confidently identify RREs in a class-specific manner or in exploratory mode to assist in the discovery of novel RiPP classes. RRE-Finder operating in precision mode on the UniProtKB protein database retrieved ∼25,000 high-confidence RREs spanning all characterized RRE-dependent RiPP classes, as well as several yet-uncharacterized RiPP classes that require future experimental confirmation. Finally, RRE-Finder was used in precision mode to explore a possible evolutionary origin of the RRE domain. The results suggest RREs originated from a co-opted DNA-binding transcriptional regulator domain. Altogether, RRE-Finder provides a powerful new method to probe RiPP biosynthetic diversity and delivers a rich data set of RRE sequences that will provide a foundation for deeper biochemical studies into this intriguing and versatile protein domain.

IMPORTANCE Bioinformatics-powered discovery of novel ribosomal natural products (RiPPs) has historically been hindered by the lack of a common genetic feature across RiPP classes. Herein, we introduce RRE-Finder, a method for identifying RRE domains, which are present in a majority of prokaryotic RiPP biosynthetic gene clusters (BGCs). RRE-Finder identifies RRE domains 3,000 times faster than current methods, which rely on time-consuming secondary structure prediction. Depending on user goals, RRE-Finder can operate in precision mode to accurately identify RREs present in known RiPP classes or in exploratory mode to assist with novel RiPP discovery. Employing RRE-Finder on the UniProtKB database revealed several high-confidence RREs in novel RiPP-like clusters, suggesting that many new RiPP classes remain to be discovered.

Keywords: RRE; RiPPs; Web tool; bioinformatics; genome mining; natural products; secondary metabolism.

Figures

FIG 1
RRE-dependent RiPP biosynthesis. (A) RiPP BGCs encode one or more short precursor peptides; their genes often lie adjacent to those for the modifying enzymes, leader peptidases, and proteins for immunity and export (often ABC transporters). RRE domains are found as discrete polypeptides or fused to larger biosynthetic proteins. (B) Modifying proteins bind the leader region of the precursor peptide using RRE domains. Posttranslational modifications are then installed on the core region of the precursor peptide.
FIG 2
RRE-Finder employs two modes for RRE detection. Precision mode (top) uses a set of pHMMs to accurately predict RREs. These pHMMs are based on characterized RRE domains for individual RiPP classes, either from published data sets or from the MIBiG database. Exploratory mode uses a combination of pHMMs and a truncated HHpred pipeline (including secondary-structure prediction) to facilitate the identification of divergent RRE sequences (albeit with a higher false-positive rate).
FIG 3
MIBiG validation of RRE-Finder. Both modes were used to retrieve RRE-containing proteins in 242 RiPP BGCs (A and B) and 1,575 non-RiPP BGCs (C and D) from the MIBiG database. With increasing bit score stringency, the number of RREs detected decreased in both types of BGCs (A and C). At a bit score of 25, exploratory mode of RRE-Finder detected most of the RREs found by precision mode in RiPP BGCs (B), as well as several other RREs. However, the number of RREs detected in non-RiPP BGCs was lower for precision mode than exploratory mode (D).
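The comparison in panels B and D can be pictured as a set overlap between the proteins each mode reports at a given bit score cutoff. The sketch below is illustrative only: the function names, the (protein ID, bit score) hit format, and the toy data are assumptions and are not taken from RRE-Finder itself.

```python
def detected(hits, min_bit_score):
    """Return the IDs of proteins whose hit meets the bit score cutoff.

    `hits` is a hypothetical list of (protein_id, bit_score) tuples.
    """
    return {protein_id for protein_id, score in hits if score >= min_bit_score}


def compare_modes(precision_hits, exploratory_hits, min_bit_score=25.0):
    """Split detections into shared and mode-specific sets at one cutoff."""
    precision = detected(precision_hits, min_bit_score)
    exploratory = detected(exploratory_hits, min_bit_score)
    return {
        "shared": precision & exploratory,
        "precision_only": precision - exploratory,
        "exploratory_only": exploratory - precision,
    }


# Toy data (invented for illustration, not from the paper).
precision_hits = [("protA", 40.1), ("protB", 26.0), ("protC", 12.3)]
exploratory_hits = [("protA", 55.0), ("protB", 30.2), ("protD", 28.7)]

overlap = compare_modes(precision_hits, exploratory_hits, min_bit_score=25.0)
```

Raising `min_bit_score` can only shrink each detected set, which matches the trend the caption describes for panels A and C.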
FIG 4
Summary of proteins retrieved from UniProtKB using precision mode. The numbers of proteins retrieved from the UniProtKB database are summarized for several classes of RiPPs. A scan of the entire UniProtKB database of nonredundant proteins was carried out at three bit scores. In cases where a given UniProt entry was retrieved by more than one precision model (due to partial model redundancy), the protein was counted only toward the model of higher significance. For classes with more than one precision-mode pHMM (e.g., LAPs and sactipeptides), the numbers presented are the sum of proteins retrieved by each individual model. Full data on proteins detected by each precision mode model are available in Data Set S3 (https://figshare.com/articles/Dataset_S3_RRE_domains/12568193). LAP, linear azol(in)e-containing peptide; PQQ, pyrroloquinoline quinone.
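The counting rule in this caption (each protein counted once, toward its most significant model, with per-class totals summed over that class's models) can be sketched as follows. This is a minimal illustration under stated assumptions: it uses the highest bit score as a stand-in for "higher significance", and the model names, class mapping, and hit tuples are hypothetical.

```python
from collections import Counter


def count_by_class(hits, model_to_class, min_bit_score=25.0):
    """Count each protein once, toward the class of its best-scoring model.

    `hits` is a hypothetical iterable of (protein_id, model_name, bit_score).
    Assumption: the highest bit score marks the most significant model.
    """
    best = {}  # protein_id -> (model_name, bit_score)
    for protein_id, model, score in hits:
        if score < min_bit_score:
            continue  # below the chosen cutoff, not counted at all
        if protein_id not in best or score > best[protein_id][1]:
            best[protein_id] = (model, score)
    # Models of the same class map to one label, so their counts are summed.
    return Counter(model_to_class[model] for model, _ in best.values())


# Hypothetical model names and toy hits, for illustration only.
model_to_class = {
    "lasso_discrete": "lasso peptide",
    "sacti_1": "sactipeptide",
    "sacti_2": "sactipeptide",
    "lap_1": "LAP",
}
hits = [
    ("P1", "lasso_discrete", 31.2),
    ("P1", "sacti_1", 44.0),  # P1 is counted toward sactipeptide only
    ("P2", "sacti_2", 27.5),
    ("P3", "lap_1", 18.0),    # below the cutoff
]
class_counts = count_by_class(hits, model_to_class)
```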
FIG 5
Sequence similarity network of UniProtKB proteins retrieved by precision mode. Shown is a RepNode60 SSN at an alignment score of 22 (sequences with >60% identity are conflated to a single node, and edges represent a BLAST expectation value better than 10^−22). Proteins are colored based on the best-fit model by which they were detected. White nodes in region 3 represent proteins that were retrieved by the discrete lasso peptide RRE model but do not co-occur with the requisite leader peptidase and lasso cyclase. These proteins represent possible false positives from this model. The discrete lasso peptide RREs clustering with sactipeptides and ranthipeptides in region 2 are discretely encoded RRE proteins that co-occur with radical SAM enzymes. The SSN was generated using the Enzyme Similarity Tool (https://efi.igb.illinois.edu/efi-est/) (26).
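The edge criterion in this caption can be illustrated with a short sketch that keeps only protein pairs whose BLAST expectation value is better than the threshold implied by the alignment score. It assumes the alignment score can be read as the negative base-10 logarithm of the E-value, as the caption's wording suggests; the Enzyme Similarity Tool may compute it differently, and the function names and toy pairs are hypothetical.

```python
import math


def alignment_score(evalue):
    """Approximate an alignment score as -log10(E-value) (an assumption)."""
    if evalue <= 0.0:
        return math.inf  # BLAST reports 0.0 for extremely significant hits
    return -math.log10(evalue)


def ssn_edges(pairs, min_alignment_score=22.0):
    """Keep (a, b) pairs whose E-value is better than 10^-min_alignment_score.

    `pairs` is a hypothetical list of (node_a, node_b, evalue) tuples.
    """
    return [
        (a, b)
        for a, b, evalue in pairs
        if alignment_score(evalue) > min_alignment_score
    ]


# Toy pairwise results, invented for illustration.
pairs = [
    ("rre1", "rre2", 1e-30),  # kept: well below 10^-22
    ("rre1", "rre3", 1e-10),  # dropped: too weak
    ("rre2", "rre4", 0.0),    # kept: reported as zero by BLAST
]
edges = ssn_edges(pairs)
```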

References

    1. NCBI Resource Coordinators. 2015. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 43:D6–D17. doi:10.1093/nar/gku1130.
    2. Arnison PG, Bibb MJ, Bierbaum G, Bowers AA, Bugni TS, Bulaj G, Camarero JA, Campopiano DJ, Challis GL, Clardy J, Cotter PD, Craik DJ, Dawson M, Dittmann E, Donadio S, Dorrestein PC, Entian KD, Fischbach MA, Garavelli JS, Goransson U, Gruber CW, Haft DH, Hemscheidt TK, Hertweck C, Hill C, Horswill AR, Jaspars M, Kelly WL, Klinman JP, Kuipers OP, Link AJ, Liu W, Marahiel MA, Mitchell DA, Moll GN, Moore BS, Muller R, Nair SK, Nes IF, Norris GE, Olivera BM, Onaka H, Patchett ML, Piel J, Reaney MJ, Rebuffat S, Ross RP, Sahl HG, Schmidt EW, Selsted ME, et al. 2013. Ribosomally synthesized and post-translationally modified peptide natural products: overview and recommendations for a universal nomenclature. Nat Prod Rep 30:108–160. doi:10.1039/c2np20085f.
    3. Hudson GA, Mitchell DA. 2018. RiPP antibiotics: biosynthesis and engineering potential. Curr Opin Microbiol 45:61–69. doi:10.1016/j.mib.2018.02.010.
    4. Cox CL, Doroghazi JR, Mitchell DA. 2015. The genomic landscape of ribosomal peptides containing thiazole and oxazole heterocycles. BMC Genomics 16:778. doi:10.1186/s12864-015-2008-0.
    5. Zhang Q, Yu Y, Velasquez JE, van der Donk WA. 2012. Evolution of lanthipeptide synthetases. Proc Natl Acad Sci U S A 109:18361–18366. doi:10.1073/pnas.1210393109.