Front Microbiol. 2018 Feb 20;9:251.
doi: 10.3389/fmicb.2018.00251. eCollection 2018.

Recovering Genomics Clusters of Secondary Metabolites from Lakes Using Genome-Resolved Metagenomics

Rafael R C Cuadrat et al. Front Microbiol.

Abstract

Metagenomic approaches have become increasingly popular over the past decades owing to the decreasing cost of DNA sequencing and advances in bioinformatics. So far, however, the recovery of long genes coding for secondary metabolites remains a major challenge. Metagenome assemblies are often of poor quality, especially for environments with high microbial diversity, where sequence coverage is low and the complexity of natural communities is high. Recently, new and improved algorithms for binning environmental reads and contigs have been developed to overcome such limitations. Some of these algorithms use a similarity detection approach to classify the obtained reads into taxonomic units and to assemble draft genomes. This approach, however, is quite limited, since it can classify only sequences similar to those already available (and well classified) in the databases. In this work, we used draft genomes from Lake Stechlin, north-eastern Germany, recovered by MetaBat, an efficient binning tool that integrates empirical probabilistic distances of genome abundance and tetranucleotide frequency for accurate metagenome binning. These genomes were screened for secondary metabolism genes, such as polyketide synthases (PKS) and non-ribosomal peptide synthases (NRPS), using the Anti-SMASH and NAPDOS workflows. With this approach we were able to identify 243 secondary metabolite clusters from 121 genomes recovered from our lake samples. A total of 18 NRPS, 19 PKS, and 3 hybrid PKS/NRPS clusters were found. In addition, it was possible to predict the partial structure of several secondary metabolite clusters, allowing for taxonomic classification and phylogenetic inference. Our approach revealed a high potential to recover and study secondary metabolite genes from any aquatic ecosystem.

Keywords: NRPS; PKS; environmental genomics; freshwater; metagenomics 2.0.
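
The abstract names tetranucleotide frequency as one of the two signals used for binning. As a rough illustration of what such a signal looks like, the sketch below computes a 256-dimensional tetranucleotide frequency profile for a contig and compares two profiles with a plain Euclidean distance. This is not MetaBat's implementation: the function names are hypothetical, the Euclidean distance stands in for the empirical probabilistic distance mentioned in the abstract, the abundance signal is omitted, and reverse complements are not merged as a real binner might do.

```python
from collections import Counter
from itertools import product

# All 256 possible 4-mers over the DNA alphabet, in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequencies(seq):
    """Return the relative frequency of each 4-mer in seq (hypothetical helper).

    Windows containing characters other than A, C, G, T (e.g. N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in KMERS)
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[k] / total for k in KMERS]


def euclidean_distance(a, b):
    """Plain Euclidean distance between two profiles (an illustrative stand-in)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


if __name__ == "__main__":
    contig_a = "ACGTACGTACGTTTGACCA"
    contig_b = "ACGTACGTACGTTTGACCT"
    contig_c = "GGGGGGCCCCCCGGGGGGCC"
    fa, fb, fc = (tetranucleotide_frequencies(s) for s in (contig_a, contig_b, contig_c))
    # Contigs with similar composition give a smaller distance.
    print(euclidean_distance(fa, fb) < euclidean_distance(fa, fc))
```

In practice, contigs with similar composition profiles and similar coverage across samples are the candidates for grouping into the same bin; the toy contigs here are far too short to give stable frequencies and serve only to show the computation.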

Figures

Figure 1
(A) Abundance of secondary metabolite cluster types obtained with Anti-SMASH in the recovered 288 bins (environmental genomes). (B) Taxonomical classification of bins (Phyla) in which NRPS, PKS, and Hybrid PKS/NRPS clusters were found. Red bar and pie: NRPS; blue bar and pie: Type I PKS; green bars and pie: Hybrid clusters (NRPS-PKS and PKS-NRPS).
Figure 2
(A) NAPDOS classification of the PKS KS domain. Modular: possess a multidomain architecture consisting of multiple sets of modules; hybridKS: biosynthetic assembly lines that include both PKS and NRPS components; PUFA: polyunsaturated fatty acids (PUFAs) are long-chain fatty acids containing more than one double bond, including omega-3 and omega-6 fatty acids; Enediyne: a family of biologically active natural products whose core consists of two acetylenic groups conjugated to a double bond or an incipient double bond within a nine- or ten-membered ring. (B) NAPDOS classification of the NRPS C domain. Cyc, cyclization domains catalyze both peptide bond formation and subsequent cyclization of cysteine, serine or threonine residues; DCL, link an L-amino acid to a growing peptide ending with a D-amino acid; Epim, epimerization domains change the chirality of the last amino acid in the chain from L- to D-amino acid; LCL, catalyze formation of a peptide bond between two L-amino acids; modAA, appear to be involved in the modification of the incorporated amino acid; Start, first module of a non-ribosomal peptide synthase (NRPS).
Figure 3
NAPDOS phylogenetic tree of C domains (environmental domains, the top 3 blast results on RefSeq and the NAPDOS reference sequences). The shadow colors represent the domain classifications (LCL, CYC, Start domains, EPIM, ModAA, Dual, and DCL). The sidebars represent phyla (Proteobacteria, Cyanobacteria, Firmicutes, Actinobacteria, Verrucomicrobia). All the sequences from environmental bins are in red.
Figure 4
NAPDOS tree of KS domains (environmental domains, the top 3 blast results on RefSeq and the NAPDOS reference sequences). The shadow colors represent the domain classifications (Modular, KS1, Iterative, Trans-AT, Hybrid, PUFA, Enediynes, Type II, and Fabs, Fatty acid synthase). The sidebars represent phyla (Cyanobacteria and Actinobacteria). All the sequences from environmental bins are in red.
Figure 5
Bin 34, NRPS clusters detailed annotation and synteny. The synteny of the clusters with a functional classification for each ORF is given. In addition, for the NRPS biosynthetic ORFs the domain annotations are given. CAL, Co-enzyme A ligase domain; C, condensation; A, adenylation; E, epimerization; TE, Termination; KR, Ketoreductase domain; ECH, Enoyl-CoA hydratase.
Figure 6
Bins 193 and 131, type I PKS clusters detailed annotation and synteny. The synteny of the clusters with a functional classification for each ORF is given. In addition, for the PKS biosynthetic ORFs the domain-specific annotations are given. KS, keto-synthase; AT, acyltransferase; KR, ketoreductase; E, epimerization; DH, dehydratase; ER, enoylreductase.
