2017 Dec 26;114(52):E11131-E11140.
doi: 10.1073/pnas.1716245115. Epub 2017 Dec 11.

Discovery of the leinamycin family of natural products by mining actinobacterial genomes

Guohui Pan et al. Proc Natl Acad Sci U S A.

Abstract

Nature's ability to generate diverse natural products from simple building blocks has inspired combinatorial biosynthesis. The knowledge-based approach to combinatorial biosynthesis has allowed the production of designer analogs by rational metabolic pathway engineering. While successful, structural alterations are limited, with designer analogs often produced in compromised titers. The discovery-based approach to combinatorial biosynthesis complements the knowledge-based approach by exploring the vast combinatorial biosynthesis repertoire found in Nature. Here we showcase the discovery-based approach to combinatorial biosynthesis by targeting the domain of unknown function and cysteine lyase domain (DUF-SH) didomain, specific for sulfur incorporation from the leinamycin (LNM) biosynthetic machinery, to discover the LNM family of natural products. By mining bacterial genomes from public databases and the actinomycetes strain collection at The Scripps Research Institute, we discovered 49 potential producers that could be grouped into 18 distinct clades based on phylogenetic analysis of the DUF-SH didomains. Further analysis of the representative genomes from each of the clades identified 28 lnm-type gene clusters. Structural diversities encoded by the LNM-type biosynthetic machineries were predicted based on bioinformatics and confirmed by in vitro characterization of selected adenylation proteins and isolation and structural elucidation of the guangnanmycins and weishanmycins. These findings demonstrate the power of the discovery-based approach to combinatorial biosynthesis for natural product discovery and structural diversity and highlight Nature's rich biosynthetic repertoire. Comparative analysis of the LNM-type biosynthetic machineries provides outstanding opportunities to dissect Nature's biosynthetic strategies and apply these findings to combinatorial biosynthesis for natural product discovery and structural diversity.

Keywords: combinatorial biosynthesis; genome mining; leinamycin; natural products discovery; structural diversity.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Two complementary approaches to LNM structural diversity by combinatorial biosynthesis. (A) Knowledge-based approach by inactivating lnmE in S. atroolivaceus S-140, affording the SB3033 mutant strain that specifically produced LNM E1 (28). Shaded groups denote changes resulting from the ΔlnmE mutation. Red * indicates the lnmE gene that has been inactivated. (B) Discovery-based approach by targeting the DUF–SH didomain to mine bacterial genomes, affording strains that are predicted to produce a family of LNM-type natural products as exemplified by GNM A. The shaded group denotes the structural feature targeted by the DUF–SH didomain (27).
Fig. 2.
Survey of bacterial genomes (∼48,780) available from public databases and strains (∼5,000) from the actinomycetes collection at TSRI (including ∼100 from the Naicons collection and ∼500 from Myongji University), identifying 49 potential producers for LNM-type natural products by targeting the DUF–SH didomain. (A) Phylogenetic analysis of the 49 producers, based on the translated 1.2-kb internal fragment of DUF–SH didomains (27) and with S. atroolivaceus S-140 as a reference (21), affording 18 distinct clades (clades I to XVIII) when subjected to ∼70% amino acid identity cutoff (also see SI Appendix, Fig. S3B). CalE6 (AAM94792) from Micromonospora echinospora was used as the outgroup. Numbers in parentheses are the hits identified from each of the clades. Representative hits from each of the clades that have been genome-sequenced are listed. Blue dots indicate the nine hits from TSRI collection whose genomes have been sequenced. The strains from which the production of LNM-type natural products has been confirmed are highlighted in red. (B) The 28 lnm-type gene clusters from 17 of the 18 clades, in comparison with the lnm gene cluster (clade I), highlighting the rich structural diversities of the encoded family of LNM-type natural products. Clusters within the same clades are highly homologous, indicative of producing highly similar natural products. Genes are color-coded based on their proposed functions (see SI Appendix, Tables S4–S36 for annotations).
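The grouping of hits into clades at a ∼70% amino acid identity cutoff can be illustrated with a minimal sketch. This is not the authors' pipeline, which relied on phylogenetic analysis of the translated DUF–SH fragments. The sketch assumes the sequences are already aligned to equal length and uses simple single-linkage grouping. The function names `percent_identity` and `group_by_identity`, the strain labels, and the toy sequences are all hypothetical.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Columns where both sequences have a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    columns = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def group_by_identity(seqs: dict[str, str], cutoff: float = 70.0) -> list[set[str]]:
    """Single-linkage grouping: sequences joined by any chain of pairs
    with identity >= cutoff end up in the same group."""
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if percent_identity(seqs[a], seqs[b]) >= cutoff:
            parent[find(a)] = find(b)

    groups: dict[str, set[str]] = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return sorted(groups.values(), key=lambda g: sorted(g))


# Hypothetical toy alignment (not real DUF-SH sequences).
toy = {
    "strain_A": "MKTLLVAGGS",
    "strain_B": "MKTLLVAGGT",
    "strain_C": "MRSIIVQDDT",
}
clades = group_by_identity(toy, cutoff=70.0)
```

With these toy sequences, `strain_A` and `strain_B` differ at one of ten positions and fall into one group, while `strain_C` stays on its own. A real analysis would start from a multiple sequence alignment and a tree-building method rather than raw pairwise identity.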
Fig. 3.
Functional diversity of the A proteins or domains from LNM-type biosynthetic machineries. (A) Phylogenetic analysis of the A proteins from the 28 LNM-type machineries, in comparison with LnmQ, which specifies d-Ala (29), revealing two major groups. AfsK (BAA08229) from Streptomyces coelicolor was used as the outgroup. Roman numerals in parentheses refer to the corresponding clades shown in Fig. 2A. The architectures of A proteins from different groups are shown (Right), with A as a discrete protein (group I), A-PCP didomain (group II), and A-PCP accompanied by an extra N-terminal sequence (?) with/without an additional C-terminal thioesterase domain (the rest). The colored dots denote A proteins whose substrate specificities have been confirmed experimentally or deduced from the isolated natural products: green, d-Ala; red, ACC; blue, l-Thr; black, preferred substrate not detected among the 22 amino acids tested (also see SI Appendix, Table S37 for substrate specificities predicted based on the NRPS codes). (B) In vitro assay of representative A proteins to determine their substrate specificities, as exemplified by GnmS, WsmQ, and CB01373_Q that specify ACC, d-Ala, and l-Thr, respectively (also see SI Appendix, Fig. S29). Error bars are generated from three replicates.
Fig. 4.
Discovery of GNMs and WSMs exemplifying the rich structural diversity of the LNM family of natural products. (A) Genetic organizations of the gnm and wsm gene clusters in comparison with the lnm gene cluster. (B) HPLC analysis of fermentations of the S. sp. CB01883 wild-type (I), SB21001 (i.e., ΔgnmB) (II), SB21002 (i.e., ΔgnmB/pBS21005) (III), SB21003 (i.e., ΔgnmO) (IV), and SB21004 (i.e., ΔgnmO/pBS21007) (V) mutant strains. (C) Structures of GNMs isolated from the S. sp. CB01883 wild-type (GNM A and B) and SB21003 mutant (GNM B, B1, B2, and B3). (D) Determination of the absolute configuration of GNMs at C3 to be S as shown based on the differences of the chemical shifts in 1H NMR of H2 and H23 between (R)- and (S)-PGME derivatives of GNM B2. Two major conformations are shown based on the analysis of their ROESY correlation signals (see SI Appendix, Fig. S14 for details). (E) Confirmation of the absolute configuration of LNM E2 at C3 to be R as shown based on the differences of the chemical shifts in 1H NMR of H4, H5, and H22 between the (R)- and (S)-PGME derivatives of LNM E2 (see SI Appendix, Fig. S16 for details). (F) HPLC analysis of fermentations of the S. sp. CB02120-2 wild-type (I), SB22001 (i.e., ΔwsmW) (II), SB22002 (i.e., ΔwsmZ3) (III), SB22003 (i.e., ΔwsmZ3/pBS22006) (IV), SB22004 (i.e., ΔwsmZ4) (V), and SB22005 (i.e., ΔwsmZ4/pBS22008) (VI) mutant strains. (G) Structures of WSMs isolated from the S. sp. CB02120-2 wild-type strain. (H) Determination of the absolute configuration of WSMs at C3 to be S as shown based on the differences of the chemical shifts in 1H NMR of H2 and H24 between (R)- and (S)-PGME derivatives of WSM A2 (see SI Appendix, Fig. S24 for details). Ha and Hb denote one of the two geminal hydrogens appearing at lower and higher field, respectively, in 1H NMR. GNM A, ●; GNM B, ◆; GNM B1, ◇; GNM B2, ○; GNM B3, ▼; WSM A1, ✦; WSM A2, ∇.
Fig. 5.
Nature’s combinatorial biosynthesis for the LNM family of natural products. (A) The LNM-type biosynthetic machineries, featuring a hybrid NRPS–AT–less type I PKS with varying substrate specificity and modification domains, to account for the structural diversity found within the LNM family of natural products. Domains marked with red dotted circles vary among the machineries. Gaps between domains denote protein boundaries, with red dotted lines denoting that the two domains (enzymes) are fused in some of the machineries. (B) A composite structure depicting varying features of the LNM family of natural products that could be correlated with different modules shown with colored squares. Those marked with asterisks denote the structural motifs that have been discovered from the structures of LNMs, GNMs, and WSMs. (C) A mosaic view of the structural diversity of the LNM family of natural products, highlighting Nature’s intrinsic use of combinatorial biosynthesis. The roman numerals (I–V, VII–XVIII) represent the 17 different clades of potential producers of LNM-type natural products (Fig. 2A).

References

1. Newman DJ, Cragg GM. Natural products as sources of new drugs from 1981 to 2014. J Nat Prod. 2016;79:629–661.
2. Shen B. Polyketide biosynthesis beyond the type I, II and III polyketide synthase paradigms. Curr Opin Chem Biol. 2003;7:285–295.
3. Fischbach MA, Walsh CT. Assembly-line enzymology for polyketide and nonribosomal peptide antibiotics: Logic, machinery, and mechanisms. Chem Rev. 2006;106:3468–3496.
4. Walsh CT. The chemical versatility of natural-product assembly lines. Acc Chem Res. 2008;41:4–10.
5. Cane DE, Walsh CT, Khosla C. Harnessing the biosynthetic code: Combinations, permutations, and mutations. Science. 1998;282:63–68.