Proc Natl Acad Sci U S A. 2014 Jun 24;111(25):9259-64.
doi: 10.1073/pnas.1401734111. Epub 2014 Jun 9.

Atlas of nonribosomal peptide and polyketide biosynthetic pathways reveals common occurrence of nonmodular enzymes

Hao Wang et al. Proc Natl Acad Sci U S A. 2014.

Abstract

Nonribosomal peptides and polyketides are a diverse group of natural products with complex chemical structures and enormous pharmaceutical potential. They are synthesized on modular nonribosomal peptide synthetase (NRPS) and polyketide synthase (PKS) enzyme complexes by a conserved thiotemplate mechanism. Here, we report the widespread occurrence of NRPS and PKS genetic machinery across the three domains of life with the discovery of 3,339 gene clusters from 991 organisms, by examining a total of 2,699 genomes. These gene clusters display extraordinarily diverse organizations, and a total of 1,147 hybrid NRPS/PKS clusters were found. Surprisingly, 10% of bacterial gene clusters lacked modular organization, and instead catalytic domains were mostly encoded as separate proteins. The finding of common occurrence of nonmodular NRPS differs substantially from the current classification. Sequence analysis indicates that the evolution of NRPS machineries was driven by a combination of common descent and horizontal gene transfer. We identified related siderophore NRPS gene clusters that encoded modular and nonmodular NRPS enzymes organized in a gradient. A higher frequency of the NRPS and PKS gene clusters was detected from bacteria compared with archaea or eukarya. They commonly occurred in the phyla of Proteobacteria, Actinobacteria, Firmicutes, and Cyanobacteria in bacteria and the phylum of Ascomycota in fungi. The majority of these NRPS and PKS gene clusters have unknown end products highlighting the power of genome mining in identifying novel genetic machinery for the biosynthesis of secondary metabolites.
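As a rough reading aid, the counts reported in the abstract can be turned into proportions. The sketch below is not part of the authors' analysis. It assumes one genome per organism, so the share of organisms is only approximate, and the variable names are illustrative.

```python
# Counts quoted from the abstract.
GENOMES_EXAMINED = 2699
ORGANISMS_WITH_CLUSTERS = 991
TOTAL_CLUSTERS = 3339
HYBRID_CLUSTERS = 1147


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Assumption: each examined genome corresponds to one organism.
organism_pct = percent(ORGANISMS_WITH_CLUSTERS, GENOMES_EXAMINED)

# Share of all discovered gene clusters that are hybrid NRPS/PKS.
hybrid_pct = percent(HYBRID_CLUSTERS, TOTAL_CLUSTERS)

# Average number of clusters per organism that carries at least one.
clusters_per_organism = TOTAL_CLUSTERS / ORGANISMS_WITH_CLUSTERS

print(f"Genomes with NRPS/PKS clusters: {organism_pct:.1f}%")
print(f"Hybrid clusters: {hybrid_pct:.1f}%")
print(f"Clusters per positive organism: {clusters_per_organism:.2f}")
```

Under that assumption, roughly a third of the examined genomes carry at least one cluster, and roughly a third of all clusters are hybrids.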

Keywords: bioactive compound; biosynthetic gene cluster; data mining; distribution.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
The widespread distribution of NRPSs and PKSs across the three domains of life. The phylogenetic analysis is based on 16S or 18S rRNA genes from selected organisms (Table S3) for representative phyla in bacteria and eukarya, and classes in archaea. The midpoint tree was constructed by PhyML 3.0 using the GTR substitution model and with 100 bootstrap replications for each branch. The lineages containing both NRPSs and PKSs, or hybrid NRPS-PKS enzymes, are indicated in red, the ones containing only NRPSs are indicated in green, and those containing only PKSs are in blue. The numbers of examined genomes and discovered gene clusters for each phylum or class are next to the taxon name and separated by a slash. The biosynthetic pathways of NRPS and modular PKS were not only densely distributed among bacterial phyla and fungi but were also found in animals, plants, and protists within eukarya, and in archaeal strains.
Fig. 2.
A Venn diagram of PKS, NRPS, and hybrid gene-cluster numbers. The gene-cluster numbers of bacteria, archaea, and eukarya are shown in red, purple, and blue, respectively. The values in parentheses represent the numbers of hybrid enzymes that contain both NRPS and PKS core domains.
Fig. 3.
A heat map showing sequence similarities between L-amino acid condensation domains among the major lineages. Sequence similarities are measured by the bit scores of the reciprocal bl2seq alignments and indicated in color according to the scheme shown. Bit scores are shown in red if they equal 600 or more and in white if they equal 100 or less. The self-hits on the diagonal were omitted for clarity.
Fig. 4.
Distribution of bacterial NRPS/PKS enzymes ranked according to their domain numbers. A total of 15,889 enzymes were found in 2,976 gene clusters in bacteria. These enzymes were grouped according to the number of their domains. The nonmodular enzymes showed high levels of abundance (8,906) compared with other multidomain enzymes. Enzymes from NRPS gene clusters are indicated in red, PKS gene clusters in green, and hybrid gene clusters in yellow.
Fig. 5.
Examples of gene clusters comprised of nonmodular enzymes. A total of 314 gene clusters formed mostly by nonmodular enzymes were found in bacteria. Examples of two hybrid gene clusters and a putative acinetobactin biosynthesis gene cluster are shown. The domains are indicated by abbreviations as adenylation (A), peptidyl carrier domain (PCP), condensation (C), acyltransferase (AT), acyl carrier or peptidyl carrier domain (PP), ketosynthase (KS), thioesterase (TE), epimerization (E), heterocyclization (H), ketoreductase (KR), enoylreductase (ER), aminotransferase (Amino), and 4′-phosphopantetheinyl transferase (ACPS). The ABC transport system proteins are indicated in blue, other tailoring enzymes in light green, and hypothetical proteins in gray.
Fig. 6.
Phylogenetic analysis of condensation domains C2 from siderophore biosynthetic enzymes with gradient-domain organizations. The neighbor-joining tree was constructed by MEGA5.1 (46) using the Poisson model and with 100 bootstrap replications for each branch. These siderophore biosynthetic pathways share an unusual pair of tandem heterocyclization domains and others with similar composition but in gradient organizations, which are congruent with the phylogeny of the C2 domains. These domains are responsible for the biosynthesis of the 2,3-dihydroxyphenyl-5-methyloxazoline-acyl group, which is synthesized by the double heterocyclization (H) domains from a 2,3-dihydroxybenzoate and an L-threonine that is activated by the adenylation (A) domain, and fused by the condensation (C) domain C2 with other substrates. The activated L-threonine and the derived group are both tethered by the peptidyl carrier (PCP) domain.
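The bacterial counts given in the captions of Fig. 4 and Fig. 5 can be cross-checked against the abstract's statement that about 10% of bacterial gene clusters lacked modular organization. The sketch below is illustrative arithmetic only, not the authors' method. It assumes the 314 clusters in Fig. 5 are a subset of the 2,976 bacterial clusters in Fig. 4, and the variable names are hypothetical.

```python
# Counts quoted from the captions of Fig. 4 and Fig. 5.
BACTERIAL_ENZYMES = 15889
BACTERIAL_CLUSTERS = 2976
NONMODULAR_ENZYMES = 8906
MOSTLY_NONMODULAR_CLUSTERS = 314


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of bacterial NRPS/PKS enzymes that are nonmodular (Fig. 4).
nonmodular_enzyme_pct = percent(NONMODULAR_ENZYMES, BACTERIAL_ENZYMES)

# Share of bacterial clusters formed mostly by nonmodular enzymes (Fig. 5).
# Assumption: these 314 clusters are counted among the 2,976 in Fig. 4.
nonmodular_cluster_pct = percent(MOSTLY_NONMODULAR_CLUSTERS, BACTERIAL_CLUSTERS)

# Average number of NRPS/PKS enzymes per bacterial gene cluster.
enzymes_per_cluster = BACTERIAL_ENZYMES / BACTERIAL_CLUSTERS

print(f"Nonmodular enzymes: {nonmodular_enzyme_pct:.1f}% of bacterial enzymes")
print(f"Mostly nonmodular clusters: {nonmodular_cluster_pct:.1f}% of bacterial clusters")
print(f"Enzymes per cluster: {enzymes_per_cluster:.1f}")
```

The cluster share comes out slightly above 10%, in line with the abstract, while more than half of the individual enzymes are nonmodular.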
