Plant Cell. 2012 Jul;24(7):2765-78.
doi: 10.1105/tpc.112.099085. Epub 2012 Jul 20.

Cyclic peptides arising by evolutionary parallelism via asparaginyl-endopeptidase-mediated biosynthesis

Joshua S Mylne et al. Plant Cell. 2012 Jul.

Abstract

The cyclic miniprotein Momordica cochinchinensis Trypsin Inhibitor II (MCoTI-II) (34 amino acids) is a potent trypsin inhibitor (TI) and a favored scaffold for drug design. We have cloned the corresponding genes and determined that each precursor protein contains a tandem series of cyclic TIs terminating with the more commonly known, and potentially ancestral, acyclic TI. Expression of the precursor protein in Arabidopsis thaliana showed that production of the cyclic TIs, but not the terminal acyclic TI, depends on asparaginyl endopeptidase (AEP) for maturation. The nature of their repetitive sequences and the almost identical structures of emerging TIs suggest these cyclic peptides evolved by internal gene amplification associated with recruitment of AEP for processing between domain repeats. This is the third example of similar AEP-mediated processing of a class of cyclic peptides from unrelated precursor proteins in phylogenetically distant plant families. This suggests that production of cyclic peptides in angiosperms has evolved in parallel using AEP as a constraining evolutionary channel. We believe this is evolutionary evidence that, in addition to its known roles in proteolysis, AEP is especially suited to performing protein cyclization.

Figures

Figure 1.
TIPTOP Proteins from M. cochinchinensis. (A) Schematic of a typical squash TI precursor, the TGTI-II precursor from the towel gourd (Luffa cylindrica), compared with TIPTOP1-3 from gac (M. cochinchinensis). aa, amino acids. (B) Predicted sequence of a cyclic knottin domain from TIPTOP3 and its flanks. (C) Region containing terminal knottin TI-6 from TIPTOP3. (D) BOXSHADE alignment of six single-unit knottin precursors with TIPTOP1-3. See Methods for full details of the sources for the six single-unit knottin precursor sequences. This alignment shows that all nine predicted proteins share an ER signal sequence (brown, predicted cleavages shown with arrowheads), a conserved prodomain of unknown function (orange), and the terminal knottin domain (green, known cleavages shown with arrowheads).
Figure 2.
TIPTOP-Derived Knottins. (A) Sequence alignment of new TI sequences (TI-4 to TI-8) with known sequences (TI-1 to TI-3). Asterisks indicate an acyclic peptide. TI-1, TI-2, TI-4, TI-7, and TI-8 are backbone cyclic. The disulfide connectivity determined by NMR for TI-2 and TI-5 is shown below the alignment. (B) LC-MS profile of M. cochinchinensis peptide extract with sequenced knottins marked. We suspect that the two peaks marked with asterisks contain isomers that have the same mass as nearby peaks but carry isoaspartyl bonds, a feature of these cyclic knottins observed during their initial characterization (Hernandez et al., 2000).
Figure 3.
Sequence and Structural Alignment of Cyclic and Acyclic Knottins. (A) Sequence of cyclic TI-2 (magenta) and acyclic TI-5 (green). The ligation point in TI-2 is marked with an arrow. Residues that differ between TI-2 and TI-5 are marked with asterisks. The three disulfide linkages are shown by connecting bars. (B) Overlay of structural models for TI-2 (magenta, 1HA9) and the newly acquired TI-5 structure (green, 2LJS). Aside from the ligating Ser-Gly-Ser-Asp sequence in TI-2, the two structures are nearly identical: TI-5 has a root mean square deviation of 0.55 Å from TI-2 over backbone residues 2 to 30. The ligation point in TI-2 is marked on its structure with an arrow. The N-terminal pyroglutamate ring of TI-5 is displayed (pGlu1).
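The root mean square deviation quoted in (B) is the square root of the mean squared distance between matched atoms. The following is a minimal sketch of that calculation, assuming the two coordinate sets are already superposed and paired atom for atom. The function name `rmsd` and the input layout are illustrative and are not taken from the paper.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between paired 3D points, in the units of the input (e.g. angstroms).

    Assumption: both lists hold (x, y, z) tuples in the same atom order and
    the structures were superposed beforehand. No fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between the matched atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

In practice the inputs would be the backbone atoms of residues 2 to 30 from each model after a least-squares superposition, a step this sketch leaves out.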
Figure 4.
DNA Repeat Analysis Using TIPTOP2 Reveals Imperfect Palindromes and Significant Low Energy Folding Structures. (A) Reconstruction of the MEME raw output, showing the location of 50-base repeats found on the sense (+) and minus (−) strands. Below the MEME output, the equivalent regions are marked on the TIPTOP2 protein schematic. (B) Comparing the regions encoding this sequence with their reverse complements (rev.c.) revealed that the sequences are highly complementary. (C) Putative DNA hairpins in TI-1 and TI-5 generated using CONTRAfold. (D) Reconstruction of the MEME raw output when the maximum repeat size was uncapped, showing the location of high-scoring 113-mer repeats. (E) A histogram displaying the empirical probability of 113-mers with a given folding free energy, estimated using 30,190 random 113-mers extracted from unspliced Arabidopsis mRNA. (F) A summary of the folding free energies and statistical significance of the 113-mer repeats shown in (D). The P value is the area under the histogram corresponding to free energies greater than or equal to the given value; the adjusted P value (adj p) is adjusted for six multiple tests, because we chose the repeat copy with the lowest free energy.
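The significance estimate in (E) and (F) can be sketched as an empirical tail probability followed by a correction for picking the best of six copies. This is a minimal illustration under two assumptions that the caption does not settle: the tail direction follows the caption's wording ("greater than or equal to"), which depends on the sign convention used for the free energies, and the adjustment uses the Šidák form for the best of n independent tests, since the caption does not name a method. The function names are hypothetical.

```python
def empirical_p_value(null_values, observed):
    """Fraction of null-sample values that are >= the observed value.

    Assumption: the comparison direction follows the caption's wording.
    Flip it to <= if the energies are signed so that lower means more stable.
    """
    if not null_values:
        raise ValueError("null sample must not be empty")
    hits = sum(1 for value in null_values if value >= observed)
    return hits / len(null_values)


def adjust_for_best_of_n(p_value, n_tests=6):
    """Sidak-style adjustment (an assumed method) for reporting the most
    extreme of n_tests independent tests: 1 - (1 - p)^n."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    return 1.0 - (1.0 - p_value) ** n_tests
```

Here `null_values` would hold the folding free energies of the random 113-mers, and `observed` the energy of the repeat copy being tested.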
Figure 5.
In Vivo Processing of TIPTOP2. (A) Schematic of TIPTOP2 showing the Asn residue preceding each cyclic knottin domain, the terminal Asp of each cyclic knottin domain, and the Lys residue preceding the terminal knottin. (B) MALDI-MS analysis of seed peptide extracts of either M. cochinchinensis or Arabidopsis containing OLEOSIN:TIPTOP2 in either wild-type (WT) or aep null mutant backgrounds. The identities of M. cochinchinensis masses 3378.5 and 3434.9 are not known; masses that match known peptides are labeled. The asterisks in the spectra of OLEOSIN:TIPTOP2 in the wild type denote misprocessed peptides. For TI-5, this mass is consistent with failure to form pyroglutamate (+17 D), whereas for TI-1, TI-2, and TI-4, the masses marked by 1*, 2*, and 4*, respectively, are +18-D masses consistent with noncyclized peptide. For comparison, nontransgenic wild-type and aep null mutant profiles are shown. See Supplemental Figure 9 online for MALDI-MS spectra with a broader mass range. (C) Ions within the LC-MS data of the same extracts in the ranges 827.2 to 827.4, 853.0 to 853.2, 863.5 to 863.7, and 870.5 to 870.7 D confirmed that each peak in Arabidopsis matches its counterpart in M. cochinchinensis. The asterisk marks a peak that we suspect is a TI-1 isomer. For fully annotated LC-MS traces, see Supplemental Figure 10 online. [See online article for color version of this figure.]
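The mass shifts in (B) follow from simple arithmetic: backbone cyclization removes one water (about 18 Da), so a linear peptide is about 18 Da heavier than its cyclic form, and the +17 shift matches the mass of ammonia, which an N-terminal Gln would retain if it were not converted to pyroglutamate. The sketch below labels an observed mass by its offset from the expected mature mass. The Gln precursor, the 0.3 Da tolerance, and the function name are assumptions made for illustration.

```python
WATER_DA = 18.011    # monoisotopic mass of H2O
AMMONIA_DA = 17.027  # monoisotopic mass of NH3


def classify_mass_shift(observed_da, expected_da, tol_da=0.3):
    """Label an observed peptide mass relative to the expected mature mass.

    tol_da is an assumed matching tolerance. It must stay below half the
    gap between the two shifts (about 0.49 Da) so the windows do not overlap.
    """
    delta = observed_da - expected_da
    if abs(delta) <= tol_da:
        return "mature"
    if abs(delta - WATER_DA) <= tol_da:
        return "+18: consistent with noncyclized peptide"
    if abs(delta - AMMONIA_DA) <= tol_da:
        return "+17: consistent with no pyroglutamate formation"
    return "unassigned"
```

A shift that matches neither offset is returned as "unassigned" rather than forced into a category, which mirrors how the caption treats the two unidentified masses.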
Figure 6.
Independent Evolution of Plant Cyclic Peptides That Use the Same AEP-Mediated Processing. (A) Partial angiosperm phylogeny based on rbcL sequences. Species known to contain AEP-mediated cyclic peptides and their precursors are in green (SFTI PDB#1SFI Helianthus, Asteraceae; O1 PDB#1NBJ, Viola, Violaceae; CterM PDB#2LAM, Clitoria, Fabaceae; kalata B1 PDB#1NB1, Oldenlandia, Rubiaceae; Petunia, Solanaceae; TI-2 PDB#1HA9, Momordica, Cucurbitaceae). A range of model plants is included, and their names are highlighted in orange. For a more complete angiosperm phylogeny, see Supplemental Figure 11 and Supplemental Table 6 online for the alignment used. For the taxa representing each family name used, see Methods. aa, amino acids. (B) BOXSHADE alignment of each cyclic peptide domain (orange) and its flanks. This shows the three peptide classes that use this AEP-mediated processing and shows that the conserved cyclic peptide domain has a proto–N-terminal Gly and a proto–C-terminal Asn or Asp (Asx) (Asp and Asn are the target residues for AEP). The proto–C-terminal Asn or Asp is usually followed by a residue with a small side chain at the P1′ position and either Leu or Ile at the P2′ position.
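The sequence features summarized in (B) can be turned into a simple scan of a precursor sequence for candidate cyclic domains: a Gly at the start, an Asn or Asp at the end, then a small residue at P1′ and Leu or Ile at P2′. This is only a sketch. The set of "small" residues (G, A, S), the length bounds, and the function name are assumptions, and a match says nothing about whether a domain is actually processed.

```python
import re


def find_candidate_domains(seq, min_len=14, max_len=40, small="GAS"):
    """Return (start_index, domain) pairs for regions matching the motif.

    The lookahead lets candidates overlap. The lazy quantifier means only
    the shortest qualifying domain is reported for each starting Gly.
    min_len, max_len, and small are assumed values, not from the paper.
    """
    pattern = re.compile(
        r"(?=(G[A-Z]{%d,%d}?[ND])[%s][LI])" % (min_len - 2, max_len - 2, small)
    )
    return [(m.start(1), m.group(1)) for m in pattern.finditer(seq.upper())]
```

The P1′ and P2′ residues are checked but left out of the returned domain, since they belong to the flank rather than to the mature peptide.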

References

    Akaike H. (1974). A new look at the statistical model identification. IEEE Trans. Automat. Contr. 19: 716–723.
    Avrutina O., Schmoldt H.-U., Gabrijelcic-Geiger D., Le Nguyen D., Sommerhoff C.P., Diederichsen U., Kolmar H. (2005). Trypsin inhibition by macrocyclic and open-chain variants of the squash inhibitor MCoTI-II. Biol. Chem. 386: 1301–1306.
    Bailey T.L., Elkan C. (1994). Fitting a mixture model by expectation maximization to discover motifs in biopolymers. In Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology, R. Altman, D. Brutlag, P. Karp, R. Lathrop, and D. Searls, eds (Menlo Park, CA: AAAI Press), pp. 28–36.
    Bergmann M., Fruton J.S. (1938). Some synthetic and hydrolytic experiments with chymotrypsin. J. Biol. Chem. 124: 321–329.
    Björklund Å.K., Ekman D., Elofsson A. (2006). Expansion of protein domain repeats. PLoS Comput. Biol. 2: e114.