Plant Physiol. 2015 Nov;169(3):1881-96.
doi: 10.1104/pp.15.01214. Epub 2015 Sep 14.

The Arabidopsis Chloroplast Stromal N-Terminome: Complexities of Amino-Terminal Protein Maturation and Stability

Elden Rowland et al. Plant Physiol. 2015 Nov.

Abstract

Protein amino (N) termini are prone to modifications and are major determinants of protein stability in bacteria, eukaryotes, and perhaps also in chloroplasts. Most chloroplast proteins undergo N-terminal maturation, but this is poorly understood due to insufficient experimental information. Consequently, N termini of mature chloroplast proteins cannot be accurately predicted. This motivated an extensive characterization of chloroplast protein N termini in Arabidopsis (Arabidopsis thaliana) using terminal amine isotopic labeling of substrates and mass spectrometry, generating nearly 14,000 tandem mass spectrometry spectra matching to protein N termini. Many nucleus-encoded plastid proteins accumulated with two or three different N termini; we evaluated the significance of these different proteoforms. Alanine, valine, threonine (often in N-α-acetylated form), and serine were by far the most observed N-terminal residues, even after normalization for their frequency in the plastid proteome, while other residues were absent or highly underrepresented. Plastid-encoded proteins showed a comparable distribution of N-terminal residues, but with a higher frequency of methionine. Infrequent residues (e.g. isoleucine, arginine, cysteine, proline, aspartate, and glutamate) were observed for several abundant proteins (e.g. heat shock proteins 70 and 90, Rubisco large subunit, and ferredoxin-glutamate synthase), likely reflecting functional regulation through their N termini. In contrast, the thylakoid lumenal proteome showed a wide diversity of N-terminal residues, including those typically associated with instability (aspartate, glutamate, leucine, and phenylalanine). We propose that, after cleavage of the chloroplast transit peptide by stromal processing peptidase, additional processing by unidentified peptidases occurs to avoid unstable or otherwise unfavorable N-terminal residues. The possibility of a chloroplast N-end rule is discussed.

Figures

Figure 1.
Conceptual illustration of Nt maturation of n-encoded and p-encoded proteins. Ac, Acetylated; MAP, Met aminopeptidase; NAT, N-acetyltransferase; N-term, N-terminal; PDF, peptide deformylase. A, Nt maturation of n-encoded plastid proteins, including removal of cTP by SPP and potential subsequent Nt modifications. B, Nt maturation of p-encoded proteins. *, The removal depends on the penultimate residue, generally following the N-terminal Met Excision (NME) rule; **, N-terminal acetylation typically occurs only for selected residues (see “Results”).
Figure 2.
Nt amino acid frequency for stroma-exposed n-encoded chloroplast proteins. A, All detected stromal Nti (341), excluding unprocessed proteins and obvious breakdown products (Supplemental Table S4). This shows that Ala and Ser are heavily favored as Nt residues, followed by Val and Thr, while 14 residues were underrepresented (Gly, 14×; Gln, 14×; Glu, 10×; Ile, 6×; Arg, 5×; Lys, 5×; Met, 4×; Asn, 3×; Leu, 3×; Cys, 2×; Phe, 2×; Trp, 1×; Tyr, 1×; and Asp, 1×) or not observed (Pro and His). A significant portion of these highly favored residues were acetylated (Val, 54%; Thr, 47%; Ala, 21%; and Ser, 19%), whereas the acetylation rate for other residues was either 0% (Tyr, Leu, Phe, Asp, and Cys) or 100% (Trp; acetylation is indicated as ac). B, Single highest ranked N terminus per protein (165), excluding Nti with fewer than two SPC. Selecting a single best or highest ranked N terminus for each protein (see “Materials and Methods”) hardly influenced the Nt amino acid frequency, except that it slightly decreased the dominance of Ala, increased Ser, and reduced acetylated Ser. Less frequent residues were Gly (9×), Glu (5×), Gln (4×), Lys (4×), Arg (3×), Met (3×), Asn (2×), Cys (2×), Leu (1×), Ile (1×), Trp (1×), Tyr (1×), and Asp (1×), whereas Phe, Pro, and His were not observed. C, Single highest ranked Nti as in B but normalized (weighted) to the frequency of each amino acid in the known (from the Plant Proteome Data Base [PPDB]; 1,575 proteins) n-encoded plastid proteome with predicted cTPs removed.
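
The normalization described for panel C can be illustrated with a minimal sketch. It assumes the weighting is a simple ratio of the observed Nt frequency to the background amino acid frequency; the exact weighting used by the authors is not given here, and the function name and toy data are hypothetical.

```python
from collections import Counter

def normalized_nt_frequency(nt_residues, background_sequences):
    """Return observed/background frequency ratio for each N-terminal residue.

    Assumption: "normalized (weighted)" is taken to mean a plain frequency
    ratio; this is an illustration, not the authors' documented procedure.
    """
    observed = Counter(nt_residues)
    n_obs = sum(observed.values())
    background = Counter("".join(background_sequences))
    n_bg = sum(background.values())
    ratios = {}
    for aa, count in observed.items():
        bg_freq = background[aa] / n_bg
        if bg_freq == 0:
            continue  # residue absent from the background; ratio undefined
        ratios[aa] = (count / n_obs) / bg_freq
    return ratios

# Toy data only, not values from the study.
toy_nt = ["A", "A", "S", "V"]
toy_background = ["AASVLLGG", "ASLLGGVV"]
ratios = normalized_nt_frequency(toy_nt, toy_background)
```

A ratio above 1 would indicate a residue found at the N terminus more often than its overall abundance predicts; a ratio below 1 would indicate underrepresentation.
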
Figure 3.
Analysis of amino acid conservation around experimentally determined Nti for n-encoded stroma-exposed proteins and comparison with Nti generated by in vitro SPP cleavage assays reported in the literature. As per consensus, P1′ is the observed Nt residue and P1 is the residue immediately upstream of P1′. Solid arrows indicate the experimentally determined Nt residue. For plots A to D, the best-ranked Nti of 165 plastid proteins with n-encoded stroma-exposed Nti were used. In all plots, proteins were aligned around the experimentally determined Nt residue (P1′). Color coding for residues is as follows: blue, basic residues (R, K, and H); red, acidic residues (D and E); black, apolar or hydrophobic residues (A, V, L, I, P, F, W, and G); purple, reactive residues (M and C); and green, uncharged, polar residues (S, T, Y, Q, and N). A, Sequence logo of the 165 stroma-exposed proteins shows a weak motif around the mature Nt. The conservation level of amino acids in this sequence alignment is represented as vertical stacks of the amino acid symbols; the stack height reflects the level of conservation. B to D, iceLogo plots of the stroma-exposed proteins in which the amino acid frequency is normalized (weighted) against the total amino acid frequency of the n-encoded chloroplast proteome (from PPDB; 1,575 proteins). Amino acid residues significantly enriched are shown above the x axis, whereas those underrepresented are shown below the x axis. Residues below the x axis colored in pink were entirely absent in this position in the experimental sequences. B, iceLogo of the 165 n-encoded stroma-exposed proteins (P = 0.05). C, iceLogo plots (P = 0.01) for n-encoded stroma-exposed proteins for which the residue immediately upstream of the experimentally determined Nti (P1) is an Ala (58 sequences), Cys (35 sequences), or Met (22 sequences). D, iceLogo plots (P = 0.01) for n-encoded stroma-exposed proteins for which the experimentally determined Nti (P1′) is an Ala (63 sequences), Ser (53 sequences), or Val (26 sequences). E, Sequence logo for eight sequences shown to be cleaved in vitro by SPP (seven using pea SPP and one using C. reinhardtii SPP), with SPP purified from chloroplasts or recombinant SPP expressed in Escherichia coli and immobilized on beads via an Nt biotin tag. Substrates are from a range of organisms (wheat [Triticum aestivum], tomato [Solanum lycopersicum], spinach [Spinacia oleracea], pea, C. reinhardtii, Arabidopsis, and Salvia pratensis). Sequences and other details are provided in Supplemental Table S5.
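
The alignment around P1′ that underlies these logos can be sketched as follows. This is only an illustration of the windowing step, assuming 0-based indexing of the observed Nt residue; the helper names and toy sequences are hypothetical, and the statistical testing performed by iceLogo is not reproduced.

```python
from collections import Counter

def cleavage_window(precursor, p1_prime_index, flank=3):
    """Return residues P{flank}..P1 followed by P1'..P{flank}'.

    p1_prime_index is the 0-based index of the observed N-terminal residue
    (P1') in the precursor. Returns None if the window would extend past
    either end of the sequence.
    """
    start = p1_prime_index - flank
    end = p1_prime_index + flank
    if start < 0 or end > len(precursor):
        return None
    return precursor[start:end]

def position_counts(windows):
    """Count residues in each aligned column of equal-length windows."""
    return [Counter(column) for column in zip(*windows)]

# Toy precursors, not sequences from the study.
windows = [
    cleavage_window("MASLRCVAKTE", 7),   # P1' is the A at index 7
    cleavage_window("MTTSIRAMSVQG", 8),  # P1' is the S at index 8
]
counts = position_counts(windows)
```

With `flank=3`, column index 2 holds P1 and column index 3 holds P1′, so per-column counts can be compared with a background frequency to judge enrichment.
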
Figure 4.
Nt amino acid frequency for stroma-exposed p-encoded proteins and comparison with all known lumenally exposed Nti (both p-encoded and n-encoded proteins). Detailed information is available in Supplemental Table S6. A, The penultimate residues (i.e. residues immediately downstream of the initiating Met) of 65 p-encoded proteins for which the N terminus is facing the stroma. This sequence information is derived from the protein sequences listed in The Arabidopsis Information Resource (TAIR; https://www.arabidopsis.org/). Within this group, there are three sets of identical homologs (ribosomal proteins S7A,B, ribosomal proteins S12A,B,C, and a full-length YCF1.2 protein and a truncated form; for details, see Supplemental Table S6). Rather than including each of these homologs, we counted each set only once, resulting in 61 Nti. B, The predicted Nt residues of mature proteins after application of the general NME rule for the p-encoded proteins in A. C, Experimentally determined Nt residues for p-encoded proteins for which the N terminus is facing the stroma (a total of 47 proteins). Experimental evidence was obtained from the TAILS experiments described in this study, from semitryptic or NAA Nti detected previously (Zybailov et al., 2008, 2009; Bienvenut et al., 2012), and additional data from in-house experiments in PPDB. Also included is information from Giglione et al. (2004), which was mostly based on Nt Edman sequencing data from various plant species. We note that Edman sequencing cannot sequence proteins for which the Nt is NAA; these modified Nti are blocked, preventing Edman chemistry. The experimental Nt information from these other plant species was projected onto Arabidopsis homologs if the Nti were identical. D, Experimentally determined Nt residues for 25 p-encoded proteins for which the N terminus is facing the stroma as determined by TAILS and in-house experiments in PPDB. This is a subset of the proteins in C. E, Experimentally determined Nt residues for 39 p-encoded and n-encoded proteins for which the N terminus is facing the thylakoid lumen. Experimental evidence was obtained from the TAILS experiments, previous publications (Zybailov et al., 2008, 2009), and additional data in PPDB (for details, see Supplemental Table S7).
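
The prediction in panel B can be approximated with a short sketch of the general NME rule. The residue set below (Ala, Cys, Gly, Pro, Ser, Thr, Val) is the commonly cited set of small penultimate residues that permit Met removal; it is an assumption here, since the caption does not list the residues, and the function name and toy sequences are hypothetical.

```python
# Assumption: commonly cited small penultimate residues permitting Met excision.
SMALL_PENULTIMATE = set("ACGPSTV")

def predict_mature_nt(sequence):
    """Return (predicted N-terminal residue, met_removed) under the NME rule.

    The initiating Met is treated as removed only when the penultimate
    residue is in SMALL_PENULTIMATE; otherwise Met is retained.
    """
    if len(sequence) < 2 or sequence[0] != "M":
        return sequence[:1], False
    if sequence[1] in SMALL_PENULTIMATE:
        return sequence[1], True
    return "M", False

# Toy sequences, not proteins from the study.
removed = predict_mature_nt("MSPQ")
retained = predict_mature_nt("MKLL")
```

This covers only the excision step; deformylation by PDF and any later acetylation, both shown in Figure 1, are outside the sketch.
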
Figure 5.
Working model for Nt maturation of n-encoded proteins and the classification of different types of Nti. A, Model for the generation of mature and stable Nti of n-encoded chloroplast proteins. Upon chloroplast import, the cTPs of precursor proteins are either cleaved at a specific single site or cleaved at closely spaced multiple positions. Proteins with unwanted and/or unstable Nti are further processed by one or more stromal aminopeptidases to stabilize the proteins. B, Classification of different types of chloroplast stroma-exposed Nti and examples. We distinguish three types of amino acids: i, amino acids that are very frequent in the Nt position and that are presumably very stable in the chloroplast stroma; ii, Nti with reversible PTMs that play a functional role; and iii, amino acids that are not or rarely observed and likely result in the destabilization of proteins in the chloroplast when these Nti are exposed to the stroma. Group iv shows examples of proteins that were observed with rare amino acids at the Nt position; these are discussed in the text.

References

    1. Alban C, Tardif M, Mininno M, Brugière S, Gilgen A, Ma S, Mazzoleni M, Gigarel O, Martin-Laffon J, Ferro M, et al. (2014) Uncovering the protein lysine and arginine methylation network in Arabidopsis chloroplasts. PLoS One 9: e95512.
    2. Apel W, Schulze WX, Bock R (2010) Identification of protein stability determinants in chloroplasts. Plant J 63: 636–650.
    3. Bachmair A, Finley D, Varshavsky A (1986) In vivo half-life of a protein is a function of its amino-terminal residue. Science 234: 179–186.
    4. Bienvenut WV, Sumpton D, Martinez A, Lilla S, Espagne C, Meinnel T, Giglione C (2012) Comparative large scale characterization of plant versus mammal proteins reveals similar and idiosyncratic N-alpha-acetylation features. Mol Cell Proteomics 11: M111.015131.
    5. Bonissone S, Gupta N, Romine M, Bradshaw RA, Pevzner PA (2013) N-terminal protein processing: a comparative proteogenomic analysis. Mol Cell Proteomics 12: 14–28.
