Dual lysine and N-terminal acetyltransferases reveal the complexity underpinning protein acetylation
- PMID: 32633465
- PMCID: PMC7339202
- DOI: 10.15252/msb.20209464
Abstract
Protein acetylation is a highly frequent protein modification. However, comparatively little is known about its enzymatic machinery. N-α-acetylation (NTA) and ε-lysine acetylation (KA) are known to be catalyzed by distinct families of enzymes (NATs and KATs, respectively), although the possibility that the same GCN5-related N-acetyltransferase (GNAT) can perform both functions has been debated. Here, we discovered a new family of plastid-localized GNATs, which possess a dual specificity. All characterized GNAT family members display a number of unique features. Quantitative mass spectrometry analyses revealed that these enzymes exhibit both distinct KA and relaxed NTA specificities. Furthermore, inactivation of GNAT2 leads to significant NTA or KA decreases of several plastid proteins, while proteins of other compartments were unaffected. The data indicate that these enzymes have specific protein targets and likely display partly redundant selectivity, increasing the robustness of the acetylation process in vivo. In summary, this study revealed a new layer of complexity in the machinery controlling this prevalent modification and suggests that other eukaryotic GNATs may also possess these previously underappreciated broader enzymatic activities.
Keywords: acetylome; acetyltransferase; co- and post-translational modifications; plastid; quantitative proteomics.
© 2020 The Authors. Published under the terms of the CC BY 4.0 license.
Conflict of interest statement
The authors declare that they have no conflict of interest.
Figures
Phylogenetic tree of GNAT candidates from Arabidopsis thaliana (black letters), Saccharomyces cerevisiae (orange letters), and Escherichia coli (green letters) containing the acetyltransferase Pfam domains (PF0058, PF13302, PF13508, PF13673) (Finn et al, 2006). GNAT family sequences were aligned with ClustalW, and a phylogenetic tree was constructed by applying the neighbor‐joining method. Bootstrap analysis was performed using 2,000 replicates; bootstrap values ≥ 20 are indicated next to the corresponding branches. The tree‐specific topology was tested by maximum parsimony analysis. GNAT candidates with a putative organellar localization (TargetP1.1) are highlighted with a green background and named GNAT1 to GNAT10 according to their position in the phylogenetic tree. Squares, triangles, and circles denote the specific acetylation activities reported in the literature. The metabolic activity of GNAT2 corresponds to serotonin acetyltransferase (Lee et al, 2014).
Schematic overview of organellar GNATs’ secondary structure organization (including AtNAA and EcRimI for comparison). Secondary structural elements of the GNAT candidates were determined using Jpred tools in combination with structure homology models (Swiss‐model) and are displayed in red (α‐helices), green (β‐strands), and orange (supplementary secondary elements). All candidates were predicted to carry a mitochondrial or a chloroplastic transit peptide (cTP) using TargetP. The mature form of these candidates is released after the excision of this cTP. Positions of the main and secondary Acyl‐CoA binding domains (Ac‐CoA BD) are shown. C, D, A, and B designate the four conserved motifs comprising what is referred to as the N‐acetyltransferase domain (Dyda et al, 2000).
- A–D
Confocal laser scanning microscopy images of Arabidopsis Col‐0 protoplasts transiently expressing a GNAT6‐GFP (35S:GNAT6‐GFP) fusion protein. The GNAT6‐GFP signal shows a spotted pattern in different subcellular compartments. When indicated, protoplasts were either (B) transiently co‐transformed with a plasmid enabling the expression of an inner nuclear membrane marker (INM: SUN1‐OFP, Rips et al, 2017), (C) treated with Hoechst 33342 (Thermo Fisher) reagent for DNA staining for identification of the nucleus (Nuc), or (D) treated with MitoTracker (Mt: MitoTracker Orange CMTMRos, Invitrogen) for staining of mitochondria. GFP reporter signal (yellow), chlorophyll autofluorescence (pink), merged fluorescence signals and the bright field channel (BF). The scale bar represents a size of 20 μm.
Anti‐acetyllysine Western blot analyses and total protein stains of E. coli cell extracts. Cellular protein extracts before and after induction of His6‐MBP‐GNAT expression were separated on 12% acrylamide gels and immunoblotted by using an anti‐acetyllysine antibody or stained with Coomassie dye. GNAT1, 2, 5, 6, 7, and 10 as well as His6‐MBP control were expressed in E. coli BL21(DE3)pLysS. GNAT3 and 4 were expressed in Rosetta(DE3). In the total protein stains, recombinant GNAT protein constructs were highlighted by blue arrowheads. Protein expression before (ni) and after (i) induction with IPTG is indicated. The luminescence signal, indicating acetylated proteins, was usually recorded after 40–120 s. Since MBP‐GNAT4 expression resulted in a saturated signal, the luminescence was additionally recorded for 10 s (indicated by an asterisk).
Sequence logos of all unique lysine acetylation sites after GNAT expression in E. coli. UniProt E. coli protein sequences were used as the background population; sequence logos were generated using iceLogo (Maddelein et al, 2015).
Number of characterized substrates with significant increases of NTA yields. The NTA yields of all retrieved N‐termini upon a given GNAT expression were divided into four categories, and each column is the sum of all categories: 50–100%, dark green; 20–50%, light green; 10–20%, orange; 5–10%, red. The number of proteins with a yield higher than 30% is also displayed to visualize the most significant acetylation status of each GNAT. The respective overexpression level of each GNAT, as assessed from the gel electrophoresis displayed in Fig 2A, is indicated below (++++ is for very high expression, +++ for high, ++ for medium, + for intermediate).
Overview of GNAT substrate specificities. The number of NTAed substrates for each of the six major active GNATs is displayed in a pie chart according to the N‐terminal pattern retrieved. The dataset is identical to that of panel A. N‐terminal residues of proteins starting with Met are displayed in dark and light orange, while all the others, likely resulting from NME, are colored with a blue palette. Very similar pictures are obtained if only N‐termini with NTA yields over 50% are considered. The starting, acetylated amino acid is indicated and labeled with the cytosolic Nat‐type substrate specificity (i.e., NatA‐F) according to the classic distribution available in Aksnes et al (2016). Details of the fraction of N‐termini starting with M are given in Fig EV6.
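The binning described in this legend can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names and the example yields are hypothetical, and treating each lower bound as inclusive is an assumption, since the legend does not state how boundary values were assigned.

```python
from collections import Counter

# Lower bounds (in percent) and labels taken from the legend.
# Assumption: a yield equal to a lower bound falls into that category.
BINS = [(50.0, "50-100%"), (20.0, "20-50%"), (10.0, "10-20%"), (5.0, "5-10%")]


def yield_category(nta_yield):
    """Return the legend category for an NTA yield in percent, or None below 5%."""
    for lower, label in BINS:
        if nta_yield >= lower:
            return label
    return None


def summarize(yields):
    """Count N-termini per category and those with a yield above 30%."""
    categories = (yield_category(y) for y in yields)
    counts = Counter(c for c in categories if c is not None)
    above_30 = sum(1 for y in yields if y > 30.0)
    return dict(counts), above_30


# Hypothetical NTA yields (percent) for one GNAT, for illustration only.
example_yields = [3.0, 7.5, 12.0, 25.0, 35.0, 60.0, 98.0]
counts, above_30 = summarize(example_yields)
```

The column height in the bar chart would then correspond to the sum of the four category counts, with the count above 30% reported alongside it.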
Table showing examples of the selectivity of all plastid GNATs on a selection of retrieved N‐termini. The table is extracted from Dataset EV3 to illustrate the concept and is not exhaustive. The 10 first amino acids of the N‐α‐acetylated proteins are indicated. The color code is green, positive; red, negative; gray means that the data are missing, i.e., that the peptide was not quantified.
IceLogo representation (Colaert et al, 2009) of the N‐terminal substrates of GNAT2 vs those of all other GNATs. To construct the dataset, all GNAT2 substrates with an NTA yield > 30% were selected as the positive set. The negative set corresponded to the compilation of all substrates with an NTA yield > 30% for the seven other GNATs. The color of each symbol follows the default choice that the software proposes for each class of amino acid: green is for the class of small hydrophilic uncharged residues including S, T, G. Acidic residues including D or E are colored in red. Positively charged residues including K, R, and H are displayed in blue. N or Q is in purple. Hydrophobic residues (A, I, M, V, L, W, Y, F, P) are shown in black.
Venn diagram representing the overlap between the N‐termini sets identified in wild‐type and nsi mutant lines. N‐termini from four replicates of Arabidopsis wild‐type and two independent nsi knockout lines (nsi‐1 and nsi‐2) were retrieved as previously reported (Koskela et al, 2018) and compared. More than 300 N‐termini could be retrieved in all samples (Linster et al, 2015; Huber et al, 2020).
Venn diagram of quantified NTAed proteins. Half of the retrieved N‐termini could be quantified and 173 were common to all samples.
Comparison of NTA yield of retrieved N‐termini of proteins starting at position 1 or 2. The majority of these proteins, corresponding mostly to cytosolic components, whether or not they undergo N‐terminal methionine excision, were not affected by inactivation of GNAT2. For statistical analyses, nsi‐1 and nsi‐2 were pooled and compared to the wild type. Two independent technical replicates of four biological replicates for each of the WT, nsi‐1, and nsi‐2 samples were analyzed. Error bars are ± SD.
Comparison of NTA yield of retrieved N‐termini of proteins starting at positions > 2. Clear alteration of NTA yield was observed in nsi mutant lines in the pool of nuclear‐encoded plastid proteins. nsi‐1 and nsi‐2 samples were treated as in (C). Error bars are ± SD (see details of sampling in panel C).
Comparison of NTA yields of retrieved N‐termini in plastid proteins. Similar variation as in panel (D) was observed when NTA of only plastid proteins was analyzed. Error bars are ± SD (see details of sampling in panel C).
Volcano plot representing NTA analyses of nsi knockout lines (treated together) and wild type. For this analysis, the P‐value was calculated using Excel's two‐tailed t‐test function, for two‐sample with equal variance. The most impacted proteins are shown in green. See Table 2 for correspondence. N is related to the number of quantified N‐termini.
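For readers who want to reproduce the kind of P‐value named in this legend outside a spreadsheet, the two‐tailed, two‐sample, equal‐variance (Student's) t‐test can be sketched with the standard library alone. This is an illustrative sketch, not the authors' workflow: the function names are hypothetical, and the tail probability is obtained here by Simpson integration of the t density rather than by whatever method Excel uses internally.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t, df, steps=2000):
    """Two-tailed P-value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * central_area)


def pooled_t_test(a, b):
    """Equal-variance two-sample t-test; returns (t, df, two-tailed P).

    Assumes each sample has at least two values and the pooled variance
    is non-zero (otherwise the statistic is undefined).
    """
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((x - mean_b) ** 2 for x in b)
    df = na + nb - 2
    pooled_var = (ss_a + ss_b) / df
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    t = (mean_a - mean_b) / se
    return t, df, two_tailed_p(t, df)


# Hypothetical NTA yields (percent) for one N-terminus, for illustration only.
t_stat, dof, p_value = pooled_t_test([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

In a volcano plot, a value such as `-math.log10(p_value)` would typically be plotted against the difference in NTA yield between mutant and wild type; the exact axes used in the figure are not stated in the legend.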
IceLogo representation (Colaert et al, 2009) of the protein N‐termini with modified NTA yield vs proteins with unmodified NTA yield. The color symbol associated with each residue is detailed in the legend to Fig 4B. Black is aliphatic; green is small hydrophilic.
Crystal structure of the catalytic subunit of NatA from yeast complexed with a bisubstrate inhibitor (PDB code 4KVM). The interaction of the α1α2 and β6β7 loops restricts access to the peptide binding site, promoting N‐α acetylation of the substrate peptide.
Crystal structure of the HAT domain of Tetrahymena GCN5 bound with both peptide ligand and CoA (PDB code 1QSN). The absence of a β7 strand and the different C terminus create an accessible groove at the surface of the HAT protein, promoting internal N‐ε lysine acetylation of the substrate peptide. The main chain (top) and solvent accessibility (bottom) of both proteins are displayed as a gray ribbon or a gray surface, respectively. The α1α2 loop, β4 strand, and β5 strand are colored in red, yellow, and pink. The β6β7 loop in A and the β6α4 loop in B are colored in cyan. Peptides and the CoA moiety are displayed as green and orange sticks, respectively.
Models of GNAT4 obtained from the structure homology‐modeling server SWISS‐MODEL using the PDB codes of two GCN5‐related N‐acetyltransferases (2X7B, left/model 1; 4H89, right/model 2) as templates. The GNAT4 models show different α1α2 and β6β7 loop conformations, suggesting a mobile loop that allows GNAT4 to carry out dual KA/NTA activity.
References
- Aebersold R, Mann M (2016) Mass‐spectrometric exploration of proteome structure and function. Nature 537: 347–355
- Agoni V (2015) Could rare amino acids regulate enzymes abundance? bioRxiv 10.1101/021295 [PREPRINT]
- Aksnes H, Drazic A, Marie M, Arnesen T (2016) First things first: vital protein marks by N‐terminal acetyltransferases. Trends Biochem Sci 41: 746–760
