BBA Adv. 2021 Feb 18:1:100005.
doi: 10.1016/j.bbadva.2021.100005. eCollection 2021.

Origin of the Phosphoprotein Phosphatase (PPP) sequence family in Bacteria: Critical ancestral sequence changes, radiation patterns and substrate binding features

David Kerk et al. BBA Adv.

Abstract

Background: Phosphoprotein phosphatases (PPP) belong to the PPP Sequence family, which in turn belongs to the broader metallophosphoesterase (MPE) superfamily. The relationship between the PPP Sequence family and other members of the MPE superfamily remains unresolved, in particular what transitions took place in an ancestral MPE to ultimately produce the phosphoprotein specific phosphatases (PPPs).

Methods: We use structural and sequence alignment data, phylogenetic tree analysis, sequence signature (Weblogo) analysis, in silico protein-peptide modeling data, and in silico mutagenesis to trace a likely route of evolution from MPEs to the PPP Sequence family. Hidden Markov Model (HMM) based iterative database search strategies were utilized to identify PPP Sequence Family members from numerous bacterial groups.

Results: Using Mre11 as a proxy for an ancestral nuclease-like MPE, we trace a possible evolutionary route that alters a single active site substrate binding His residue to yield a new substrate binding accessory, the "2-Arg-Clamp". The 2-Arg-Clamp is not found in MPEs, but is present in all PPP Sequence family members, where the phosphomonoesterase reaction predominates. Variation in the position of the clamp arginines and a supplemental sequence loop likely provide substrate specificity for each PPP Sequence family group.

Conclusions: Loss of a key substrate binding His in MPEs opened the path to binding novel substrates and to evolution of the 2-Arg-Clamp, a sequence change seen in both bacterial and eukaryotic phosphoprotein phosphatases.

General significance: We establish a likely evolutionary route from nuclease-like MPE to PPP Sequence family enzymes, which include the phosphoprotein phosphatases.

Keywords: Bacterial origin; Metallophosphoesterase; Molecular dynamics simulation; Phosphomonoesterase; Phylogenetic analysis; Protein phosphatase.

Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Fig. 1
Schematic diagrams showing the reactions catalyzed by phosphomonoesterase, phosphodiesterase and phosphoric acid anhydride hydrolase enzymes. The approximate spatial arrangement of (A) phosphoric acid diester or phosphoric acid anhydride substrates (blue lines and letters) and (B) phosphoric acid monoester substrates (green lines and letters) bound to the divalent metal ions (M1 and M2), as well as key enzyme residues, water molecule and the activated hydroxide ions at the active sites are shown. The atoms in the side chains from the highly conserved residues of metallophosphoesterase (MPE) and PPP sequence family enzymes coordinating to the divalent metal ions are coloured according to the highly conserved sequence motifs 1–5 and discussed in the text. HisB refers to the second His of motif 5 of MPEs (GHxH). Curved arrows indicate the flow of electrons occurring during the nucleophilic substitution reactions. Dashed lines indicate metal ion coordination bonds.
Fig. 2
Structural models of PPP enzymes with phosphorylated substrate peptides. Panel A: A combination of molecular docking and MD simulation, as detailed in Methods, was used to refine a structural model of the phage lambda phosphatase (PDB ID: 1G5B) and a peptide containing the sequence “RRA(pT)VA”. A stable conformation of the peptide was first produced by MD simulation. The protein binding pocket is shown in cartoon mode (gray), while the phosphorylated residue is shown in stick mode. The metals present in the active site (“M1” and “M2”) are shown in CPK mode (green spheres). In the model these are Zn2+ (see Methods). The metal coordination interactions are illustrated by the red dashed lines. Protein-substrate interactions are shown as yellow dashed lines. The side chains of residues involved in the “2-Arginine Clamp” discussed in the text (Arg53 and Arg162) are shown in stick mode for clarity and emphasis. Panel B: Structural model of human protein phosphatase 1 (PP1) (PDB ID: 3E7B) and a phospho-peptide containing the sequence “11KQIpSVRG17” derived from the classic PP1 substrate human glycogen phosphorylase-a (PDB ID: 1Z8D). The protein is shown in cartoon mode (pink), the phosphorylated residue in stick mode. The metals within the active site (“M1” and “M2”) are shown as CPK spheres (yellow). In the model these are Zn2+ (see Methods). Metal coordination interactions are illustrated by dashed lines. Protein-substrate interactions are shown as yellow dashed lines. The side chains of residues involved in the “2-Arginine Clamp” discussed in the text (Arg96 and Arg221) are represented as sticks for clarity and emphasis. Panel C: Superposition of structural models for phage lambda phosphatase (1G5B, gray) and human protein phosphatase PP1 (3E7B, pink) with their respective substrate peptides.
Fig. 3
Sequence, motifs and models of Mre11 with substrate dAdA. The amino acid sequence of P. furiosus Mre11 is shown above the two panels. Key amino acids from each of the 5 MPE motifs are indicated and correspond to the residues shown in panels A and B. Panel (A): The Mre11:dAdA complex was generated through homology modeling by using the structure of Mre11 with bound dAMP (PDB ID: 1II7) as a starting model (see Methods). Metal ions (“M1” and “M2”) are represented in CPK sphere mode. These ions were Zn2+ in the modeling process (see Methods). Protein-metal ion coordination interactions are shown as dashed gray lines. Key hydrogen bonds between the protein (side chain of His85 in Motif 3 and backbone carbonyl of His208 in Motif 5) and bound ligand are shown as yellow dashed lines. A view rotated 60° with respect to this image is shown in Panel (B). The terminal nucleotide (“n”) corresponds to the dAMP in the solved structure complex with the nuclease Mre11 (PDB ID: 1II7). The penultimate nucleotide (“n + 1”) corresponds to the upstream portion of a linear polynucleotide substrate.
Fig. 4
Effects of residue substitution variants for His208 in the complex of Mre11 with a deoxyadenosine dinucleotide (dAdA) ligand. The Mre11:dAdA complex was generated through homology modeling by using the structure of Mre11 with bound dAMP (PDB ID: 1II7) as a starting model (see Methods and Fig. 3). This Figure illustrates a subset of stable structures observed for the His208 substitutions listed in Supplemental Table S3, in comparison to the original structure featuring His208. Metal ions (“M1” and “M2”) are represented as gray spheres. These ions were Zn2+ in the modeling process (see Methods). Protein-metal ion coordination interactions are represented as dashed gray lines. The position of a water molecule coordinating Metal 1 is shown by a red sphere and labeled as “W”. Key hydrogen bond interactions between the protein and ligand are represented as yellow dashed lines. In Panel C the H-bond between amino acid 208 and the dAdA ligand is lost when His208 is substituted with Pro.
Fig. 5
Conservation of Potential Substrate-Binding “Clamp” Residues in PPP Family Member Groups. Candidate sequences were harvested for each PPP Family member group by HMM (Hidden Markov Model) based database searching, as detailed in Methods. Sequences were aligned as detailed in Methods. Full-length alignments were edited to comprise two alignments per PPP Family member type, one centered at Motif 2 and the other between Motif 4 and Motif 5. Residue conservation in these alignments was then visualized by construction of sequence WebLogos, as detailed in Methods. The y-axis depicts the degree of conservation in “bits” of information. Taller characters indicate higher conservation. The metal-binding His residues in Motif 4 and Motif 5 are 100% conserved. The red triangles above the sequences designate potential substrate-binding conserved clamp residues. The percent conservation of these residues is as follows:
SLPR1: First “Clamp” Residue (R10 [99.9]); Second “Clamp” Residue (R43 [99.4]; R47 [86.4])
SLPR2: First “Clamp” Residue (K10 [99.7]); Second “Clamp” Residue (R30 [100]; K32 [98.7]; R34 [99.3])
RLPH: First “Clamp” Residue (R10 [90.7]); Second “Clamp” Residue [Euk]: K18 [63.9], R18 [36.1]; R22 [100]; Second “Clamp” Residue [BactSubPop1]: R18 [75.6], K18 [22.2]; K22 [75.0]; Second “Clamp” Residue [BactSubPop2]: R18 [86.2]
PrpA_B (cd07424): First “Clamp” Residue (R10 [98.5]); Second “Clamp” Residue (R19 [91.2])
ApaH (cd07422): First “Clamp” Residue (R10 [99.1]); Second “Clamp” Residue (R53 [98.9]; R55 [99.7])
PrpLike (cd07423): First “Clamp” Residue (R9 [91.3]); Second “Clamp” Residue (R41 [95.9])
PA3087 (cd07413): First “Clamp” Residue (R9 [95.2]); Second “Clamp” Residue (R55 [97.4]; R59 [99.7])
The relationship between SLP Classic, SLPR1 and SLPR2 is illustrated in the sequence alignment presented as Supplemental Figure S7. The relationship between SLP Classic and all the other PPP Family member groups is illustrated in the structure-guided alignment presented as Supplemental Figure S4. The individual alignments on which these WebLogos are based (in FASTA format) are presented as Supplemental Files S8 – S25.
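The two quantities reported in this caption, percent conservation of a residue at an alignment position and conservation in bits, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `column_stats` and the toy alignment are hypothetical, gaps are simply ignored, and the small-sample correction that WebLogo applies is omitted, so values would differ slightly from a real WebLogo.

```python
import math
from collections import Counter

def column_stats(column):
    """Summarize one alignment column.

    Returns (most common residue, percent conservation, information in bits).
    Gap characters are ignored. Information is log2(20) minus the Shannon
    entropy of the residue frequencies, with no small-sample correction
    (an assumption made to keep the sketch short).
    """
    residues = [c for c in column if c not in "-."]
    if not residues:
        return None, 0.0, 0.0
    counts = Counter(residues)
    total = len(residues)
    top, top_n = counts.most_common(1)[0]
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return top, 100.0 * top_n / total, math.log2(20) - entropy

# Hypothetical toy alignment (not data from the paper).
alignment = ["GDRHGR", "GDRHGK", "GDKHGR", "GDRHGR"]

for position, column in enumerate(zip(*alignment), start=1):
    residue, percent, bits = column_stats(column)
    print(f"{position}: {residue} {percent:.1f}% {bits:.2f} bits")
```

A fully conserved column reaches the maximum of log2(20), about 4.32 bits, which is why taller characters in a logo indicate higher conservation.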
Fig. 6
PPP Sequence Family Members – Graphical Summary of Taxonomic Distribution. This Figure depicts in a graphical format the distribution of PPP Sequence Family member sequences in Bacteria. A red dot denotes a taxonomic group which contains a PPP Sequence Family member. These are generally phylum-level groups with a few exceptions, such as the Proteobacteria, which are broken down into classes. The PPP Sequence Family member groups described in this report are numbered (1–8) and their primary distribution (i.e. taxonomic group with the highest number of sequences for each) is indicated. Colored symbols denote the taxonomic distribution of sequence variants, which are described in the text. Summary information for the phylum-level representation of the various PPP Sequence Family member groups is presented in Supplemental Table S5. Detailed pie-chart and bar-graph summary information on taxonomic distribution of the various sequence groups is presented in Supplemental Figure S9. The present Figure is adapted from Fig. 1 in the published report of Hug et al. 2016. That material is licensed under a Creative Commons Attribution 4.0 International License.
Fig. 7
Structures of PPP Sequence family members identify accessory loops and second clamp arginine residues. Structures are presented to confirm the identity of the downstream arginine between Motif 4 and Motif 5 of the “2-Arginine Clamp” where multiple arginines could play this role. The identity of this arginine is confirmed for (D) ApaH (2DFJ), (E) PnkP (4J6O) and (F) PA3087 and is shown with the green square (the first arginine of the clamp is shown with a blue square). Superimposing each structure on lambda phosphatase (shown alone in (G)) revealed sequence inserts between Motif 4 and Motif 5 (D-F, H; dashed ovals) that are absent in (A) RLPH ‘BactSubPop2’, (B) AtRLPH2, (C) CAPTP ‘SLPClassic’, and (G) the lambda phosphatase. Each was superimposed onto the structure of the lambda phosphatase by minimizing the least-squares differences in the coordinates for the conserved residues coordinating to the divalent metal ions. The structure of the RLPH-family phosphatase (RLPH BactSubPop2; UniprotKB accession number A0A529C6H2) was generated using the coordinates of the lambda phosphatase (1G5B) as a template for homology modeling using Modeller (v. 9.24). The structure of the phosphatase from P. aeruginosa, PA3087 (UniprotKB accession number Q9HZC1), was generated using the coordinates of the diadenosine tetraphosphate hydrolase from Shigella flexneri 2a (2DFJ) as a template for homology modeling using Modeller (v. 9.24). Metal ions are shown as coloured spheres. PDB ID codes are shown in brackets.
Fig. 8
Substrate binding residues in an ancestral nuclease MPE and derived PPP sequence family members. The figure shows the solved structure of an ancestral nuclease MPE (Mre11 [1II7]) and solved structures or homology models for derived PPP sequence family members, all in equivalent orientation. The ancestral substrate binding pattern in Mre11 features use of the His residue of Motif 3 and HisB in Motif 5. In the derivative PPP sequence family members HisB has been substituted with other residues (the most frequent being proline), which has apparently allowed enzyme adaptation to bind alternative substrates [see also Fig. 4]. This adaptation has encompassed a diversification of enzymatic activities from the ancestral phosphodiesterase to the addition of phosphoric acid anhydride hydrolase and phosphomonoesterase (including protein phosphatase) activities. Substrate binding in the derived PPP sequence family structures retains the ancestral Motif 3 His residue but replaces the lost Motif 5 HisB with a “2-Arginine Clamp” featuring a conserved Arg in Motif 2 and a downstream Arg between Motif 4 and Motif 5. The dashed arrow between Mre11 and PP1 reflects the occurrence of a distinct evolutionary pathway for the derivation of the closely related set of eukaryotic PPPs (PP1, PP2A/4/6, PP2B, PP5, PP7), which will be the subject of a future communication from our group. Dashed red lines indicate ion-coordination bonds, and key catalytic and metal ion-coordinating residues are drawn in stick representation. Metal ions are shown as semi-transparent gray spheres. PDB ID codes are shown in brackets.
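As a rough illustration of the HisB substitution described in this caption, the Python sketch below checks which residue occupies the HisB position of a motif-5-like "GHxH" pattern (the pattern given for MPEs in the Fig. 1 caption). It is a deliberately naive regular-expression scan, not the HMM and structure-guided alignment approach the authors describe. The function names and toy sequences are hypothetical, and a pattern this short would match spuriously in real proteins.

```python
import re

# Naive motif-5-like pattern: G, H (first His), any residue, then the
# position that holds HisB in MPEs (GHxH). This regex is an assumption
# made for illustration only.
MOTIF5_LIKE = re.compile(r"GH.(.)")

def hisb_position_residue(seq):
    """Return (0-based index, residue) at the HisB position of the first
    motif-5-like match, or None if the pattern is absent."""
    match = MOTIF5_LIKE.search(seq)
    if match is None:
        return None
    return match.start(1), match.group(1)

def classify(seq):
    """Label a sequence by what occupies the HisB position."""
    hit = hisb_position_residue(seq)
    if hit is None:
        return "no motif-5-like match"
    if hit[1] == "H":
        return "HisB retained (MPE-like)"
    return f"HisB replaced by {hit[1]}"

# Hypothetical toy fragments (not real protein sequences).
for name, fragment in [("toy_mpe", "AAGHTHAA"), ("toy_ppp", "AAGHTPAA")]:
    print(name, classify(fragment))
```

In practice the HisB position would be read from a curated alignment column rather than from a regex hit, which avoids false matches outside Motif 5.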
