Comparative Study
PLoS One. 2015 Aug 4;10(8):e0132863.
doi: 10.1371/journal.pone.0132863. eCollection 2015.

"PP2C7s", Genes Most Highly Elaborated in Photosynthetic Organisms, Reveal the Bacterial Origin and Stepwise Evolution of PPM/PP2C Protein Phosphatases

David Kerk et al. PLoS One.

Abstract

Mg2+/Mn2+-dependent type 2C protein phosphatases (PP2Cs) are ubiquitous in eukaryotes, mediating diverse cellular signaling processes through metal ion-catalyzed dephosphorylation of target proteins. We have identified a distinct PP2C sequence class ("PP2C7s") which is nearly universally distributed in Eukaryotes, and therefore apparently ancient. PP2C7s are by far most prominent and diverse in plants and green algae. Combining phylogenetic analysis, subcellular localization predictions, and a distillation of publicly available gene expression data, we have traced the evolutionary trajectory of this gene family in photosynthetic eukaryotes, demonstrating two major sequence assemblages featuring a succession of increasingly derived sub-clades. These display predominant expression moving from an ancestral pattern in photosynthetic tissues toward non-photosynthetic, specialized and reproductive structures. Gene co-expression network composition strongly suggests a shifting pattern of PP2C7 gene functions, including possible regulation of starch metabolism for one homologue set in Arabidopsis and rice. Upon motif analysis, distinct plant PP2C7 sub-clades show novel amino-terminal protein sequences, consistent with a shifting pattern of regulation of protein function. More broadly, neither the major events in PP2C sequence evolution nor the origin of the diversity of metal binding characteristics currently observed in different PP2C lineages is clearly understood. Identification of the PP2C7 sequence clade has allowed us to provide a better understanding of both of these issues. Phylogenetic analysis and sequence comparisons using Hidden Markov Models strongly suggest that PP2Cs originated in Bacteria (Group II PP2C sequences), entered Eukaryotes through the ancestral mitochondrial endosymbiosis, elaborated in Eukaryotes, then re-entered Bacteria through an inter-domain gene transfer, ultimately producing bacterial Group I PP2C sequences. A key evolutionary event, occurring first in ancient Eukaryotes, was the acquisition of a conserved aspartate in classic Motif 5. This has been inherited subsequently by PP2C7s, eukaryotic PP2Cs and bacterial Group I PP2Cs, where it is crucial to the formation of a third metal binding pocket, and to catalysis.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Fig 1. Phylogenetic orthogonal tree depicting major divergence points in the evolutionary history of modern PP2C sequences.
A rooted phylogenetic tree was inferred using BEAST analysis, as detailed in “Materials and Methods”. Two independent chains were run from the same input file for 50 million cycles, collecting 10,000 trees each from the posterior distribution. A “burn-in” of 1,000 trees was discarded from each sample and the remaining trees were pooled manually. The log files for the two runs were combined, and are available as supporting information (S4 File). In this figure the root is located in the upper left of the image. Depicted is the maximum clade credibility tree from the posterior distribution tree sample. Nodes display the 95% highest posterior density (HPD) interval in blue. Each branch is labeled with the posterior probability (max = 1.0). Point 1, Point 2 and Point 3 are discussed in the text. This tree is based on the amino acid sequence alignment presented in Fig A in S2 File, Panel 1.
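The burn-in and pooling step described in the Fig 1 caption can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the function name `pool_posterior_samples` and the tuple stand-ins for trees are hypothetical, and the sampling interval assumes trees were logged at a uniform spacing, which the caption does not state.

```python
# Minimal sketch of burn-in removal and pooling for two MCMC chains.
# Counts come from the Fig 1 caption; everything else is illustrative.

CYCLES = 50_000_000        # chain length per run (from the caption)
TREES_PER_CHAIN = 10_000   # trees collected per run (from the caption)
BURN_IN = 1_000            # trees discarded per run (from the caption)


def pool_posterior_samples(chains, burn_in):
    """Drop the first `burn_in` samples of each chain, then concatenate."""
    pooled = []
    for trees in chains:
        pooled.extend(trees[burn_in:])
    return pooled


# Assumption: uniform logging, so one tree is saved every N cycles.
sampling_interval = CYCLES // TREES_PER_CHAIN

# Stand-ins for real tree objects: (chain_id, sample_index) tuples.
chains = [[(c, i) for i in range(TREES_PER_CHAIN)] for c in range(2)]
pooled = pool_posterior_samples(chains, BURN_IN)

print(sampling_interval, len(pooled))
```

Under these assumptions each run contributes 9,000 post-burn-in trees, so the pooled sample holds 18,000 trees.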
Fig 2. Topological uncertainty in the phylogenetic tree summarizing PP2C sequence evolutionary history.
This tree is an alternate display of the same BEAST analysis data used to generate Fig 1. Green lines represent traces of individual trees from the posterior distribution tree sample. In blue is the consensus tree with the highest clade support (“root canal”). Points 1, 2, and 3 are discussed in the text. Each represents a node, with a black bar indicating the 95% highest posterior density (HPD) interval.
Fig 3. Phylogenetic orthogonal tree depicting interrelationships between representative PP2C7 sequences from plants, green algae, and fungi.
Inference of unrooted phylogenetic trees was performed as outlined in “Materials and Methods.” A typical example is shown. The most crucial nodes are labeled. Node support values with the four inference methods (PhyML [aBayes], RAxML [RBS], MrBayes [PP], PhyloBayes_MPI [PP]) are tabulated in the Figure, separated by slashes (“/”). Support values for all trees are summarized in Table N in S1 File. Predicted in silico subcellular localizations are represented as follows: Ch, chloroplast; Cy, cytosol; M, mitochondria; S, signal peptide; Unk (unknown), sequence fragment lacking native amino terminus. Sequences used in phylogenetic tree generation are listed in Table A in S1 File, while compiled in silico subcellular localization data can be found in Table B in S1 File (non-photosynthetic organisms) and Table F in S1 File (photosynthetic organisms). * = Three algal sequences included in this cluster.
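The slash-separated node support labels described in the Fig 3 caption can be read programmatically, as the sketch below suggests. It assumes the four values always appear in the caption's stated order; the key names, the `parse_support` helper, and the example label are hypothetical and are not taken from the figure.

```python
# Hypothetical parser for a node label such as "0.99/87/1.00/0.98".
# Order follows the caption: PhyML [aBayes], RAxML [RBS],
# MrBayes [PP], PhyloBayes_MPI [PP].

METHODS = ("PhyML_aBayes", "RAxML_RBS", "MrBayes_PP", "PhyloBayes_MPI_PP")


def parse_support(label):
    """Split a slash-separated label into a method -> value mapping."""
    parts = label.split("/")
    if len(parts) != len(METHODS):
        raise ValueError(
            f"expected {len(METHODS)} values, got {len(parts)}: {label!r}"
        )
    return {method: float(value) for method, value in zip(METHODS, parts)}


# Illustrative label only; these numbers are made up.
example = parse_support("0.99/87/1.00/0.98")
print(example["RAxML_RBS"])
```

Keeping the values keyed by method makes it straightforward to compare support across the four inference methods for any labeled node.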
Fig 4. Phylogenetic radial tree depicting interrelationships between PP2C7 sequences and bacterial Group II PP2C sequences.
The PP2C7 set is very large and diverse (238 sequences), and the bacterial Group II sequences are of the “GN” type, from the “More RsbX-Like” assemblage (144 sequences) (sequence varieties described in the text). For this analysis the sequences of the Myxococcales group have been removed (A9GSF9, A6FYN9, A9GWA1, E3FWN8, L7U8R0) (see text for rationale). Inference of unrooted phylogenetic trees was performed as outlined in “Materials and Methods.” The most crucial nodes are labeled. Node support values with the four inference methods (PhyML [aBayes], RAxML [RBS], MrBayes [PP], and PhyloBayes_MPI [PP]) are tabulated in the Figure, separated by slashes (“/”). Support values for all trees are summarized in Table N in S1 File. Preliminary analyses showed that attainment of consistent tree topologies between the different inference methods required removal of the following sequences: Q2RIF7, H1Z3D3, C9R9C1, B1I2G2, G2MXY4, Q8RAY2, E4Q3X2. The cluster of sequences from α-Proteobacteria is indicated. The approximate location in the tree of the reference sequence BsP17906 (RsbX_BACSU) is indicated. This tree is based on the amino acid sequence alignment presented in S1(B) Fig. * = α-Proteobacteria cluster separated into adjacent fragments in this tree.
Fig 5. Phylogenetic radial tree depicting interrelationships between eukaryotic PP2C sequences and bacterial Group II PP2C sequences.
The eukaryotic PP2C set consists of the combined sequences (excluding PP2C7s) from Arabidopsis and human (96 sequences total—HsTAB1 excluded). The bacterial Group II sequences are of the “GN” type, including the “More RsbX-Like” and “Less RsbX-Like” assemblages (328 sequences total) (see text for explanation of sequence varieties). Inference of unrooted phylogenetic trees was performed as outlined in “Materials and Methods.” The most crucial nodes are labeled. Node support values with the four inference methods (PhyML [aBayes], RAxML [RBS], MrBayes [PP], and PhyloBayes_MPI [PP]) are tabulated in the Figure, separated by slashes (“/”). Support values for all trees are summarized in Table N in S1 File. The cluster of sequences from αβγ-Proteobacteria is indicated. “Myxo” designates sequences from the Myxococcales (δ-Proteobacteria). The approximate location in the tree of the reference sequence BsP17906 (RsbX_BACSU) is indicated. This tree is based on the amino acid sequence alignment presented in Fig A in S2 File, Panel 3. * = Myxococcales unresolved from other sequences in this tree.
Fig 6. Phylogenetic radial tree depicting a large scale comparison between PP2C7 sequences, bacterial Group II sequences, bacterial Group I sequences and eukaryotic PP2C sequences.
There are 102 representative PP2C7 sequences, and 49 representative bacterial Group I sequences. The eukaryotic PP2C set consists of the combined sequences (excluding PP2C7s) from Arabidopsis and human (96 sequences total—HsTAB1 excluded). The bacterial Group II sequences include representatives from both “Bulk” (50 sequences) and “GN” types (38 sequences) (see text for explanation of sequence varieties). There are also 9 “Eukaryotic-Like” bacterial PP2C sequences (see text for explanation). Inference of unrooted phylogenetic trees was performed as outlined in “Materials and Methods.” The most crucial nodes are labeled. Node support values with the four inference methods (PhyML [aBayes], RAxML [RBS], MrBayes [PP], and PhyloBayes_MPI [PP]) are tabulated in the Figure, separated by slashes (“/”). Support values for all trees are summarized in Table N in S1 File. The cluster of “Eukaryotic-Like” bacterial PP2C sequences within the eukaryotic PP2C clade is indicated by colored lines. This tree is based on the amino acid sequence alignment presented in Fig A in S2 File, Panel 4.
Fig 7. Structure-guided alignment of bacterial Group II, PP2C7, bacterial Group I, and eukaryotic PP2C sequences.
Information from solved structures of bacterial Group II, bacterial Group I, and eukaryotic PP2Cs (indicated by their four-character PDB codes) was used to guide this alignment, as detailed in “Materials and Methods”. Above the sequences are shown conserved β-strand and α-helical secondary structure elements. “Box” refers to a more variable region in multiple solved structures. For a secondary structure diagram, including element numbering, see Fig F in S2 File. Sequence motifs are as given in [84]. Universally conserved aspartates involved in metal coordination are given in red. Aspartates conserved in some but not all sequences are given in purple and orange (see text for discussion). The inset shows a simplified phylogenetic tree, with the proposed evolutionary advent of critical aspartate residues indicated. See Table A in S1 File for a listing of PP2C7 sequences. Bacterial Group II sequences without solved structures are from UniProt. Species for sequences are as follows: Bs (Bacillus subtilis); Ssp (Synechocystis sp.); Pa (Pseudomonas aeruginosa); Mt (Moorella thermoacetica); Tb (Trypanosoma brucei); Lm (Leishmania major); Tt (Tetrahymena thermophila); Ppa (Physcomitrella patens); At (Arabidopsis thaliana); Cr (Chlamydomonas reinhardtii); Vc (Volvox carteri); Ps (Phytophthora sojae); Ng (Naegleria gruberi); Xl (Xenopus laevis); Hs (Homo sapiens); Dm (Drosophila melanogaster); Fg (Fusarium graminearum); An (Aspergillus niger); Sc (Saccharomyces cerevisiae); Mtu (Mycobacterium tuberculosis); Sa (Streptococcus agalactiae); Ms (Mycobacterium smegmatis); Te (Thermosynechococcus elongatus); Ag (Anopheles gambiae).

References

    1. Brautigan DL. Protein Ser/Thr phosphatases—the ugly ducklings of cell signalling. The FEBS journal. 2013;280(2):324–45. doi: 10.1111/j.1742-4658.2012.08609.x.
    2. Olsen JV, Vermeulen M, Santamaria A, Kumar C, Miller ML, Jensen LJ, et al. Quantitative phosphoproteomics reveals widespread full phosphorylation site occupancy during mitosis. Science signaling. 2010;3(104):ra3. doi: 10.1126/scisignal.2000475.
    3. Das AK, Helps NR, Cohen PT, Barford D. Crystal structure of the protein serine/threonine phosphatase 2C at 2.0 Å resolution. The EMBO journal. 1996;15(24):6798–809.
    4. Uhrig RG, Labandera AM, Moorhead GB. Arabidopsis PPP family of serine/threonine protein phosphatases: many targets but few engines. Trends in plant science. 2013;18(9):505–13. doi: 10.1016/j.tplants.2013.05.004.
    5. Moorhead GB, De Wever V, Templeton G, Kerk D. Evolution of protein phosphatases in plants and animals. The Biochemical journal. 2009;417(2):401–9. doi: 10.1042/BJ20081986.
