Plant Physiol. 2008 Feb;146(2):351-67.
doi: 10.1104/pp.107.111393. Epub 2007 Dec 21.

Evolutionary radiation pattern of novel protein phosphatases revealed by analysis of protein data from the completely sequenced genomes of humans, green algae, and higher plants

David Kerk et al. Plant Physiol. 2008 Feb.

Abstract

In addition to the major serine/threonine-specific phosphoprotein phosphatase, Mg(2+)-dependent phosphoprotein phosphatase, and protein tyrosine phosphatase families, there are novel protein phosphatases, including enzymes with aspartic acid-based catalysis and subfamilies of protein tyrosine phosphatases, whose evolutionary history and representation in plants are poorly characterized. We have searched the protein data sets encoded by the well-finished nuclear genomes of the higher plants Arabidopsis (Arabidopsis thaliana) and Oryza sativa, and the latest draft data sets from the tree Populus trichocarpa and the green algae Chlamydomonas reinhardtii and Ostreococcus tauri, for homologs to several classes of novel protein phosphatases. The Arabidopsis proteins, in combination with previously published data, provide a complete inventory of known types of protein phosphatases in this organism. Phylogenetic analysis of these proteins reveals a pattern of evolution where a diverse set of protein phosphatases was present early in the history of eukaryotes, and the division of plant and animal evolution resulted in two distinct sets of protein phosphatases. The green algae occupy an intermediate position, and show similarity to both plants and animals, depending on the protein. Of specific interest are the lack of cell division cycle (CDC) phosphatases CDC25 and CDC14, and the seeming adaptation of CDC14 as a protein interaction domain in higher plants. In addition, there is a dramatic increase in proteins containing RNA polymerase C-terminal domain phosphatase-like catalytic domains in the higher plants. Expression analysis of Arabidopsis phosphatase genes differentially amplified in plants (specifically the C-terminal domain phosphatase-like phosphatases) shows patterns of tissue-specific expression with a statistically significant number of correlated genes encoding putative signal transduction proteins.


Figures

Figure 1.
Phylogenetic tree of CDC14-like sequence relationships. A rectangular cladogram was generated by comparing catalytic domains of CDC14-like proteins (red) with the closest relatives in plants (blue), using the set of Arabidopsis DSP proteins as an outgroup (black; from Kerk et al., 2006). Proteins included are from the following organisms, with the source of the sequences in parentheses: Arabidopsis (MIPS code without “t”); C. reinhardtii (Crexxxxxx, where xxxxxx is the protein identification from http://plantsp.genomics.purdue.edu/plantsp/data/proteins.Chlre3.fasta); humans (CDC14A_Hs:NP_003663, CDC14B_Hs:NP_201588); O. sativa (MIPS code); O. tauri (MIPS codes given by https://bioinformatics.psb.ugent.be/gdb/ostreococcus/); P. trichocarpa (Popxxxxxx, where xxxxxx is the protein identification from the U.S. Department of Energy Joint Genome Institute [DOE JGI]); X. laevis (CDC14A_Xl:NP_001084450, CDC14B_Xl:NP_001084486). Multiple sequence alignment construction and phylogenetic tree inference were performed as detailed in “Materials and Methods”. The tree topology shown is that from NJ, where 1,000 replicates were performed. The CDC14 proteins (red) form a clade (node A: 100% NJ; 98.8% Pars; 78.2% ML) that is distinct from the clade formed by the most closely related plant proteins (blue, node B: 99.1% NJ; 98.4% Pars; 82.7% ML). This separation suggests distinct functions, as discussed in the text. These two groups are related to the exclusion of the set of Arabidopsis DSP proteins (node C: 88.6% NJ; 95% Pars; 40.2% ML). All other nodes in the tree figure show replicate support from NJ only.
Figure 2.
Phylogenetic tree of CDC25-like sequence relationships. A rectangular cladogram was generated by comparing catalytic domains of CDC25-like proteins (red) with the closest relatives in plants and fungi (blue). Proteins included are from the following organisms, with the source of the sequences in parentheses: Arabidopsis (MIPS code without “t”); C. reinhardtii (Crexxxxxx, where xxxxxx is the protein identification from http://plantsp.genomics.purdue.edu/plantsp/data/proteins.Chlre3.fasta); Danio rerio (Drxxxxxxxx, where xxxxxxxx is the gi); humans (CDC25A_Hu:NP_001780, CDC25B_Hu:NP_068659, CDC25C_Hu:NP_001781); L. major (LmACR2: GenBank AAS73185); O. sativa (MIPS code without “s”); O. tauri (MIPS codes given by https://bioinformatics.psb.ugent.be/gdb/ostreococcus/); P. trichocarpa (Popxxxxxx, where xxxxxx is the protein identification from DOE JGI); fern (PvACR2: GenBank ABC26900); S. cerevisiae (ScCDC25, NP_013750; ScACR2, NP_015526); S. pombe (SpCDC25, NP_592947; SpACR2, NP_595247); X. laevis (Xlexxxxxx, where xxxxxxxx is the gi, CDC25A_Xle:NP_001081257). Multiple sequence alignment construction and phylogenetic tree inference were performed as detailed in “Materials and Methods”. The tree topology shown is that from ML, where 10,000 replicates were performed. The known CDC25 proteins (red) form a clade with the sequence from O. tauri (node A: 100% NJ; 97.8% Pars; 75.9% ML), whereas the most closely related plant proteins cluster with the arsenate reductases (blue; see text for details; node B).
Figure 3.
Phylogenetic tree of FCP1-like sequence relationships. A rectangular cladogram was generated by comparison of catalytic domains from FCP/SCP catalytic domain-containing proteins from the following species: Arabidopsis (MIPS code without “t”, with the following exceptions, CPL3_Ath:At2g33540, CPL4_Ath:At5g58003); C. reinhardtii (Crexxxxxx, where xxxxxx is the protein identification from http://plantsp.genomics.purdue.edu/plantsp/data/proteins.Chlre3.fasta); D. rerio (Drxxxxxxxx, where xxxxxxxx is the gi, except Dullard_Dr:NP_001007310); humans (SCP1_Hu:NP_067021, SCP2_Hu:NP_005721, SCP3_Hu:NP_001008393, DULLARD_Hu:NP_056158, Hu6841480:AAF29093, FCP1_Hu:NP_004706, MGC10067Hu:NP_659486 [also known as UBLCP1], TIM50Hu:NP_001001563); O. sativa (MIPS code without “s” with the following exceptions: OsCPL3:Os11g31890, OsCPL4:Os05g32430); O. tauri (MIPS codes given by https://bioinformatics.psb.ugent.be/gdb/ostreococcus/); P. trichocarpa (Popxxxxxx, where xxxxxx is the protein identification from DOE JGI); S. pombe (FCP1_Spombe:NP_594768); X. laevis (FCP1_Xle:NP_001081726); Xenopus tropicalis (Xtxxxxxxxx, where xxxxxxxx is the gi, with the following exceptions from Ensembl: 39992_Xtr:ENSXETP00000039992, 32705_Xtr:ENSXETP00000032705). Multiple sequence alignment construction and phylogenetic tree inference were performed as detailed in “Materials and Methods”. The tree topology shown is that from NJ, where 1,000 replicates were performed. The proteins segregate into 10 subclusters, which are labeled, color coded, and discussed in the text.
The support for each of the labeled nodes is as follows: node A (99.7% NJ; 44.8% Pars; 34.7% ML); node B (96.7% NJ; 48.8% Pars; 70.1% ML); node C (100% NJ; 100% Pars; 98.4% ML); node D (99.4% NJ; 91.8% Pars; 80.6% ML); node E (99.3% NJ; 31.6% Pars; 60.4% ML); node F (99.7% NJ; 31.2% Pars; 79.4% ML); node G (82.0% NJ; 88.8% Pars; 74.7% ML); node H (64.1% NJ; 67.6% Pars; 43.7% ML; sequences Cre187332, Cre149314, and Pop560900 are missing in the Pars tree); node I (99.5% NJ; 60.4% Pars; 68.1% ML); node J (100% NJ; 99.0% Pars; 58.2% ML).
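
The per-node support values in the Figure 3 caption are easier to compare when tabulated. The following is a minimal sketch, not part of the authors' analysis: it copies the NJ, Pars, and ML percentages from the caption and lists the nodes whose support meets a cutoff under all three methods. The 70% cutoff and the names `SUPPORT`, `weakest_support`, and `well_supported` are illustrative assumptions; the paper states no such threshold.

```python
# Replicate support (%) for the labeled nodes of Figure 3,
# transcribed from the caption (NJ, Pars, ML).
SUPPORT = {
    "A": {"NJ": 99.7, "Pars": 44.8, "ML": 34.7},
    "B": {"NJ": 96.7, "Pars": 48.8, "ML": 70.1},
    "C": {"NJ": 100.0, "Pars": 100.0, "ML": 98.4},
    "D": {"NJ": 99.4, "Pars": 91.8, "ML": 80.6},
    "E": {"NJ": 99.3, "Pars": 31.6, "ML": 60.4},
    "F": {"NJ": 99.7, "Pars": 31.2, "ML": 79.4},
    "G": {"NJ": 82.0, "Pars": 88.8, "ML": 74.7},
    "H": {"NJ": 64.1, "Pars": 67.6, "ML": 43.7},
    "I": {"NJ": 99.5, "Pars": 60.4, "ML": 68.1},
    "J": {"NJ": 100.0, "Pars": 99.0, "ML": 58.2},
}


def weakest_support(by_method):
    """Return the lowest support value across the three inference methods."""
    return min(by_method.values())


def well_supported(support, threshold=70.0):
    """List nodes whose support is >= threshold under every method.

    The default threshold is an assumed, illustrative cutoff.
    """
    return sorted(
        node for node, by_method in support.items()
        if weakest_support(by_method) >= threshold
    )


if __name__ == "__main__":
    for node in sorted(SUPPORT):
        print(node, weakest_support(SUPPORT[node]))
    print("Supported by all three methods:", well_supported(SUPPORT))
```

With the assumed 70% cutoff, only nodes C, D, and G pass under all three methods. Several other nodes have NJ support above 99% but much weaker Pars or ML support.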

