Nucleic Acids Res. 2002 Dec 1;30(23):5229-43.
doi: 10.1093/nar/gkf645.

Detection of novel members, structure-function analysis and evolutionary classification of the 2H phosphoesterase superfamily

Raja Mazumder et al. Nucleic Acids Res. 2002.

Abstract

2',3' Cyclic nucleotide phosphodiesterases are enzymes that catalyze at least two distinct steps in the splicing of tRNA introns in eukaryotes. Recently, the biochemistry and structure of these enzymes, from yeast and the plant Arabidopsis thaliana, have been extensively studied. They were found to share a common active site, characterized by two conserved histidines, with the bacterial tRNA-ligating enzyme LigT and the vertebrate myelin-associated 2',3' phosphodiesterases. Using sensitive sequence profile analysis methods, we show that these enzymes define a large superfamily of predicted phosphoesterases with two conserved histidines (hence 2H phosphoesterase superfamily). We identify several new families of 2H phosphoesterases and present a complete evolutionary classification of this superfamily. We also carry out a structure-function analysis of these proteins and present evidence that different families within this superfamily interact in diverse ways with RNA substrates and protein partners. In particular, we show that eukaryotes contain two ancient families of these proteins that might be involved in RNA processing, transcriptional co-activation and post-transcriptional gene silencing. Another eukaryotic family, restricted to vertebrates and insects, is combined with UBA and SH3 domains, suggesting a role in signal transduction. We detect these phosphoesterase modules in polyproteins of certain retroviruses, rotaviruses and coronaviruses, where they could function in capping and processing of viral RNAs. Furthermore, we present evidence for multiple families of 2H phosphoesterases in bacteria, which might be involved in the processing of small molecules with 2',3' cyclic phosphoester linkages. The evolutionary analysis suggests that the 2H domain emerged through a duplication of a simple structural unit containing a single catalytic histidine prior to the last common ancestor of all life forms. Initially, this domain appears to have been involved in RNA processing; it appears to have been recruited to perform various other functions in later stages of evolution.

Figures

Figure 1
(Previous two pages) Multiple alignment of a selected set of 2H domains. Proteins are represented by their gene names, species abbreviations and gi numbers. The 85% consensus shown below the alignment was based on the following amino acid classes: h, hydrophobic residues (L,I,Y,F,M,W,A,C,V) and l, aliphatic (L,I,A,V) residues shaded yellow; o, alcohol (S,T) group containing residues, shaded blue. The secondary structure of the Arabidopsis Appr>p cyclic phosphodiesterase is shown above the alignment, where H denotes residues present in helices and E (extended) in strands. Family specific groupings are shown to the right of the alignment. Species abbreviations are as follows: Aae, Aquifex aeolicus; Af, A.fulgidus; Ana, Anabaena sp. PCC 7120; Ap, A.pernix; ARV, Avian rotavirus; At, Arabidopsis thaliana; Atu, Agrobacterium tumefaciens; Bmel, Brucella melitensis; Bs, B.subtilis; Bst, Bacillus stearothermophilus; BV, Berne virus; Ca, Carassius auratus; Cac, Clostridium acetobutylicum; Ccr, Caulobacter crescentus; Ce, Caenorhabditis elegans; Cgl, C.glutamicum; CIV, Chilo iridescent virus; Ddi, D.discoideum; Dm, Drosophila melanogaster; Drad, Deinococcus radiodurans; Ec, E.coli; Feac, Ferroplasma acidarmanus; FPV, Fowlpox virus; HCoV, Human coronavirus; HRV, Human rotavirus; Hs, Homo sapiens; MHV, Mouse hepatitis virus; Mj, M.jannaschii; Mkan, M.kandleri; Mlo, M.loti; Mm, Mus musculus; Mma, M.mazei Goe1; Mta, M.thermautotrophicus; Mtu, Mycobacterium tuberculosis; Pa, P.abyssi; Ph, Pyrococcus horikoshii; Psa, P.aeruginosa; Rsol, Ralstonia solanacearum; Sa, Staphylococcus aureus; Sc, S.cerevisiae; Scoe, Streptomyces coelicolor; Sme, Sinorhizobium meliloti; Sp, S.pombe; Spn, Streptococcus pneumoniae; SRV, Snakehead retrovirus; Sso, S.solfataricus; Ssp, Synechocystis sp. PCC 6803; T4, Bacteriophage T4; Tac, Thermoplasma acidophilum; Tm, Thermotoga maritima; WDSV, Walleye dermal sarcoma virus; WEHV1, Walleye epidermal hyperplasia virus type 1; WEHV2, Walleye epidermal hyperplasia virus type 2; WssV, Shrimp white spot syndrome virus; ZRV, Zebrafish endogenous retrovirus.
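The class-based 85% consensus described in the Figure 1 caption can be illustrated with a minimal sketch. This is not the authors' software: the function names, the choice to count gaps against the threshold, and the order in which classes are tried (most specific first) are all assumptions made here for illustration. Only the three residue classes and the 85% cutoff come from the caption.

```python
# Residue classes exactly as listed in the Figure 1 caption.
CLASSES = {
    "l": set("LIAV"),       # aliphatic
    "h": set("LIYFMWACV"),  # hydrophobic
    "o": set("ST"),         # alcohol
}
# Assumption: try the narrower classes before the broader hydrophobic class.
CLASS_ORDER = ["l", "o", "h"]


def column_consensus(column, threshold=0.85):
    """Return a consensus symbol for one alignment column, or '.' if none."""
    residues = [r.upper() for r in column]
    n = len(residues)
    if n == 0:
        return "."
    # A single residue that reaches the threshold is reported as itself.
    # Assumption: gaps ('-') stay in the denominator and so count against it.
    for aa in sorted(set(residues)):
        if aa != "-" and residues.count(aa) / n >= threshold:
            return aa
    # Otherwise report the first residue class that reaches the threshold.
    for symbol in CLASS_ORDER:
        members = CLASSES[symbol]
        if sum(r in members for r in residues) / n >= threshold:
            return symbol
    return "."


def consensus(alignment, threshold=0.85):
    """Consensus string for a list of equal-length aligned sequences."""
    return "".join(column_consensus(col, threshold) for col in zip(*alignment))
```

For a toy alignment such as `["HLS", "HIT", "HVS", "HAT"]`, the first column is an invariant histidine, the second is entirely aliphatic and the third is entirely alcohol-group residues, so `consensus` returns `"Hlo"`.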
Figure 2
Evolutionarily conserved structure of the 2H phosphoesterase domain. (A) Structure of the plant CPDase (PDB id: 1FSI) showing the secondary structure elements conserved across the superfamily. The residues involved in catalysis are shown in the ball-and-stick representation. (B) Schematic representation of the secondary structure topology of the 2H phosphoesterase domain. β strands are represented as arrows, while the α helices are rods. Secondary structural element numbering is based on ascending order from the N-terminal end. Side chains comprising the catalytic core are shown in greater detail. Inserts and sequence synapomorphies are shown with the number of residues in inserts given in brackets. Note the two topologically similar and equivalent structural units.
Figure 3
Maximum-likelihood phylogenetic tree, domain architectures and operon organization of 2H proteins. All branches with RELL bootstrap support <50% are collapsed and the values that support a node are shown in the remaining cases. Conserved gene neighborhoods (operons) that are discussed in the text are represented by boxed arrows with the gene names written within. Domain abbreviations are as follows: 2H, 2H phosphoesterase; KH, K homology; UBA, ubiquitin associated; SH3, Src homology 3; PGAM, phosphoglycerate mutase; PNK, P-loop nucleotide kinase; ZK, zinc knuckle; rvp, retroviral aspartyl protease; INT, integrase; Hismacro, phosphoesterase domain found in Macro histone 2 and the Appr-1″-p processing enzyme. The Mj1316 domain is a predicted RNA binding domain typified by the MJ1316 protein of Methanococcus. The species abbreviations are as in Figure 1.
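Collapsing branches with bootstrap support below 50%, as described in the Figure 3 caption, amounts to removing each poorly supported internal node and attaching its children directly to its parent. The sketch below illustrates that operation on a toy tree. It is not the authors' pipeline: the `Node` class, `collapse_low_support` and `to_newick` are hypothetical names introduced here, and branch lengths are ignored.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: Optional[str] = None       # leaf label; None for internal nodes
    support: Optional[float] = None  # bootstrap support (%) of the branch above
    children: List["Node"] = field(default_factory=list)


def collapse_low_support(node, cutoff=50.0):
    """Splice out internal nodes whose support is below the cutoff (in place)."""
    new_children = []
    for child in node.children:
        collapse_low_support(child, cutoff)  # handle the subtree first
        is_internal = bool(child.children)
        if is_internal and child.support is not None and child.support < cutoff:
            # Poorly supported branch: attach its children to the current node.
            new_children.extend(child.children)
        else:
            new_children.append(child)
    node.children = new_children
    return node


def to_newick(node):
    """Serialize the topology, with support values as internal node labels."""
    if not node.children:
        return node.name
    inner = ",".join(to_newick(c) for c in node.children)
    label = "" if node.support is None else str(int(node.support))
    return f"({inner}){label}"
```

Given a root with one clade (A,B) supported at 30% and another (C,D) at 80%, collapsing at the 50% cutoff leaves A and B attached directly to the root, and `to_newick` yields `(A,B,(C,D)80)`.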
Figure 4
Surface view of family-specific conserved residues in different 2H phosphoesterase families. The family-specific conserved areas are shown in blue and the catalytic residues are shown in red. Note the pocket forming the active site. (A) Archaeo-bacterial LigT-like group. (B) CGI-18-like eukaryotic LigT proteins. (C) Top view of the same. (D) CG16790 family. (E) YjcB family.

