Exoribonuclease superfamilies: structural analysis and phylogenetic distribution

Y Zuo¹, M P Deutscher

PMID: 11222749
PMCID: PMC56904
DOI: 10.1093/nar/29.5.1017

Review

Y Zuo et al. Nucleic Acids Res. 2001 Mar 1;29(5):1017-26.

Affiliation

¹ Department of Biochemistry and Molecular Biology, University of Miami School of Medicine, PO Box 016129, Miami, FL 33101-6129, USA.

Abstract

Exoribonucleases play an important role in all aspects of RNA metabolism. Biochemical and genetic analyses in recent years have identified many new RNases and it is now clear that a single cell can contain multiple enzymes of this class. Here, we analyze the structure and phylogenetic distribution of the known exoribonucleases. Based on extensive sequence analysis and on their catalytic properties, all of the exoribonucleases and their homologs have been grouped into six superfamilies and various subfamilies. We identify common motifs that can be used to characterize newly-discovered exoribonucleases, and based on these motifs we correct some previously misassigned proteins. This analysis may serve as a useful first step for developing a nomenclature for this group of enzymes.

Figures

**Figure 1**
Schematic representation of the structure of RNR family proteins. RNR proteins normally contain a variable N-terminal sequence which is conserved within subfamilies, a conserved central region featuring four conserved sequence motifs, and a C-terminal S1 RNA-binding domain. The conserved sequence patterns of the four motifs in the central region are also shown. The syntax of the patterns follows that used in PHI-BLAST searches (http://www.ncbi.nlm.nih.gov/BLAST/pattern.html). Those residues conserved in >80% of the analyzed sequences are presented in bold lettering. Motif IV, which contains a long stretch of highly conserved or invariant residues, was designated as an RNase II signature in Prosite (http://expasy.cbr.nrc.ca/cgi-bin/nicedoc.pl?PDOC00904).
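
To make the pattern syntax mentioned in the Figure 1 legend more concrete, here is a minimal Python sketch that translates a PROSITE-style pattern into a regular expression and scans a sequence with it. It assumes only a subset of the syntax (single residues, bracketed alternatives, `x` wildcards, and `(n)` or `(m,n)` repeat counts, separated by hyphens). The function names, the example pattern, and the example sequence are hypothetical illustrations and are not taken from the paper or its figures.

```python
import re

# One pattern element: a residue, a bracketed set, or the wildcard "x",
# optionally followed by a repeat count such as (3) or (2,4).
_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\])(?:\((\d+)(?:,(\d+))?\))?")


def pattern_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern (subset only) into a Python regex."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        core = "." if core == "x" else core  # "x" means any residue
        if low is None:
            parts.append(core)
        elif high is None:
            parts.append(f"{core}{{{low}}}")
        else:
            parts.append(f"{core}{{{low},{high}}}")
    return "".join(parts)


def find_motif(pattern: str, sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched text) for each non-overlapping hit."""
    regex = re.compile(pattern_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(sequence)]


# Hypothetical pattern and sequence, for illustration only.
print(pattern_to_regex("[LIV]-x(2)-D-x(3,5)-E"))            # [LIV].{2}D.{3,5}E
print(find_motif("[LIV]-x(2)-D-x(3,5)-E", "MKLAADGGGEKK"))  # [(3, 'LAADGGGE')]
```

A regular expression reports only non-overlapping matches, so a dedicated pattern search tool may return hits that this sketch does not.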

**Figure 2**
Sequence conservation patterns in the DEDD family. This abbreviated sequence alignment was generated by ClustalX with some editing based on BLAST searches. The focus is on the DEDD RNases, but some DEDD DNases are also listed for comparison. Sequences are from the NCBI non-redundant protein sequence database, unless otherwise stated. Because of space limitations, emphasis has been placed on presenting a wide variety of sequences rather than on completeness. Included are the eight *E.coli* DEDD family members: EX1_ECOLI, DNA exonuclease I (accession no. G462029); EXOX_ECOLI, DNA exonuclease X (G9789746); DP3E_ECOLI, DNA polymerase III ε subunit (G118805); RNT_ECOLI, RNase T (G266952); ORN_ECOLI, oligoribonuclease (G1730261); RND_ECOLI, RNase D (G133152); DPO1_ECOLI, DNA polymerase I (G118825); DPO2_ECOLI, DNA polymerase II (G118829); six *S.cerevisiae* DEDD RNases: ORN_YEAST, yeast oligoribonuclease (G1730818); REX1_YEAST, yeast RNA exonuclease 1 (G2131716); REX3_YEAST, yeast RNA exonuclease 3 (G6323136); REX4_YEAST, yeast RNA exonuclease 4 (G6324493); PAN2_YEAST, Pan2p subunit of yeast poly(A)-binding protein-dependent poly(A) nuclease (G1709565); RRP6_YEAST, yeast exosome component Rrp6p (G6324574); and proteins from other model organisms: EX1_HAEIN, *H.influenzae* DNA exonuclease I (G1169569); DP3E_AQUAE, *Aquifex aeolicus* DNA polymerase III ε subunit (G6014995); DPO3_BACSU, *B.subtilis* DNA polymerase III α chain (G118793); DP3E_HAEIN, *H.influenzae* DNA polymerase III ε subunit (G1169396); RNT_VIBCH, *Vibrio cholerae* RNase T (G9655469); RNT_BUCSP, *Buchnera sp.* RNase T (G10038870); RNT_XYLFA, *Xylella fastidiosa* RNase T (G9107281); RNT_PSEAE, *Pseudomonas aeruginosa* RNase T (G9949677); RNT_HAEIN, *H.influenzae* RNase T (G1173107); ORN_HAEIN, *H.influenzae* oligoribonuclease (G1176352); ORN_MYCTU, *Mycobacterium tuberculosis* oligoribonuclease (G7227904); ORN_HUMAN, human oligoribonuclease (G7227908); YAA4_SCHPO, *S.pombe* Pan2p homolog (G1175466); YPO4_CAEEL, *C.elegans* Pan2p homolog (G1730942); PAN2_DROME, *Drosophila melanogaster* Pan2p homolog (G7303975); PAN2_HUMAN, human Pan2p homolog (translated from AC023500 with confirmation from ESTs, missing the very C-terminus in G7662258); PARN_HUMAN, human DAN nuclease, a poly(A)-specific 3′ to 5′ exoribonuclease (G4505611); PARN_SCHPO, *S.pombe* DAN nuclease homolog (G7491557); PARN_CAEEL, *C.elegans* DAN nuclease homolog (G7505706); RND_HAEIN, *H.influenzae* RNase D (G1173094); RND_RICPR, *Rickettsia prowazekii* RNase D (G7467941); RND_MYCTU, *M.tuberculosis* Rv2681 protein, an RNase D homolog (G7477317); PMC2_HUMAN, human polymyositis-scleroderma overlap syndrome-related nucleolar 100 kDa protein (PM/Scl autoantigen P100) (G8928564); PMC2_DROME, *D.melanogaster* homolog of PM/Scl autoantigen P100 (G7299933); DPOL_ARCFU, *Archaeoglobus fulgidus* DNA polymerase (G3122019); DPOD_HUMAN, human DNA polymerase δ catalytic chain (G118839); DPOE_HUMAN, human DNA polymerase ε catalytic subunit A (G1352309); EGL_DROME, *D.melanogaster egl* gene product (G7291631); WRN_HUMAN, human Werner syndrome helicase (G6136393); RND_SYNY3, *Synechocystis* hypothetical protein (G1001530); RP422_RICPR, *R.prowazekii* hypothetical protein RP422 (G7467752). The three conserved DEDD motifs are indicated at the top. Highly conserved residues among all family members are highlighted in red. The star at the top marks the fifth highly-conserved acidic residue (highlighted in red) between motifs II and III. The numbers of residues from the N-termini or between conserved blocks are indicated in parentheses. Residues that are highly conserved only within a subfamily are highlighted in blue. Yellow squares highlight some of the characteristic sequence motifs of subfamilies, such as two of the three positively-charged, aromatic motifs specific to RNase T proteins, the sequence motif around the fifth conserved acidic residue in oligoribonucleases, and the characteristic motif III of RNase D.

**Figure 3**
Sequence comparisons of RBN family proteins. This abbreviated sequence alignment was generated by ClustalX with some editing based on BLAST searches. The sequences are from the NCBI non-redundant protein sequence database. The sequences included are: RBN_ECOLI, *E.coli* RNase BN (accession no. G418487); RBN_VIBCH, *V.cholerae* RNase BN (G9657340); RBN_HAEIN, *H.influenzae* RNase BN (G1176326); RBN_PSEAE, *P.aeruginosa* RNase BN (G9946857); RBN_NEIME, *Neisseria meningitidis* RNase BN (G7225749, G7379426); RBN_VITSP, *Vitreoscilla sp.* RNase BN (G3493604); YFKH_BACSU, *B.subtilis* transporter homolog yfkH (G7475937); YFKH_STRCO, *Streptomyces coelicolor* A3(2) yfkH homolog (G6425606); YFKH_ENTFA, *Enterococcus faecalis* yfkH homolog (G3608389); YFKH_PSEAE, *P.aeruginosa* yfkH homolog (G9948830); YFKH_RICPR, *R.prowazekii* hypothetical protein RP496 (G7467777); BRKB_BORPE, *Bordetella pertussis* brkB protein (G2120987); BRKB_XYLFA, *X.fastidiosa* brkB homolog (G9105274); BRKB_DEIRA, *Deinococcus radiodurans* brkB homolog (G7473422); BRKB_SYNY3, *Synechocystis sp.* (strain PCC 6803) brkB homolog (G7469712); YHJD_ECOLI, *E.coli* *yhjD* gene product (G586684); YHJD_MYCTU, *M.tuberculosis* hypothetical protein Rv3335c (G7477578); YHJD_STRCO, *S.coelicolor* A3(2) putative integral membrane protein (G7649576); RBN_HELPY, *H.pylori* hypothetical protein HP1407 (G7463952); RBN_CAMJE, *Campylobacter jejuni* putative RNase BN (G6968645); Rv2707, *M.tuberculosis* (strain H37Rv) hypothetical protein Rv2707 (G7477329); AQ_453, *A.aeolicus* hypothetical protein aq_453 (G7517605); CT132, *Chlamydia trachomatis* hypothetical protein CT132 (G7468636). The secondary structure noted by the letter H for helix is that predicted for *E.coli* RNase BN by TMHMM (http://www.cbs.dtu.dk/services/TMHMM-1.0/). All other superfamily members are predicted to have a similar secondary structure. Secondary structure predicted by PredictProtein (http://cubic.bioc.columbia.edu/predictprotein/) gave similar results, but with more helices at the N-terminus and the putative loop regions. Residues that are highly conserved in RNase BN and its homologs are highlighted in red. Corresponding residues in other subfamilies are also highlighted if they are conserved.

**Figure 4**
Schematic representation of the structure of PDX family proteins. Besides the PDX domain, PNPases contain extra domains at both termini: a highly conserved N-terminal domain of unknown function, and two C-terminal RNA-binding domains, KH and S1. Both KH and S1 RNA-binding domains are highly conserved among PNPases. Other PDX family members (RPH homologs) normally contain only a single highly-conserved PDX domain characterized by three conserved sequence motifs. The sequence patterns of these three motifs are shown including some specific patterns for subfamilies. The syntax of the patterns follows that used in PHI-BLAST searches (http://www.ncbi.nlm.nih.gov/BLAST/pattern.html). Those residues conserved in >80% of the analyzed sequences are shown in bold type.
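
The ">80% of the analyzed sequences" criterion used in the motif figures can be illustrated with a short Python sketch that scans the columns of a multiple alignment and reports those in which a single residue exceeds a frequency threshold. This is only an assumed reading of the criterion: the function name, the toy alignment, and the choice to count gapped sequences in the denominator are hypothetical and not described in the source.

```python
from collections import Counter


def conserved_columns(aligned: list[str], threshold: float = 0.8,
                      gap: str = "-") -> dict[int, str]:
    """Map 1-based column index -> residue for columns where one residue
    occurs in strictly more than `threshold` of all aligned sequences."""
    if not aligned or len({len(row) for row in aligned}) != 1:
        raise ValueError("need a non-empty alignment with equal-length rows")
    total = len(aligned)  # assumption: gaps count against conservation
    conserved = {}
    for col in range(len(aligned[0])):
        counts = Counter(row[col] for row in aligned if row[col] != gap)
        if not counts:
            continue  # column consists only of gaps
        residue, count = counts.most_common(1)[0]
        if count / total > threshold:
            conserved[col + 1] = residue
    return conserved


# Toy alignment (hypothetical sequences, not from the paper).
toy = ["MDAEK", "MDGEK", "MDAEK", "MDAER", "M-AEK"]
print(conserved_columns(toy))  # {1: 'M', 4: 'E'}
```

In the toy alignment, columns 2, 3 and 5 each have one residue in exactly 80% of the sequences, so the strict "greater than" comparison excludes them.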

**Figure 5**
Sequence comparisons of RRP4 family proteins. This abbreviated sequence alignment was generated by ClustalX with some editing based on BLAST searches. The sequences are from the NCBI non-redundant protein sequence database unless noted. Due to space limitations, only sequences from completed genomes and some model organisms are included here: RRP4_YEAST, *S.cerevisiae* exosome component Rrp4p (accession no. G6321860); YA2E_SCHPO, *S.pombe* rrp4p (G1175376); RRP4_DROME, *D.melanogaster* Rrp4p (G7291605); RRP4_HUMAN, human Rrp4p (alternative splicing product from U07561, missing 30 amino acids in G7657528); RRP4_ARATH, *Arabidopsis thaliana* Rrp4p homolog (G3850568); RRP4_PYRHO, *Pyrococcus horikoshii* Rrp4p homolog (G7519189); RRP4_METTH, *Methanobacterium thermoautotrophicum* Rrp4p homolog (G7482237); RRP4_ARCFU, *A.fulgidus* Rrp4p homolog (G7483018); RRP4_AERPE, *Aeropyrum pernix* Rrp4p homolog (G7516506); RRP40_YEAST, *S.cerevisiae* exosome component Rrp40p (G6324430); RRP40_SCHPO, *S.pombe* Rrp40p (G7490257); RRP40_CAEEL, *C.elegans* Rrp40p (G7504749); RRP40_DROME, *D.melanogaster* Rrp40p (translated from AE003585); RRP40_HUMAN, human Rrp40p (G8927588); RRP40_ARATH, *A.thaliana* Rrp40p homolog (alternative splicing product from AC006300); CSL4_YEAST, *S.cerevisiae* exosome component Csl4p (G1730832); CSL4_SCHPO, *S.pombe* Csl4p (G7491862); CSL4_CAEEL, *C.elegans* Csl4p (G7509960); CSL4_DROME, *D.melanogaster* Csl4p (G7297825); CSL4_HUMAN, human Csl4p (G7705612); CSL4_ARATH, *A.thaliana* Csl4p (translated from G2656024, missing C-term in G9758066); CSL4_PYRHO, *P.horikoshii* Csl4p homolog (G7429750); CSL4_ARCFU, *A.fulgidus* Csl4p homolog (G7429751); CSL4_METTH, *M.thermoautotrophicum* Csl4p homolog (G7429752); CSL4_AERPE, *A.pernix* Csl4p homolog (G7515775). The four sequence motifs are indicated at the top. Residues highly conserved among all RRP4 family members are highlighted in red, whereas residues conserved only within subfamilies are highlighted in blue. The numbers shown indicate the residues from the N-terminus or between blocks.

**Figure 6**
Schematic representation of the structure of 5PX family proteins. 5PX family members share two highly-conserved acidic N-terminal domains, N1 and N2, whereas the C-terminus is not conserved between Xrn1 and Rat1 subfamilies. The longer Xrn1 subfamily members have two basic C-terminal domains, C1 and C2. C1 is weakly conserved among Xrn1 proteins, whereas C2, which is proline-rich, is barely conserved.
