Nucleic Acids Res. 2000 Jun 15;28(12):2342-52.
doi: 10.1093/nar/28.12.2342.

Evolutionary appearance of genes encoding proteins associated with box H/ACA snoRNAs: Cbf5p in Euglena gracilis, an early diverging eukaryote, and candidate Gar1p and Nop10p homologs in archaebacteria

Y Watanabe et al. Nucleic Acids Res. 2000.

Abstract

A reverse transcription-polymerase chain reaction (RT-PCR) approach was used to clone a cDNA encoding the Euglena gracilis homolog of yeast Cbf5p, a protein component of the box H/ACA class of snoRNPs that mediate pseudouridine formation in eukaryotic rRNA. Cbf5p is a putative pseudouridine synthase, and the Euglena homolog is the first full-length Cbf5p sequence to be reported for an early diverging unicellular eukaryote (protist). Phylogenetic analysis of putative pseudouridine synthase sequences confirms that archaebacterial and eukaryotic (including Euglena) Cbf5p proteins are specifically related and are distinct from the TruB/Pus4p clade that is responsible for formation of pseudouridine at position 55 in eubacterial (TruB) and eukaryotic (Pus4p) tRNAs. Using a bioinformatics approach, we also identified archaebacterial genes encoding candidate homologs of yeast Gar1p and Nop10p, two additional proteins known to be associated with eukaryotic box H/ACA snoRNPs. These observations raise the possibility that pseudouridine formation in archaebacterial rRNA may be dependent on analogs of the eukaryotic box H/ACA snoRNPs, whose evolutionary origin may therefore predate the split between Archaea (archaebacteria) and Eucarya (eukaryotes). Database searches further revealed, in archaebacterial and some eukaryotic genomes, two previously unrecognized groups of genes (here designated 'PsuX' and 'PsuY') distantly related to the Cbf5p/TruB gene family.

Figures

Figure 1
Strategy for RT–PCR cloning of E.gracilis Cbf5p cDNA. Primers (see Table 1) are shown as arrowheads at positions corresponding to their target binding sites.
Figure 2
cDNA sequence encoding E.gracilis Cbf5p, displayed with the deduced amino acid sequence. Locations of the trans-spliced leader sequence, TruB motifs I and II, the PUA domain (36,84) and the C-terminal KKE repeat are indicated. Alternative polyadenylation sites determined by 3′ RACE (see Fig. 1) are denoted by asterisks. Because A residues immediately follow one of the deduced polyadenylation sites in the gene sequence, the position of this site was assigned tentatively.
Figure 3
A portion of the Cbf5p sequence alignment showing the region encompassing TruB motifs I and II. Motifs are adapted from Koonin (36), with highly conserved residues boxed. The position of the conserved, functionally important Asp (see text) is marked with an asterisk. Sequences shown above the Euglena sequence are eukaryotic, sequences below are archaebacterial. GenBank accession numbers are: S.cerevisiae, AAA34473; K.lactis, AAC64862; Candida albicans, AAB94297; Emericella nidulans, AAB94296; Sartorya (Aspergillus) fumigata, AAB94298; Schizosaccharomyces pombe, CAB10131; D.melanogaster, AAC97117; C.elegans, CAB07244; Homo sapiens, AAB94299; Rattus norvegicus, P40615; A.fulgidus, AAB90995; M.jannaschii, AAB98132; Pyrococcus abyssi, CAB49444; Pyrococcus horikoshii, O59357; M.thermoautotrophicum, O26140. The A.pernix sequence is a fusion of two peptide sequences (BAA79973 and BAA79974) encoded by separate open reading frames (nucleotide sequence AP000060.1). In the complete alignment (available from the authors on request), non-Cbf5p-homologous regions of these two peptides were excluded.
Figure 4
Maximum likelihood phylogenetic tree of selected Cbf5p/TruB family sequences. Sequences are as listed in Figure 3, with GenBank accession numbers for additional sequences as follows: Thermotoga maritima, AAD35938; Aquifex aeolicus, AAC06885; Bacillus subtilis, CAB13539; E.coli, AAC76200; S.cerevisiae (Pus4p), P48567; S.pombe (Pus4p), CAA20692. Numbers on branches are the quartet puzzling support values from 1000 puzzling steps (64).
Figure 5
(A) Alignment of the N-terminal portion of putative Gar1p protein sequences from archaebacteria and some eukaryotes. GenBank accession numbers for these sequences are: P.abyssi, CAB49230; M.jannaschii, P81312; A.pernix, BAA79764; M.thermoautotrophicum, AAB85384; Encephalitozoon cuniculi, CAA07263; S.cerevisiae, P28007; D.melanogaster, S49193; Arabidopsis thaliana, AAF00626. The sequences of P.horikoshii, A.fulgidus, C.parvum and H.sapiens (the latter two being partial sequences from EST data) are from open reading frames in nucleotide sequences AP000007.1, AE001014, AA532317 and AA308727, respectively. (B) Alignment of putative Nop10p sequences from archaebacteria and some eukaryotes. Accession numbers are: P.abyssi, CAB49761; P.horikoshii, translation of an open reading frame in AP000004; A.fulgidus, O29724; M.thermoautotrophicum, O27362; M.jannaschii, P81303; A.thaliana, AAD25649; Trypanosoma brucei, translation from AA681026 (EST data). The sequences of S.cerevisiae and H.sapiens Nop10p are from (31) whereas the Aeropyrum Nop10p sequence is from an open reading frame that begins with TTG in the nucleotide sequence AP000059. Highly conserved residues are shown in white on a black background whereas gray shading indicates conservative substitutions.
Figure 6
(A) Alignment of the C-terminal portion of the novel PsuX protein sequence family (the full-length alignment is available upon request). The overlining denotes a highly conserved stretch of the PsuX alignment that contains two Asp residues and is reminiscent of a Ψ synthase motif (B). GenBank accession numbers for the PsuX sequences are: P.horikoshii, BAA30059; P.abyssi, CAB49759; M.jannaschii, Q60346; M.thermoautotrophicum, AAB85800; A.fulgidus, AAB90092; A.pernix, BAA79514; C.elegans, CAB60423; D.melanogaster (translation of open reading frame in GenBank accession number AC005334). The portions of the A.pernix and E.gracilis Cbf5p sequences that display similarity to the PsuX sequences are shown at the bottom of the alignment; these Cbf5p sequences correspond to K57 to L132 of accession number BAA79974 (A.pernix) and K127 to L205 (E.gracilis; see Fig. 2). Shading of residues is as described in Figure 5. (B) Alignment of different classes of Ψ synthase motifs (77) plus the overlined PsuX stretch shown in (A). Highly conserved positions are shown, with non-conserved positions indicated by ‘x’. Because the putative PsuX motif contains two Asp residues that might correspond to the catalytic Asp (indicated by the asterisk) in known Ψ synthases, two slightly different alignments are shown.
Figure 7
(A) Organization of the genes encoding archaebacterial homologs of Nop10p (filled rectangles). IF2-α, homolog of eukaryotic translation initiation factor 2-α; L44E and S27E, ribosomal protein genes. (B) Organization of the genes encoding archaebacterial homologs of Gar1p (filled rectangles). TFIIB, homolog of eukaryotic transcription factor IIB; APE0788, unidentified open reading frame; S8E, ribosomal protein gene. (C) Organization of the genes encoding putative archaebacterial PsuX orthologs (filled rectangles). SRP54/Ffh, homolog of eukaryotic signal recognition particle GTPase; L21E, ribosomal protein gene. Direction of transcription is indicated by arrows above the rectangles. Note that the sizes of genes and spacers are not drawn to scale, and the region shown may not represent the entire operon.

References

    1. Eichler,D.C. and Craig,N. (1994) Prog. Nucleic Acid Res. Mol. Biol., 49, 197–239.
    2. Venema,J. and Tollervey,D. (1995) Yeast, 11, 1629–1650.
    3. Morrissey,J.P. and Tollervey,D. (1995) Trends Biochem. Sci., 20, 78–82.
    4. Lafontaine,D.L.J., Bousquet-Antonelli,C., Henry,Y., Caizergues-Ferrer,M. and Tollervey,D. (1998) Genes Dev., 12, 527–537.
    5. Schnare,M.N. and Gray,M.W. (1990) J. Mol. Biol., 215, 73–83.
