J Mol Evol. 2023 Oct;91(5):669-686.
doi: 10.1007/s00239-023-10128-x. Epub 2023 Aug 22.

Systematic Analysis of Diverse Polynucleotide Kinase Clp1 Family Proteins in Eukaryotes: Three Unique Clp1 Proteins of Trypanosoma brucei

Motofumi Saito et al. J Mol Evol. 2023 Oct.

Abstract

The Clp1 family proteins, consisting of the Clp1 and Nol9/Grc3 groups, have polynucleotide kinase (PNK) activity at the 5' end of RNA strands and are important enzymes in the processing of some precursor RNAs. However, it remains unclear how this enzyme family diversified in the eukaryotes. We performed a large-scale molecular evolutionary analysis of the full-length genomes of 358 eukaryotic species to classify the diverse Clp1 family proteins. The average number of Clp1 family proteins in eukaryotes was 2.3 ± 1.0, and most representative species had both Clp1 and Nol9/Grc3 proteins, suggesting that the Clp1 and Nol9/Grc3 groups were already formed in the eukaryotic ancestor by gene duplication. We also detected an average of 4.1 ± 0.4 Clp1 family proteins in members of the protist phylum Euglenozoa. For example, in Trypanosoma brucei, there are three genes of the Clp1 group and one gene of the Nol9/Grc3 group. In the Clp1 group proteins encoded by these three genes, the C-terminal domains have been replaced by unique characteristic domains, so we designated these proteins Tb-Clp1-t1, Tb-Clp1-t2, and Tb-Clp1-t3. Experimental validation showed that only Tb-Clp1-t2 has PNK activity against RNA strands. As in this example, N-terminal and C-terminal domain replacement also contributed to the diversification of the Clp1 family proteins in other eukaryotic species. Our analysis also revealed that the Clp1 family proteins in humans and plants diversified through isoforms created by alternative splicing.
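
The "mean ± standard deviation" figures quoted above (for example 2.3 ± 1.0 across eukaryotes) summarize per-species protein counts within a group. The following is a minimal sketch of that kind of summary, not the authors' pipeline: the function name, the taxon labels, and the counts are all hypothetical, and the use of the sample standard deviation is an assumption, since the abstract does not say which estimator was used.

```python
from statistics import mean, stdev


def summarize_counts(counts_by_taxon):
    """Return {taxon: (mean, sample SD, n)} for per-species protein counts.

    Assumption: sample standard deviation (n - 1 denominator); the source
    does not state which estimator the authors used.
    """
    summary = {}
    for taxon, counts in counts_by_taxon.items():
        # stdev() needs at least two values; report 0.0 for a single species.
        sd = stdev(counts) if len(counts) > 1 else 0.0
        summary[taxon] = (mean(counts), sd, len(counts))
    return summary


# Hypothetical, made-up counts of Clp1 family proteins per species.
example = {
    "taxon_A": [4, 4, 5, 4],
    "taxon_B": [2, 1, 3, 2, 2],
}

for taxon, (m, sd, n) in summarize_counts(example).items():
    print(f"{taxon}: {m:.1f} ± {sd:.1f} (n = {n})")
```

Reporting n alongside each mean matters here because taxa represented by only a few species can produce deceptively small standard deviations.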

Keywords: Family protein; Gene duplication; Molecular evolution; Polynucleotide kinase Clp1; Protein domain structure; RNA processing.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1
Numbers of Clp1 family proteins differ among eukaryotic taxa. a The distribution of the number of Clp1 family proteins in each taxon is shown. Taxon names are shown above each figure (see also Table 1). n indicates the number of species used in the analysis. b Average numbers (± standard deviations) of Clp1 genes are mapped onto the phylogenetic tree of eukaryotes (modified from Adl et al. 2019). Because the origin of the eukaryotes is unclear on this phylogenetic tree, it is indicated by a dotted line. The black circle indicates the possible time point at which gene duplication is presumed to have occurred
Fig. 2
Phylogenetic relationships among Clp1 family proteins (Clp1 group) and their protein domain structures. A phylogenetic tree constructed from a total of 110 Clp1 group protein sequences is shown (Supplementary Table S2b). The 110 Clp1 group sequences consist of 107 protein sequences from 94 eukaryotes, two protein sequences from two archaea, and one protein sequence from a bacterium. Each full-length amino acid sequence of the Clp1 group proteins was used for the phylogenetic analysis, and midpoint rooting was applied during tree visualization. The LG + F + R8 model was used for the phylogenetic tree. The entire molecular evolutionary phylogenetic tree of the Clp1 family proteins is shown in Supplementary Fig. S1. The scale bar under the tree indicates the number of amino acid substitutions per site. Symbols for protein domain structures are summarized in the box. Protein domains were identified using two methods: (i) searches against the Pfam database and (ii) manual amino acid sequence alignments (all symbols are rectangles). Protein names of the query sequences used to detect the protein domain structures are indicated in red letters (see Supplementary Figs. S3–S6). See also Supplementary Table S4 for details of domain structures
Fig. 3
Phylogenetic relationships among Clp1 family proteins (Nol9/Grc3 group) and their protein domain structures. A phylogenetic tree constructed from a total of 144 Nol9/Grc3 group protein sequences from 133 eukaryotes is shown (Supplementary Table S2b). Each full-length amino acid sequence of the Nol9/Grc3 group proteins was used for the phylogenetic analysis. Protein names of the query sequences used to detect protein domain structures are indicated in red letters (see Supplementary Figs. S7–S12). The location of the root of the phylogenetic tree is shown in Fig. 2. See also the legend to Fig. 2 for other details
Fig. 4
Biochemical characterization of the recombinant Clp1 group proteins of T. brucei. a Purification of recombinant Tb-Clp1-t2 protein and its PNK activity. SDS–PAGE (10–20%) analysis of the purification of recombinant Tb-Clp1-t2 protein using TALON Metal (Cobalt) Affinity Chromatography stained with Coomassie Brilliant Blue (top). Western blot analysis with an anti-His-tag antibody (middle). PNK activity against single-stranded RNA (ssRNA) as the substrate (bottom). Fractions of the protein peak are indicated by red circles. b, c Purification of the recombinant Tb-Clp1-t1 and Tb-Clp1-t3 proteins (western blotting analysis, top) and their PNK activities (bottom). Arrows indicate the positions of each recombinant protein. Recombinant Tb-Clp1-t2 protein was used as the positive control. d–f Comparison of PNK activities of three Clp1 proteins: d bacterial Thermus scotoductus Clp1 (Ts-Clp1), e archaeal Pyrococcus furiosus Clp1 (Pf-Clp1), and f eukaryotic Tb-Clp1-t2 on four substrates (ssRNA, ssDNA, double-stranded RNA [dsRNA], and dsDNA) (Color figure online)
Fig. 5
Summary of diverse Clp1 family proteins in the three domains of life. A schematic illustration of the Clp1 family proteins of representative species in prokaryotes (Bacteria and Archaea) and the three major eukaryotic taxonomic groups (Discobids, Opisthokonta, and Archaeplastida) (Gabaldon 2021). The color of each circle corresponds to the protein category, as indicated in the figure. t1–t3 inside the circle indicates protein types 1–3 of Tb-Clp1, and a–c indicates distinct individual proteins. The number of protein isoforms is described as 1–5. References: *1 (Saito et al. 2019); *2, this study; *3 (Jain and Shuman 2009); *4 (Weitzer and Martinez 2007); *5 (Ramirez et al. 2008); *6 (Heindl and Martinez 2010); *7 (Braglia et al. 2010). See text for details (Color figure online)

References

    1. Abelson J, Trotta CR, Li H. tRNA splicing. J Biol Chem. 1998;273:12685. doi: 10.1074/jbc.273.21.12685.
    2. Adl SM, Bass D, Lane CE, Lukes J, Schoch CL, Smirnov A, Agatha S, Berney C, Brown MW, Burki F, Cardenas P, Cepicka I, Chistyakova L, Del Campo J, Dunthorn M, Edvardsen B, Eglit Y, Guillou L, Hampl V, Heiss AA, Hoppenrath M, James TY, Karnkowska A, Karpov S, Kim E, Kolisko M, Kudryavtsev A, Lahr DJG, Lara E, Le Gall L, Lynn DH, Mann DG, Massana R, Mitchell EAD, Morrow C, Park JS, Pawlowski JW, Powell MJ, Richter DJ, Rueckert S, Shadwick L, Shimano S, Spiegel FW, Torruella G, Youssef N, Zlatogursky V, Zhang Q. Revisions to the classification, nomenclature, and diversity of eukaryotes. J Eukaryot Microbiol. 2019;66:4. doi: 10.1111/jeu.12691.
    3. Apostol BL, Westaway SK, Abelson J, Greer CL. Deletion analysis of a multifunctional yeast tRNA ligase polypeptide. Identification of essential and dispensable functional domains. J Biol Chem. 1991;266:7445. doi: 10.1016/S0021-9258(20)89467-8.
    4. Berthelot C, Brunet F, Chalopin D, Juanchich A, Bernard M, Noel B, Bento P, Da Silva C, Labadie K, Alberti A, Aury JM, Louis A, Dehais P, Bardou P, Montfort J, Klopp C, Cabau C, Gaspin C, Thorgaard GH, Boussaha M, Quillet E, Guyomard R, Galiana D, Bobe J, Volff JN, Genet C, Wincker P, Jaillon O, Roest Crollius H, Guiguen Y. The rainbow trout genome provides novel insights into evolution after whole-genome duplication in vertebrates. Nat Commun. 2014;5:3657. doi: 10.1038/ncomms4657.
    5. Braglia P, Heindl K, Schleiffer A, Martinez J, Proudfoot NJ. Role of the RNA/DNA kinase Grc3 in transcription termination by RNA polymerase I. EMBO Rep. 2010;11:758. doi: 10.1038/embor.2010.130.
