Microb Genom. 2021 Mar;7(3):mgen000541.
doi: 10.1099/mgen.0.000541. Epub 2021 Mar 3.

Identification and phylogenetic analysis of RNA binding domain abundant in apicomplexans or RAP proteins

Thomas Hollin et al. Microb Genom. 2021 Mar.

Abstract

The RNA binding domain abundant in apicomplexans (RAP) is a protein domain identified in a diverse group of proteins, called RAP proteins, many of which have been shown to be involved in RNA binding. To understand the expansion and potential function of the RAP proteins, we conducted a hidden Markov model-based screen among the proteomes of 54 eukaryotes, 17 bacteria and 12 archaea. We demonstrated that the domain is present in closely and distantly related organisms, with particular expansions in Alveolata and Chlorophyta, and is not unique to Apicomplexa as previously believed. All RAP proteins identified can be decomposed into two parts. In the N-terminal region, variable helical repeats seem to participate in the specific targeting of diverse RNAs, while the RAP domain is mostly identified in the C-terminal region and is highly conserved across the different phylogenetic groups studied. Several conserved residues defining the signature motif could be crucial to ensure the function(s) of the RAP proteins. Modelling of RAP domains in apicomplexan parasites confirmed an α/β structure of a restriction endonuclease-like fold. The phylogenetic trees generated from multiple alignments of RAP domains and full-length proteins from various distantly related eukaryotes indicated a complex evolutionary history of this family. We further discuss these results to assess the potential function of this protein family in apicomplexan parasites.

Keywords: RAP domain; RNA-binding protein; phylogenetic tree; protein structure.

Conflict of interest statement

The authors declare that there are no conflicts of interest.

Figures

Fig. 1.
Frequency of RAP proteins in different eukaryotic organisms. (a) The total number of RAP proteins discovered by HMM for each species. (b) Percentage of RAP proteins identified and normalized by the size of the proteome. Some species are excluded to simplify the visualization.
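The normalization described for panel (b) can be sketched as follows. This is a minimal illustration that assumes the percentage is the RAP protein count divided by the total number of proteins in the proteome; the function name, species labels and counts are hypothetical placeholders, not values from the study.

```python
def rap_percentage(rap_count: int, proteome_size: int) -> float:
    """Return RAP proteins as a percentage of the proteome (assumed definition)."""
    if proteome_size <= 0:
        raise ValueError("proteome_size must be positive")
    return 100.0 * rap_count / proteome_size


# Hypothetical placeholder data: species label -> (RAP count, proteome size).
# These numbers are invented for illustration and are NOT from the paper.
example_counts = {
    "species_A": (20, 5000),
    "species_B": (10, 20000),
}

# Normalize each raw count by the size of the corresponding proteome.
percentages = {
    name: rap_percentage(count, size)
    for name, (count, size) in example_counts.items()
}

# Rank species by normalized frequency, highest first.
ranked = sorted(percentages, key=percentages.get, reverse=True)

for name in ranked:
    print(f"{name}: {percentages[name]:.3f} % of proteome")
```

Normalizing in this way lets a small proteome with a moderate number of RAP proteins be compared with a large proteome that has a similar raw count.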
Fig. 2.
Phylogenetic analysis of the RAP domain in Eukaryota. The maximum-likelihood tree is built from an alignment of RAP domains extracted from 267 proteins corresponding to 19 different species. Alveolates are represented by Aconoidasida (Plasmodium falciparum, Plasmodium vivax, Plasmodium berghei and Theileria annulata) in red, Conoidasida (Toxoplasma gondii and E. tenella) in brown, Chromerida (V. brassicaformis and Chromera velia) in blue, and Perkinsus marinus and Symbiodinium microadriaticum in purple. Viridiplantae are represented by Chlorophyta (G. pectorale, Monoraphidium neglectum and Ostreococcus tauri) in green, and Streptophyta (A. thaliana, Oryza sativa and Citrus sinensis) in pale green. Opisthokonta (H. sapiens, Danio rerio and Drosophila melanogaster) are indicated in grey. The green arc indicates the clade enriched in RAP proteins of Chlorophyta species. Bootstrap values (>50 %) are shown on respective branches. The scale indicates the number of substitutions per site.
Fig. 3.
Phylogenetic analysis of full-length RAP proteins in Apicomplexa and Chromerida. The maximum-likelihood tree is built from 175 protein sequences corresponding to eight different species. Aconoidasida (Plasmodium falciparum, Plasmodium vivax, Plasmodium berghei and Theileria annulata) are represented in red, Conoidasida (Toxoplasma gondii and E. tenella) in brown, and Chromerida (V. brassicaformis and Chromera velia) in blue. The red arc shows the clade formed by the Aconoidasida. The turquoise arc indicates the RAP protein conserved in different species. Bootstrap values (>50 %) are shown on respective branches. The scale indicates the number of substitutions per site.
Fig. 4.
Conserved motifs identified in RAP proteins by MEME Suite. The sequences of RAP proteins from three groups, Apicomplexa–Chromerida, Chlorophyta and Metazoa, were analysed by MEME Suite. An additional group was formed by pooling 20 random sequences from each of the previous groups. The motifs shown above are located in the RAP domain only and are aligned with one another. The complete data are depicted in Fig. S5. The arrows point to the two residues of the PD-(D/E)XK endonuclease superfamily.
Fig. 5.
Structure predictions for RAP proteins from Plasmodium falciparum. (a) Left: the modelling template covering both HPRs and RAP domain from Plasmodium falciparum proteins, based on experimental structure of F-ATP synthase from Polytomella sp. Pringsheim 198.80 (PDB code: 6rd6 chain 2, residues 8–445). Right: the superposition of the two available modelling templates covering these domains (6rd6 chain 2 and 6z1p chain AS). (b) Structural model of the RAP domain of PF3D7_1024600 (residues 327–431), based on 6rd6 chain 2. (c) Top: the structural superposition of the two closest modelling templates for the RAP domain from PF3D7_1024600 6rd6 chain 2 residues 322–445 (blue) and 6z1p chain AS residues 591–685 (red). Bottom: the same superposition is shown in rainbow N-to-C terminus colouring. (d) Structural superposition of different modelling templates used for the RAP domain: 6rd6 chain 2 residues 322–445 (blue), 6z1p chain AS residues 591–685 (red), 3r3p chain A (orange) and 1vsr chain A (green). (e) Alignment of RAP domains from Plasmodium falciparum and F-ATP synthase (6rd6_2). The sequences and secondary structure from the two closest modelling templates are shown in the top and bottom rows of the alignment. The most conserved parts of the first and the second sequence motifs identified in Apicomplexan RAP domains (see Fig. 4, top row) are underlined with pink and magenta dashed lines, respectively. The red asterisks indicate the three residues of the PD-(D/E)XK superfamily. Positions of removed repeats are marked by dark vertical bars.
