Infect Genet Evol. 2025 Apr:129:105725.
doi: 10.1016/j.meegid.2025.105725. Epub 2025 Feb 5.

Diamonds in the rif: Alignment-free comparative genomics analysis reveals strain-transcendent Plasmodium falciparum antigens amidst extensive genetic diversity

Jonathan G Lawton et al. Infect Genet Evol. 2025 Apr.

Abstract

The repetitive interspersed family (rif) and subtelomeric variable open reading frames (stevor) are highly diverse multi-gene families in the malaria parasite Plasmodium falciparum. Embedded on the surface of infected erythrocytes, RIFIN and STEVOR proteins are involved in cytoadherence and immune evasion, but the extent of family-wide sequence diversity across strains has yet to be comprehensively investigated in light of improved resolution of the subtelomeric genome sequences. Using a k-mer frequency approach, we analyzed long-read genomic sequence data from 18 geographically diverse P. falciparum genome assemblies, including lab strains and clinical isolates. We hypothesized that k-mer sequence comparison can identify existing RIFIN and STEVOR subgroups, identify novel subgroups, and generate more robust and reliable estimates of family-wide sequence diversity. Full-length RIFIN and STEVOR proteins shared on average 49.5% and 61.1% amino acid k-mer similarity, respectively, which fell to 25.1% and 20% in the hypervariable regions alone. Despite this diversity, we identified 11 RIFINs and five STEVORs that were conserved across strains above expected thresholds. A subset of these strain-transcendent genes was similar and syntenic to genes in related Plasmodium species, suggesting an ancient origin. Additionally, in silico structural predictions from AlphaFold showed that three-dimensional structures of RIFIN receptor-binding regions were more conserved than their sequences suggested. Evolutionarily constrained RIFINs and STEVORs may have critical functions in parasite survival or pathogenesis. This study provides a framework for investigating diversity in highly variable multi-gene families and highlights the potential of strain-transcendent RIFIN and STEVOR proteins as vaccine candidates.

Keywords: Comparative genomics; Malaria; Plasmodium falciparum; RIFIN; STEVOR; Variant surface antigens.

Conflict of interest statement

Declaration of competing interest. Research Support: This research received no external financial or non-financial support. Relationships: There are no additional relationships to disclose. Patents and Intellectual Property: There are no patents to disclose. Other Activities: There are no additional activities to disclose.

Figures

Fig. 1. Curation and characterization of RIFIN and STEVOR sequences.
A text search of the PlasmoDB database yielded preliminary RIFIN or STEVOR protein sequences (A). We verified and subclassified each sequence as RIFIN-A, RIFIN-B, or RIFIN-U (unclassifiable) using STRIDE. We removed sequences annotated as “Unlikely” (A) and those under 250 amino acids long (B), indicated in red, keeping those indicated in green. Multiple sequence alignments of the remaining 2793 RIFIN and 505 STEVOR amino acid sequences were performed using MAFFT and MUSCLE, respectively, and visualized using Jalview. Blue positions are more conserved, white positions indicate mismatches, and gray positions indicate gaps. The top bands demarcate specific protein regions: signal peptide (S), variable region 1 (V1), PEXEL motif (P), semiconserved region (SC), hypervariable region (HV), transmembrane domain (T), and constant region (C). The average number of amino acids comprising each section (standard deviation) is displayed above each alignment. The red triangle indicates the 25-amino-acid indel that differentiates RIFIN-As and RIFIN-Bs. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Fig. 2. The RIFIN and STEVOR protein families harbor extensive sequence diversity.
Each sequence is represented as a row and a column in the RIFIN (A) and STEVOR (B) heatmaps. The similarities of all pairs of sequences were calculated as the fractional common k-mer count (Edgar similarity, k = 3). The average pairwise Edgar similarities for all combinations of full-length protein sequences or sub-sequences were computed (C). Thick black boxes correspond to the data in the panel A and B heatmaps.
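The fractional common k-mer count named in this legend can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the usual form of Edgar's k-mer similarity (shared k-mer occurrences divided by the number of k-mers in the shorter sequence), and the function names `kmer_counts` and `edgar_similarity` are hypothetical.

```python
from collections import Counter


def kmer_counts(seq, k=3):
    """Count every overlapping k-mer in a protein sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def edgar_similarity(seq_a, seq_b, k=3):
    """Fraction of k-mers shared by two sequences.

    Assumed definition: sum over k-mers of min(count in A, count in B),
    divided by the number of k-mers in the shorter sequence.
    """
    shorter = min(len(seq_a), len(seq_b))
    if shorter < k:
        return 0.0  # no k-mers can be formed from the shorter sequence
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # Counter returns 0 for k-mers that are absent from seq_b.
    shared = sum(min(n, counts_b[kmer]) for kmer, n in counts_a.items())
    return shared / (shorter - k + 1)
```

Because the score depends only on k-mer content, it needs no alignment, which is why it suits families where alignment columns are unreliable.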
Fig. 3. Identification of strain-transcendent RIFIN and STEVOR variants.
For each RIFIN and STEVOR protein, we identified the most similar antigens in the other P. falciparum isolates according to their Edgar similarities. We defined “strain-transcendent” RIFIN (A) and STEVOR (B) sequences as those ≥75% conserved in ≥15 isolates. One sequence from each group is graphed. The presence of strain-transcendent RIFIN (C) and STEVOR (D) groups within the 18 P. falciparum isolates is indicated using solid circles. Orange dots indicate sequences that are syntenic across strains. Asterisks indicate sequences with syntenic orthologs in extant Plasmodium relatives. The genomic location of each 3D7 RIFIN and STEVOR is shown in (E), with strain-transcendent members highlighted in yellow. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
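The selection rule in this legend (best match per isolate, then a threshold of ≥75% similarity in ≥15 isolates) might look like the sketch below. It is illustrative only: the function names are hypothetical, the similarity function is passed in by the caller, and whether the query's own isolate counts toward the 15 is an assumption (here it does not).

```python
def best_hits_per_isolate(query, query_isolate, sequences_by_isolate, similarity):
    """Return the highest similarity to the query found in each other isolate."""
    best = {}
    for isolate, seqs in sequences_by_isolate.items():
        # Assumption: the query's own isolate is excluded from the comparison.
        if isolate == query_isolate or not seqs:
            continue
        best[isolate] = max(similarity(query, s) for s in seqs)
    return best


def is_strain_transcendent(query, query_isolate, sequences_by_isolate,
                           similarity, min_similarity=0.75, min_isolates=15):
    """Apply the legend's thresholds: >=75% similarity in >=15 isolates."""
    best = best_hits_per_isolate(query, query_isolate,
                                 sequences_by_isolate, similarity)
    conserved = sum(1 for score in best.values() if score >= min_similarity)
    return conserved >= min_isolates
```

Both thresholds are keyword arguments so that the cut-offs can be varied when exploring how sensitive the set of strain-transcendent sequences is to them.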
Fig. 4. RIFIN and STEVOR sequences form domain-specific subgroups.
Principal component analyses (PCA) were performed on k-mer frequency tables for full-length RIFIN (A) and STEVOR (C) sequences (k = 3), as well as the hypervariable regions (B & D). Each circle is a RIFIN or STEVOR sequence. The distance between circles represents their degree of sequence similarity. The colors indicate the major RIFIN/STEVOR sequence subgroups (A/B/U). Black circles indicate strain-transcendent variants. Shaded ovals indicate proposed RIFIN/STEVOR subgroups. Tables display sequence logos of identified MEME motifs for the RIFIN-A2 (E) and STEVOR-A2 (F) subgroups, as well as the number of sequences containing each motif, the average motif start site, and E-values for statistical significance derived from likelihood-based models.
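A PCA on a k-mer frequency table, as described in this legend, could be sketched like this. The sketch assumes NumPy, normalizes each row to relative frequencies, and uses an SVD of the column-centered table; the legend does not say which normalization or software the authors used, so those choices and the function names are assumptions.

```python
from collections import Counter

import numpy as np


def kmer_frequency_table(seqs, k=3):
    """Rows are sequences, columns are observed k-mers, values are frequencies."""
    counts = [Counter(s[i:i + k] for i in range(len(s) - k + 1)) for s in seqs]
    vocab = sorted(set().union(*counts))
    table = np.array([[c[kmer] for kmer in vocab] for c in counts], dtype=float)
    totals = table.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # avoid dividing by zero for too-short sequences
    return table / totals, vocab


def pca(table, n_components=2):
    """Project rows onto the top principal components via SVD."""
    centered = table - table.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    # Row coordinates are the left singular vectors scaled by singular values.
    return u[:, :n_components] * s[:n_components]
```

In the resulting plot, sequences with similar k-mer composition land close together, which is the sense in which distance between circles reflects sequence similarity.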
Fig. 5. RIFIN hypervariable region structures are more conserved than full proteins.
Using AlphaFold, we predicted three-dimensional protein structures for full-length RIFINs (A) and hypervariable regions alone (B) and compared the pairwise structural similarities with TM-align. We used these distance matrices for principal coordinate analysis to visualize the relationships and clustering patterns of RIFIN structures (D–F). Each circle is a RIFIN structure. The distance between circles represents their degree of structural similarity. Colored circles indicate the major RIFIN sequence subgroups (A/B/U). Black circles represent strain-transcendent variants.
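The principal coordinate analysis step can be illustrated with classical multidimensional scaling on a distance matrix. This is a sketch under stated assumptions: the legend does not say how TM-align scores were turned into distances, so converting a matrix of pairwise TM-scores with 1 minus the symmetrized score is a guess, and both function names are hypothetical.

```python
import numpy as np


def tm_scores_to_distances(tm_scores):
    """Assumed conversion: distance = 1 - mean of the two directional TM-scores."""
    tm = np.asarray(tm_scores, dtype=float)
    distances = 1.0 - (tm + tm.T) / 2.0
    np.fill_diagonal(distances, 0.0)
    return distances


def pcoa(distances, n_components=2):
    """Classical PCoA: eigendecompose the double-centered squared distances."""
    d = np.asarray(distances, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # eigh returns ascending eigenvalues; keep the largest ones.
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Negative eigenvalues (non-Euclidean distances) are clipped to zero.
    return eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
```

Unlike PCA, this works directly from pairwise distances, so it applies to structural comparisons where no per-structure feature table exists.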
