Elife. 2021 Apr 28;10:e69040.
doi: 10.7554/eLife.69040.

Common virulence gene expression in adult first-time infected malaria patients and severe cases

J Stephan Wichers et al. Elife. 2021.

Abstract

Sequestration of Plasmodium falciparum (P. falciparum)-infected erythrocytes to host endothelium through the parasite-derived P. falciparum erythrocyte membrane protein 1 (PfEMP1) adhesion proteins is central to the development of malaria pathogenesis. PfEMP1 proteins have diversified and expanded to encompass many sequence variants, conferring each parasite a similar array of human endothelial receptor-binding phenotypes. Here, we analyzed RNA-seq profiles of parasites isolated from 32 P. falciparum-infected adult travellers returning to Germany. Patients were categorized as either malaria naive (n = 15) or pre-exposed (n = 17), and as either severe (n = 8) or non-severe (n = 24) cases. For differential expression analysis, PfEMP1-encoding var gene transcripts were de novo assembled from RNA-seq data and, in parallel, var-expressed sequence tags were analyzed and used to predict the encoded domain composition of the transcripts. Both approaches concordantly showed that severe malaria was associated with PfEMP1 containing the endothelial protein C receptor (EPCR)-binding CIDRα1 domain, whereas CD36-binding PfEMP1 was linked to non-severe malaria outcomes. First-time infected adults were more likely to develop severe symptoms and tended to be infected for a longer period. Thus, parasites with more pathogenic PfEMP1 variants are more common in patients with a naive immune status, and/or adverse inflammatory host responses to first infections favor the growth of EPCR-binding parasites.

Keywords: P. falciparum; PfEMP1; RNA-seq; chromosomes; gene expression; infectious disease; microbiology; transcriptomics; variant surface antigens; virulence.

Conflict of interest statement

JW, GT, TT, RK, BK, JS, Hv, JS, HS, RW, LT, FL, AS, IB, ET, RF, TO, TL, TG, MD, AB: No competing interests declared.

Figures

Figure 1. Subgrouping of patients into first-time infected and pre-exposed individuals based on antibody levels against P. falciparum.
To further characterize the patient cohort, plasma samples (n = 32) were subjected to Luminex analysis with the Plasmodium falciparum (P. falciparum) antigens AMA1, MSP1, and CSP, known to induce a strong antibody response in humans. With the exception of patient #21, unsupervised clustering of the PCA-reduced data clearly discriminates between first-time infected (naive) and pre-exposed patients with higher antibody levels against tested P. falciparum antigens and also assigns plasma samples from patients with an unknown immune status into naive and pre-exposed clusters (A). Classification of patient #21 into the naive subgroup was confirmed using different serological assays assessing antibody levels against P. falciparum on different levels: a merozoite-directed antibody-dependent respiratory burst (mADRB) assay (Kapelski et al., 2014) (B), a parasitophorous vacuolar membrane-enclosed merozoite structure (PEMS)-specific ELISA (C), and a 262-feature protein microarray covering 228 well-known P. falciparum antigens detecting reactivity with individual antigens, and the antibody breadth of IgG (upper panel) and IgM (lower panel) (D). The boxes represent medians with IQR; the whiskers depict minimum and maximum values (range), with outliers located outside the whiskers. Serological assays revealed significant differences between patient groups (Mann-Whitney U test). Reactivity of patient plasma IgG and IgM with individual antigens in the protein microarray is presented as a volcano plot, highlighting the significant hits in red. Box plots represent antibody breadths by summarizing the number of recognized antigens out of 262 features tested. Data from all assays were used for an unsupervised random forest approach (E). The variable importance plot of the random forest model shows the decrease in prediction accuracy if values of a variable are permuted randomly.
The decrease in accuracy was determined for each serological assay, indicating that the mADRB, ELISA, and Luminex assays are most relevant in the prediction of patient clusters (F). Venn chart showing the patient subgroups used for differential expression analysis (G). Patients with a known immune status based on medical reports were marked in all plots with filled circles in blue (naive) and grey (pre-exposed), and samples from patients with an unknown immune status are shown as open circles. Patient #21 is shown as a filled circle in grey with a cross, and patient #26 is represented by an open circle with a cross. ELISA: enzyme-linked immunosorbent assay; IQR: interquartile range; PCA: principal component analysis.
Figure 1—figure supplement 1. Early immune response in mild and severe malaria within the naive patient cluster.
Antibody reactivity against individual antigens within the three subgroups ‘naive with mild symptoms,’ ‘naive with severe symptoms,' and ‘pre-exposed with mild symptoms.' Sera from all volunteers were assessed on protein microarrays and data normalized to control spots containing no antigen (no DNA control spots). Median reactivities of the mild infected malaria-naive, severely infected malaria-naive as well as the mild infected with pre-exposure to malaria are represented as bar charts. IgG data is given for all 262 Plasmodium falciparum (P. falciparum) proteins spotted on the microarray representing 228 unique antigens (A). To estimate differences in immune response in mild and severe malaria within the malaria-naive population, normalized IgG (B) and IgM (C) antibody responses were compared in the two subpopulations. Differentially recognized antigens (p<0.05 and fold change >2) are depicted in red.
Figure 2. Overview of the methodology and differential core gene expression analysis.
Summary diagram of the approaches taken to analyze the differential expression of core and var genes. In principle, all samples were analyzed by sequencing of the RNA using next generation sequencing (NGS) and by sequencing of expressed sequence tags (EST) from the DBLα domain (A). Gene set enrichment analysis (GSEA) of gene ontology (GO) terms and KEGG pathways indicate gene sets deregulated in first-time infected malaria patients. GO terms related to antigenic variation and host cell remodeling are significantly overrepresented in the down-regulated gene set; only the KEGG pathway 03410 ‘base excision repair’ shows a significant up-regulation in malaria-naive patients (B). Log fold changes (logFC) for the 15 Plasmodium falciparum (P. falciparum) genes assigned to the KEGG pathway 03410 ‘base excision repair’ are plotted with the six significant hits marked with * p<0.05 and **p<0.01 (C).
Figure 2—figure supplement 1. Estimated stage proportions for each sample.
Patient samples consist of a combination of different parasite stages. To estimate the proportion of different life cycle stages in each sample, a constrained linear model was fit using data from López-Barragán et al., 2011. The proportions of rings (8 hours post infection (hpi)), early trophozoites (19 hpi), late trophozoites (30 hpi), schizonts (42 hpi), and gametocyte stages shown in the columns of the bar plots must add to one for each sample. Shown are the comparisons between first-time infected (naive; blue) and pre-exposed samples (grey) (A) and severe (red) and non-severe cases (grey) (B). A bias towards the early trophozoite appears in the non-severe malaria sample group, which was confirmed by calculating the age in hours post infection (hpi) for each parasite sample. The boxes represent medians with IQR; the whiskers depict minimum and maximum values (range), with outliers located outside the whiskers (C, D). IQR, interquartile range.
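The caption above describes fitting a constrained linear model in which each sample is a mixture of stage-specific reference profiles and the stage proportions must sum to one. The following is a minimal sketch of that idea, not the authors' implementation: it assumes a least-squares objective, solves it by projected gradient descent onto the probability simplex, and uses a small made-up reference matrix. The function names `project_to_simplex` and `estimate_stage_proportions` and all numbers are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {p : p_i >= 0, sum(p) == 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for i, ui in enumerate(u, start=1):
        cumulative += ui
        t = (cumulative - 1.0) / i
        if ui - t > 0:
            theta = t  # keep the threshold of the last index that stays positive
    return [max(x - theta, 0.0) for x in v]


def estimate_stage_proportions(reference, sample, iterations=2000):
    """Fit sample ~ reference @ p with p >= 0 and sum(p) == 1.

    reference: list of rows (genes), each a list of stage-specific expression values.
    sample:    list of observed expression values, one per gene.
    """
    n_genes = len(reference)
    n_stages = len(reference[0])
    # Conservative step size from the squared Frobenius norm (assumed choice).
    step = 1.0 / (2.0 * sum(x * x for row in reference for x in row))
    p = [1.0 / n_stages] * n_stages
    for _ in range(iterations):
        residual = [
            sum(row[k] * p[k] for k in range(n_stages)) - y
            for row, y in zip(reference, sample)
        ]
        gradient = [
            2.0 * sum(reference[g][k] * residual[g] for g in range(n_genes))
            for k in range(n_stages)
        ]
        p = project_to_simplex([p[k] - step * gradient[k] for k in range(n_stages)])
    return p


# Hypothetical reference profiles: 4 genes (rows) x 3 stages (columns).
reference = [
    [10.0, 1.0, 2.0],
    [1.0, 8.0, 1.0],
    [1.0, 2.0, 9.0],
    [2.0, 1.0, 1.0],
]
# A sample mixed from 50% stage 1, 30% stage 2, and 20% stage 3.
sample = [5.7, 3.1, 2.9, 1.5]
proportions = estimate_stage_proportions(reference, sample)
```

In practice the reference matrix would hold stage-specific expression values such as those from López-Barragán et al., 2011, with one column per stage (rings, early trophozoites, late trophozoites, schizonts, gametocytes).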
Figure 2—figure supplement 2. Summary diagram of the approaches taken to analyze the RNA-seq data.
Diagram created in Lucidchart (http://www.lucidchart.com/).
Figure 2—figure supplement 3. The base excision repair (KEGG:03410) in P. falciparum.
Orthologues present in Plasmodium falciparum (P. falciparum) are indicated by gene IDs, and log fold changes (logFC) are indicated by colour codes (red: up-regulated; blue: down-regulated) (A). Summary of logFC in gene expression in first-time infected relative to pre-exposed patients and p-values for the logFC (B).
Figure 2—figure supplement 4. RNA quality.
The Bioanalyzer automated RNA electrophoresis system was used to characterize the total RNA quality prior to library synthesis. The calculated RNA integrity number (RIN) value is provided, although this measurement is questionable for samples from mixed species. From the four rRNA peaks visible in all samples, the inner peaks represent Plasmodium falciparum (P. falciparum) 18S and 28S rRNA, and the outer peaks are of human origin.
Figure 3. Summary of PfEMP1 transcripts, domains, and homology blocks that were found more or less frequently in malaria-naive and severely ill patients.
A schematic presentation of all var gene groups with their associated binding phenotypes and typical PfEMP1 domain compositions. The N-terminal head structure confers mutually exclusive receptor-binding phenotypes: EPCR (beige: CIDRα1.1/4–8), CD36 (turquoise: CIDRα2–6), CSA (yellow: VAR2CSA), and yet unknown phenotypes (brown: CIDRβ/γ/δ; dark red: CIDRα1.2/3 from VAR1, VAR3). Group A includes the conserved subfamilies VAR1 and VAR3, EPCR-binding variants, and those with unknown binding phenotypes conferred by CIDRβ/γ/δ domains. Group B PfEMP1 can have EPCR-binding capacities, but most variants share a four-domain structure, with group C-type variants capable of CD36 binding. Dual binders can be found within groups A and B, with a DBLβ domain after the first CIDR domain responsible for ICAM-1 (DBLβ1/3/5) or gC1qr binding (DBLβ12) (A). Transcripts, domains, and homology blocks according to Rask et al., 2010 as well as domain predictions from the DBLα-tag approach were found to be significantly differently expressed (p-value<0.05) between patient groups of both comparisons: first-time infected (blue) versus pre-exposed (black) cases and severe (red) versus non-severe (black) cases (B). ATS: acidic terminal sequence; CIDR: cysteine-rich interdomain region; CSA: chondroitin sulphate A; DBL: Duffy binding-like; DC: domain cassette; EPCR: endothelial protein C receptor; gC1qr: receptor for complement component C1q; ICAM-1: intercellular adhesion molecule 1; NTS: N-terminal segment; PAM: pregnancy-associated malaria; TM: transmembrane domain.
Figure 3—figure supplement 1. Differential expression of the var1 variants 3D7 and IT and var2csa between patient groups.
RNA-seq reads from each patient were normalized against the number of mappable reads to the 3D7 genome and aligned to the var1-3D7 and var1-IT variants as well as var2csa. The resulting bigwig files were displayed in Artemis (Carver et al., 2012). Individual samples are coloured according to the following patient groups: first-time infected in blue (A), severe in red (B), and the respective pre-exposed or non-severe samples in grey.
Figure 4. Expression differences between parasites from first-time infected and pre-exposed patients at the level of var gene transcripts, domains, and homology blocks determined by NGS.
RNA-seq reads of each patient sample were matched to de novo assembled var contigs with varying lengths, domains, and homology block compositions. Shown are significantly differently expressed var gene contigs (A, B) as well as PfEMP1 domain subfamilies (C–F) and homology blocks (G, H) from Rask et al., 2010, with an adjusted p-value of <0.05. Data are displayed as heat maps showing expression levels either in log-transformed normalized Salmon read counts (A) or in log transcripts per million (TPM) (C, G) for each individual sample. Box plots show median log-transformed normalized Salmon read counts (B) or TPM (D, F, H) and interquartile range (IQR) for each group of samples. Individual domains from inter-strain conserved tandem arrangements of domains, so-called domain cassettes (DCs), found significantly higher expressed in samples from first-time infected (blue arrow) and pre-exposed patients (grey arrow), are indicated in bold (E). The N-terminal head structure (NTS-DBLα-CIDRα/β/γ/δ) confers a mutually exclusive binding phenotype either to EPCR-, CD36-, CSA-, or an unknown receptor. Expression values of the N-terminal domains were summarized for each patient, and differences in the distribution among patient groups were tested using the Mann-Whitney U test (F). Normalized Salmon read counts for all assembled transcripts and TPM for PfEMP1 domains and homology blocks are available in Supplementary file 8.
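For readers unfamiliar with the TPM unit used in these heat maps and box plots, the sketch below shows the standard transcripts-per-million calculation followed by a log transform. It is an illustration only: the counts and lengths are invented, the function names are hypothetical, and the log2(TPM + 1) form is an assumption, since the caption does not state the base or pseudocount. Quantification tools such as Salmon typically work with effective rather than raw transcript lengths.

```python
import math


def counts_to_tpm(counts, lengths):
    """Convert read counts to transcripts per million (TPM).

    counts:  reads assigned to each transcript or domain.
    lengths: length of each feature in bases (same order as counts).
    """
    rates = [c / l for c, l in zip(counts, lengths)]  # reads per base
    total = sum(rates)
    return [r / total * 1e6 for r in rates]


def log_tpm(tpm_values, pseudocount=1.0):
    """Log-transform TPM values; base 2 and the pseudocount are assumed choices."""
    return [math.log2(v + pseudocount) for v in tpm_values]


# Invented example: three features with different lengths.
tpm = counts_to_tpm([100, 200, 300], [1000, 2000, 1000])
log_values = log_tpm(tpm)
```

Because TPM values within a sample always sum to one million, they are comparable across samples with different sequencing depths, which is what allows the per-group box plots.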
Figure 5. Expression differences between parasites from severe and non-severe cases at the level of var gene transcripts, domains, and homology blocks determined by NGS.
RNA-seq reads of each patient sample were matched to de novo assembled var contigs with varying lengths, domains, and homology block compositions. Shown are significantly differently expressed var gene contigs (A, B) as well as PfEMP1 domain subfamilies (C–F) and homology blocks (G, H) from Rask et al., 2010, with an adjusted p-value of <0.05 in severe (red) and non-severe patient samples (grey). Data are displayed as heat maps showing expression levels either in log-transformed normalized Salmon read counts (A) or in log transcripts per million (TPM) (C, G) for each individual sample. Box plots show median log-transformed normalized Salmon read counts (B) or TPM (D, F, H) and interquartile range (IQR) for each group of samples. Individual domains from inter-strain conserved tandem arrangements of domains, so-called domain cassettes (DCs), found significantly higher expressed in severe (red arrow) and non-severe cases (grey arrow), are indicated in bold (E). The N-terminal head structure (NTS-DBLα-CIDRα/β/γ/δ) confers a mutually exclusive binding phenotype either to EPCR-, CD36-, CSA-, or an unknown receptor. Expression values of the N-terminal domains were summarized for each patient, and differences in the distribution among patient groups were tested using the Mann-Whitney U test (F). Normalized Salmon read counts for all assembled transcripts and TPM for PfEMP1 domains and homology blocks are available in Supplementary file 8.
Figure 6. Verification of RNA-seq results using DBLα-tag sequencing.
Amplified DBLα-tag sequences were blasted against the ~2400 genomes on varDB (Otto, 2019) to obtain subclassification into DBLα0/1/2 and prediction of adjacent head structure N-terminal segment (NTS) and cysteine-rich interdomain region (CIDR) domains and their related binding phenotype. Proportion of each NTS and DBLα subclass as well as CIDR domains grouped according to the binding phenotype (CIDRα1.1/4–8: EPCR-binding, CIDRα2–6: CD36-binding, CIDRβ/γ/δ: unknown binding phenotype/rosetting) was calculated and shown separately on the left, and the number of total reads and individual sequence cluster with n ≥ 10 sequences are shown on the right. Differences in the distribution among first-time infected (blue) and pre-exposed individuals (grey) (A) as well as severe (red) and non-severe cases (grey) (B) were tested using the Mann-Whitney U test. The boxes represent medians with IQR; the whiskers depict minimum and maximum values (range) with outliers located outside the whiskers.
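The group comparisons in this and several other figures rely on the Mann-Whitney U test. As a rough illustration of what that test computes, the sketch below derives the U statistic from midranks and a two-sided p-value from the normal approximation with tie correction. This is an assumption-laden teaching example, not the authors' analysis code: the function name `mann_whitney_u` and the input values are hypothetical, and for small groups an exact test from a statistics package would usually be preferred.

```python
import math


def mann_whitney_u(x, y):
    """Return (U for sample x, two-sided p-value by normal approximation)."""
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        t = j - i
        midrank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if combined[k][1] == 0)
        tie_term += t ** 3 - t
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0  # all values identical: no evidence of a difference
    z = (u - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail probability
    return u, p_value


# Invented proportions of EPCR-binding CIDR domains in two small groups.
u_stat, p_val = mann_whitney_u([1, 2, 3], [4, 5, 6])
u_ties, _ = mann_whitney_u([1, 2, 2], [2, 3])
```

A U of 0 for the first group means every value in it is smaller than every value in the second group, the most extreme separation possible for those group sizes.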
Figure 6—figure supplement 1. Comparison of DBLα-tag sequencing with RNA-seq analysis.
DBLα-tag sequencing and RNA-seq data compared in Bland-Altman plots for all patients summarized (A) and for each individual patient (B), where the mean log expression of each gene is indicated on the x-axis and the log ratio between normalized DBLα-tag (% of reads) and RNA-seq values (% of RPKM from all contigs containing both DBLα-tag primer-binding sites) on the y-axis. The mean (equal to bias) of all ratios (line) and the confidence interval (CI) of 95% (dotted lines) are indicated. Data points with negative values for one of the approaches are displayed in dependence of their mean log expression on top (DBLα-tag sequence clusters not detected by RNA-seq) or bottom (RNA-seq contigs not found within DBLα-tag sequence cluster) of the graph.
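The Bland-Altman quantities described in this caption can be sketched as follows. This is a minimal illustration with invented values, not the authors' code: it assumes log2 as the log base and draws the 95% band as bias ± 1.96 standard deviations of the log ratios, which is the conventional Bland-Altman construction but is not confirmed by the caption. The function name `bland_altman` is hypothetical. Pairs where one method reports no signal are skipped here, mirroring the caption's separate display of such points.

```python
import math
import statistics


def bland_altman(tag_values, rnaseq_values):
    """Compute Bland-Altman coordinates for two paired measurements.

    Returns (means, log_ratios, bias, lower_limit, upper_limit).
    Pairs with a non-positive value in either method are skipped.
    """
    means, log_ratios = [], []
    for tag, rna in zip(tag_values, rnaseq_values):
        if tag <= 0 or rna <= 0:
            continue  # undetected by one approach; plotted separately in the figure
        log_tag, log_rna = math.log2(tag), math.log2(rna)
        means.append((log_tag + log_rna) / 2.0)  # x-axis: mean log expression
        log_ratios.append(log_tag - log_rna)     # y-axis: log ratio of the methods
    bias = statistics.mean(log_ratios)
    spread = 1.96 * statistics.stdev(log_ratios)  # assumed 95% band
    return means, log_ratios, bias, bias - spread, bias + spread


# Invented percentages for four var sequences; the last is missing in RNA-seq.
means, ratios, bias, lower, upper = bland_altman([8, 4, 2, 5], [4, 4, 4, 0])
```

A bias near zero indicates that neither method systematically reports higher values than the other, while a wide band indicates poor agreement for individual sequences.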
Figure 6—figure supplement 2. Quantile regression analysis of Varia outputs.
Quantile regression was applied to look for differences between patient groups at the level of domain main classes (left) and subdomains (right). Shown are median differences with 95% confidence intervals for domains whose values differ from 0. Domains with positive values tend to be more highly expressed in naive (A) and severe patients (B).
Figure 7. Correlation of var gene expression with antibody levels against head structure CIDR domains.
Patient plasma samples (n = 32) were subjected to Luminex analysis with 35 PfEMP1 head structure cysteine-rich interdomain region (CIDR) domains. The panel includes endothelial protein C receptor (EPCR)-binding CIDRα1 domains (n = 19), CD36-binding CIDRα2–6 domains (n = 12), and CIDR domains with unknown binding phenotypes (CIDRγ3: n = 1, CIDRδ1: n = 3) as well as the minimal binding region of VAR2CSA (VAR2). Box plots showing mean fluorescence intensities (MFI) extending from the 25th to the 75th percentile with a line at the median indicate the higher reactivity of the pre-exposed (A) and non-severe cases (B) with all PfEMP1 domains tested. Significant differences were observed for recognition of CIDRα2–6, CIDRδ1, and CIDRγ3; VAR2CSA recognition differed only between severe and non-severe cases (Mann-Whitney U test). Furthermore, the breadth of IgG recognition (%) of CIDR domains for the different patient groups was calculated and shown as a heat map (C).