Comparative Study
PLoS Negl Trop Dis. 2013 Dec 5;7(12):e2569.
doi: 10.1371/journal.pntd.0002569. eCollection 2013.

De novo assembly of a field isolate genome reveals novel Plasmodium vivax erythrocyte invasion genes

James Hester et al. PLoS Negl Trop Dis. 2013.

Abstract

Recent sequencing of Plasmodium vivax field isolates and monkey-adapted strains enabled characterization of SNPs throughout the genome. These analyses relied on mapping short reads onto the P. vivax reference genome that was generated using DNA from the monkey-adapted strain Salvador I. Any genomic locus deleted in this strain would be lacking in the reference genome sequence and missed in previous analyses. Here, we report de novo assembly of a P. vivax field isolate genome. Out of 2,857 assembled contigs, we identify 362 contigs, each containing more than 5 kb of contiguous DNA sequences absent from the reference genome sequence. These novel P. vivax DNA sequences account for 3.8 million nucleotides and contain 792 predicted genes. Most of these contigs contain members of multigene families and likely originate from telomeric regions. Interestingly, we identify two contigs containing predicted protein coding genes similar to known Plasmodium red blood cell invasion proteins. One gene encodes a reticulocyte-binding protein orthologous to P. cynomolgi RBP2e and P. knowlesi NBPXb. The second gene harbors all the hallmarks of a Plasmodium erythrocyte-binding protein, including conserved Duffy-binding-like and C-terminus cysteine-rich domains. Phylogenetic analysis shows that this novel gene clusters separately from all known Plasmodium Duffy-binding protein genes. Additional analyses showing that this gene is present in most P. vivax genomes and transcribed in blood-stage parasites suggest that P. vivax red blood cell invasion mechanisms may be more complex than currently understood. The strategy employed here complements previous genomic analyses and takes full advantage of next-generation sequencing data to provide a comprehensive characterization of genetic variations in this important malaria parasite. Further analyses of the novel protein coding genes discovered through de novo assembly have the potential to identify genes that influence key aspects of P. vivax biology, including alternative mechanisms of human erythrocyte invasion.
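
The contig selection criterion in the abstract (more than 5 kb of contiguous sequence absent from the reference) can be illustrated with a minimal Python sketch. The abstract does not describe the authors' actual pipeline, so everything here is an assumption made for illustration: the function names, the representation of reference alignments as half-open intervals in contig coordinates, and the example numbers. Only the 5 kb threshold comes from the text.

```python
from typing import Iterable, Tuple

# Hypothetical representation: a half-open interval [start, end) of the
# contig that aligns to the reference genome, in contig coordinates.
Interval = Tuple[int, int]


def longest_unaligned_stretch(contig_length: int, aligned: Iterable[Interval]) -> int:
    """Return the length of the longest contiguous run of contig bases
    that is not covered by any alignment to the reference."""
    longest = 0
    cursor = 0  # first contig position not yet known to be covered
    for start, end in sorted(aligned):
        # Clip intervals to the contig so out-of-range input cannot inflate gaps.
        start = min(max(start, 0), contig_length)
        end = min(max(end, 0), contig_length)
        if start > cursor:
            longest = max(longest, start - cursor)
        cursor = max(cursor, end)
    # Account for any unaligned tail after the last alignment.
    return max(longest, contig_length - cursor)


def has_novel_sequence(contig_length: int, aligned: Iterable[Interval],
                       min_novel_bp: int = 5000) -> bool:
    """Flag a contig carrying more than min_novel_bp of contiguous
    sequence absent from the reference (5 kb, as stated in the abstract)."""
    return longest_unaligned_stretch(contig_length, aligned) > min_novel_bp


# Made-up example: a 30 kb contig whose two ends align to the reference.
example_alignments = [(0, 10_000), (9_000, 12_000), (20_000, 30_000)]
example_gap = longest_unaligned_stretch(30_000, example_alignments)
```

In this made-up example the two alignments at the start overlap and are merged, leaving an 8 kb unaligned stretch between positions 12,000 and 20,000, so the contig would be flagged. A contig with no alignments at all counts as entirely novel.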


Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Sequence coverage for all de novo assembled contigs.
The figure shows the average sequence coverage (y-axis, in reads per base) of each de novo assembled C127 contig according to the contig length (x-axis, in bp). Six short (<200 bp) contigs displayed coverage greater than 2,000 reads per base and are not shown in this figure.
Figure 2. Analysis of vir genes in the C127 contigs.
(A) Phylogenetic tree showing the relationships between the protein coding sequences of vir genes from the Salvador I reference genome (solid dots) and those predicted from the C127 contigs (branches without dots at the tips). Annotated vir genes (solid dots) are colored according to their subfamilies. Nodes used to assign predicted C127 vir genes into subfamilies are shown by the colored branches derived from them. (B) Proportion of genes assigned to each major vir subfamily for Salvador I (empty bars) and C127 (black bars).
Figure 3. Novel P. vivax Reticulocyte-binding protein gene.
(A) Next-generation sequencing read coverage along the ∼30 kb contig assembled from the C127 sample. The samples displayed are, from top to bottom: C127, C08, M08, M15, Brazil I, Mauritania I, North Korea, Belem, Chesson and Salvador I. The bottom track shows, in grey, predicted protein coding genes and in black, the position of the predicted reticulocyte-binding protein gene (also highlighted by the red box). Note that there is no coverage of the contig in Salvador I. (B) Phylogenetic tree showing the relationships among protein sequences of P. vivax, P. cynomolgi, P. simiovale and P. knowlesi RBP genes. The position of the predicted P. vivax RBP2e gene is highlighted by the red arrow.
Figure 4. Novel P. vivax Erythrocyte-binding protein gene.
(A) Sequence coverage along the 80 kb contig containing a novel predicted P. vivax Erythrocyte-Binding Protein gene. The bottom track shows, in grey, predicted protein coding genes and, in black, the position of the predicted EBP gene (also highlighted by the red box). The upper panels display next-generation sequencing read coverage along the contig sequence. The samples are, from top to bottom: C127, C08, M08, M15, Brazil I, Mauritania I, North Korea, Belem, Chesson and Salvador I. Note that in the Salvador I sample, there are essentially no reads mapping to this contig. (B) Phylogenetic tree showing the relationships among EBP protein sequences from P. vivax, P. cynomolgi, P. simiovale and P. knowlesi. The position of the novel predicted P. vivax EBP gene is highlighted by the red arrow. (C) Comparison of the protein domain annotations for the novel predicted EBP gene (top) and the known P. vivax DBP gene. The red box indicates the signal sequence; the blue box the Duffy-binding-like domain; the yellow box the C-terminus cysteine-rich domain and the green box the transmembrane domain. (D) Amino acid alignment of the Duffy-binding-like domain for different Plasmodium DBP genes and the novel P. vivax and P. cynomolgi EBP genes. The alignment shows amino acid positions 215–509 of PvDBP (the section of the alignment not displayed corresponds to amino acids 303–371). The grey boxes indicate conserved cysteine positions. The red boxes indicate the positions of two additional cysteines present in the novel EBP genes. (E) The novel P. vivax EBP gene is expressed in blood-stage parasites. The gel picture shows the PCR products for the novel EBP gene (left) and known DBP gene (right) amplified from cDNA of the Belem strain. Note that, for both genes, the primers are located on either side of an intron and that the product size is consistent with amplification of cDNA molecules and excludes DNA contamination.

