PLoS Negl Trop Dis. 2023 Mar 24;17(3):e0011208.
doi: 10.1371/journal.pntd.0011208. eCollection 2023 Mar.

A genome sequence for Biomphalaria pfeifferi, the major vector snail for the human-infecting parasite Schistosoma mansoni

Lijing Bu et al. PLoS Negl Trop Dis. 2023.

Abstract

Background: Biomphalaria pfeifferi is the world's most widely distributed and commonly implicated vector snail species for the causative agent of human intestinal schistosomiasis, Schistosoma mansoni. In efforts to control S. mansoni transmission, chemotherapy alone has proven insufficient. New approaches to snail control offer a way forward, and possible genetic manipulations of snail vectors will require new tools. Towards this end, we here offer a diverse set of genomic resources for the important African schistosome vector, B. pfeifferi.

Methodology/principal findings: Based largely on PacBio High-Fidelity long reads, we report a genome assembly size of 772 Mb for B. pfeifferi (Kenya), smaller than the known genomes of other planorbid schistosome vectors. Of a total of 505 scaffolds (N50 = 3.2 Mb), 430 were assigned to 18 large linkage groups inferred to represent the 18 known chromosomes, based on whole-genome comparisons with Biomphalaria glabrata. The annotated B. pfeifferi genome reveals a divergence time of 3.01 million years from B. glabrata, a South American species believed to be similar to the progenitors of B. pfeifferi, which undertook a trans-Atlantic colonization less than five million years ago.
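The N50 statistic quoted above is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. The following is a minimal sketch of that standard definition, not the authors' pipeline; the function name `n50` and the toy lengths are illustrative assumptions.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    Standard definition (assumed here, not taken from the paper's code):
    sort scaffolds from longest to shortest and return the length at which
    the running total first reaches half of the total assembly size.
    """
    if not lengths:
        raise ValueError("at least one scaffold length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy example with made-up lengths (not B. pfeifferi data).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))
```

In practice the lengths would be read from the assembly FASTA or its index; that step is omitted here to keep the sketch self-contained.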

Conclusions/significance: The genome for this preferentially self-crossing species is less heterozygous than those of related species known to be preferential out-crossers; its smaller genome relative to congeners may similarly reflect its preference for selfing. Expansions of gene families with immune relevance are noted, including the FReD gene family, which is far more similar in its composition to that of B. glabrata than to that of Bulinus truncatus, a vector for Schistosoma haematobium. Provision of this annotated genome will help better understand the dependencies of trematodes on snails, enable broader comparative insights regarding factors contributing to susceptibility/resistance of snails to schistosome infections, and provide an invaluable resource with respect to identifying and manipulating snail genes as potential targets for more specific snail control programs.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. A typical eukaryotic genome annotation workflow, applied to protein-coding gene annotation of the B. pfeifferi genome.
The workflow contains three stages (boxes): repeat-masking (green), gene model prediction (blue) and functional annotation (purple). The intermediate and final results were labeled in boxes with red frames. The analytical processes were marked out in black-colored text and bioinformatics tools marked in orange-colored text.
Fig 2. Nomenclature for B. pfeifferi Fibrinogen-Related Domain-containing proteins (FReDs).
The FReD family is an inclusive, more general group containing the following subgroups: 1) sFReDs, or single FBG FReDs (containing an FBG domain only); 2) FREPs, with 1 or 2 IgSF(s) joined via an interceding region (ICR) to a single FBG domain; and 3) FReMs (consisting of an epidermal growth factor (EGF) domain and an FBG domain). FREPs are therefore a sub-category within the FReD family. The domain structures for the three main classes of known FReDs (green for sFReDs, yellow for FREPs and purple for FReMs) are summarized on the left, edited from Lu et al. (2020). The nomenclature logic is displayed as a decision tree on the right. Naming convention tests (yes, no, or other) were made based on the test questions in pink boxes with gray frames. Final names were assigned to 5 classes: FReM, FREP1-14, FREP15-24 (novel FREPs), sFReD1-14, and sFReDn1-n29 (new sFReDs).
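The top-level split described in this caption (sFReD, FREP, FReM) can be sketched as a small rule set over a protein's domain list. This is only an illustration of the subgroup definitions given above: the function name `classify_fred`, the domain labels used as input, and the "unclassified" fallback are assumptions, and the numbering step (FREP1-14, sFReDn1-n29, and so on) is not reproduced because the caption does not specify its tests.

```python
from collections import Counter


def classify_fred(domains):
    """Assign a FReD subgroup from a list of domain labels.

    Hypothetical input: labels such as "IgSF", "EGF" and "FBG" in any order.
    Other labels (for example a signal peptide) are ignored, which is an
    assumption of this sketch rather than a rule stated in the caption.
    """
    counts = Counter(domains)
    fbg, igsf, egf = counts["FBG"], counts["IgSF"], counts["EGF"]

    # Every subgroup described in the caption carries a single FBG domain.
    if fbg != 1:
        return "unclassified"
    # FReM: an EGF domain plus an FBG domain.
    if egf >= 1 and igsf == 0:
        return "FReM"
    # FREP: one or two IgSF domains joined to a single FBG domain.
    if igsf in (1, 2) and egf == 0:
        return "FREP"
    # sFReD: the FBG domain only.
    if igsf == 0 and egf == 0:
        return "sFReD"
    return "unclassified"


for example in (["FBG"], ["IgSF", "IgSF", "FBG"], ["EGF", "FBG"]):
    print(example, "->", classify_fred(example))
```

Returning "unclassified" for anything outside the three described architectures keeps the sketch from guessing at cases the caption does not cover.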
Fig 3. Merqury copy number spectrum plots show low heterozygosity of the B. pfeifferi genome compared to the B. glabrata BB02 genome.
Histograms were colored by copy numbers: k-mers present in reads but missing in genome assembly were in grey; k-mers present in both raw reads and genome assemblies were colored in red (1x), blue (2x), green (3x), purple (4x) and orange (>4x). A single peak (red) was found in (A) PacBio HiFi reads for B. pfeifferi, whereas both a red and blue peak were obtained in (B), the Illumina paired end reads for B. glabrata.
Fig 4. Functional annotation for protein coding gene models predicted in the B. pfeifferi genome.
Fig 5. Synteny plot and structural variations (SVs) between assemblies of B. pfeifferi contigs as query matched against the 18 linkage groups of B. glabrata iM line.
A) Parallel synteny view between B. pfeifferi (query in orange color) and B. glabrata iM line (reference in blue color). Reference and query genomes were marked in blue and brown colors. Regions with synteny and structural variations detected by SyRI were highlighted in colored lines: Syntenic (grey), Inversion (orange), Translocation (green), and Duplication (blue). B) Length distribution in violin plots for structural variations (> 1 Kb) of B. pfeifferi in comparison to B. glabrata iM line. SV types were highlighted in different colors. CPG: Copy gain in query, CPL: Copy loss in query, DEL: Deletion in query, DUP: Duplicated region, INS: Insertion in query, INV: Inverted region, INVDP: Inverted duplicated region, INVTR: Inverted translocated region, TDM: Tandem repeat, TRANS: Translocated region. Inside violin plots, box plots were placed to indicate mean, quartiles and outliers (black dots). C) Sequential synteny view between B. pfeifferi (query in orange color) and B. glabrata iM line (reference in blue color). Separated scaffolds inside each linkage group were marked out as grey and black boxes. Synteny blocks between the two genomes are laid out in grey blocks in the background. Particularly noteworthy is the inversion indicated on LG 9.
Fig 6. Inversion identified on inferred LG9 between B. pfeifferi and B. glabrata iM line genomes.
A) A 2.3 Mbp inversion region was identified. The scaffolds covering the linkage groups are BP005 and BGM004, respectively. B) Dot plots show that the inversion lies between large flanking synteny blocks. C) ShinySyn plot showing genes with sequence similarity connected by blue ribbons, revealing the reversed gene order within the inverted region. Genes were marked in colored boxes along the x axes, purple boxes for the positive strand and green boxes for the negative strand. Boxes without connections are genes with no sequence similarity identified between the two species.
Fig 7. Time tree computed based on species tree with branch length and a constraint point of Bulinus emerging 20 Mya, based on the fossil record.
The International Nucleotide Sequence Database Collaboration (INSDC) genome IDs on NCBI were listed in the brackets beside each species.
Fig 8. Maximum likelihood tree of FReDs from B. pfeifferi (blue) and B. glabrata BB02 (black).
The ML tree with 1000 bootstrap replicates was constructed with full-length FReD protein sequences from B. pfeifferi (this study) and B. glabrata BB02. Bootstrap values equal to or greater than 75% are represented by black squares on the internal nodes. For nomenclature of B. pfeifferi FReDs see descriptions in this study. Nomenclature of B. glabrata BB02 FReDs was described in previous studies [12,55,74,75,77,78].
Fig 9. The 3D view of BpFREPs 2, 3.1, 3.2 and BgFREP3.2, predicted by AlphaFold2.
Abbreviations: SP, signal peptide; IgSF, immunoglobulin superfamily; ICR, interceding region; FBG, fibrinogen domain. Secondary structures were marked in colored cartoons: alpha helices (3.6 residues per helix turn and 13 atoms ring) in purple, beta sheet in yellow, coils in light grey, beta turn in light blue, and 3(10) helices (3 residues per helix turn and 10 atoms ring) in dark blue.

References

    1. Brown DS. Freshwater snails of Africa and their medical importance. Revised 2nd ed. London: Taylor & Francis; 1994. doi: 10.1201/9781482295184
    2. DeJong RJ, Morgan JAT, Wilson WD, Al-Jaser MH, Appleton CC, Coulibaly G, et al. Phylogeography of Biomphalaria glabrata and B. pfeifferi, important intermediate hosts of Schistosoma mansoni in the New and Old World tropics. Mol Ecol. 2003;12: 3041–3056. doi: 10.1046/j.1365-294X.2003.01977.x
    3. Colley DG, Bustinduy AL, Secor WE, King CH. Human schistosomiasis. Lancet. 2014;383: 2253–2264. doi: 10.1016/S0140-6736(13)61949-2
    4. World Health Organization. Schistosomiasis and soil-transmitted helminthiases: progress report, 2020. Wkly Epidemiol Rec. 2021;96: 593–594. Available from: www.whocc.ita116.unina.it.
    5. King CH, Dickman K, Tisch DJ. Reassessment of the cost of chronic helmintic infection: a meta-analysis of disability-related outcomes in endemic schistosomiasis. Lancet. 2005;365: 1561–1569. doi: 10.1016/S0140-6736(05)66457-4
