BMC Genomics. 2024 Feb 19;25(1):192.
doi: 10.1186/s12864-024-10103-w.

The genome and transcriptome of the snail Biomphalaria sudanica s.l.: immune gene diversification and highly polymorphic genomic regions in an important African vector of Schistosoma mansoni

Tom Pennance et al. BMC Genomics.

Abstract

Background: Control and elimination of schistosomiasis is an arduous task, with current strategies proving inadequate to break transmission. Exploration of genetic approaches to interrupt transmission of Schistosoma mansoni, the causative agent of human intestinal schistosomiasis in sub-Saharan Africa and South America, has led to genomic research on the snail vector hosts of the genus Biomphalaria. Few complete genomic resources exist, and African Biomphalaria species are particularly underrepresented even though most S. mansoni infections occur in Africa. Here we generate and annotate the first genome assembly of Biomphalaria sudanica sensu lato, a species responsible for S. mansoni transmission in lake and marsh habitats of the African Rift Valley. Supported by whole-genome diversity data among five inbred lines, we describe orthologs of immune-relevant gene regions in the South American vector B. glabrata and present a bioinformatic pipeline to identify candidate novel pathogen recognition receptors (PRRs).

Results: De novo genome and transcriptome assembly of inbred B. sudanica originating from the shoreline of Lake Victoria (Kisumu, Kenya) resulted in a haploid genome size of ~944.2 Mb (6,728 fragments, N50 = 1.067 Mb), comprising 23,598 genes (BUSCO = 93.6% complete). The B. sudanica genome contains orthologs of all described immune genes/regions tied to protection against S. mansoni in B. glabrata, including the polymorphic transmembrane clusters (PTC1 and PTC2), RADres, and other loci. The B. sudanica PTC2 candidate immune genomic region contained many PRR-like genes across a much wider genomic region than has been shown in B. glabrata, as well as a large inversion between species. High levels of intra-species nucleotide diversity were seen in PTC2, as well as in regions linked to PTC1 and RADres orthologs. Immune-related and putative PRR gene families were significantly over-represented in the subset of B. sudanica genes determined as hyperdiverse, including high extracellular diversity in transmembrane genes, which could be under pathogen-mediated balancing selection. However, no overall expansion in immunity-related genes was seen in African compared with South American lineages.

Conclusions: The B. sudanica genome and analyses presented here will facilitate future research in vector immune defense mechanisms against pathogens. This genomic/transcriptomic resource provides necessary data for the future development of molecular snail vector control/surveillance tools, facilitating schistosome transmission interruption mechanisms in Africa.

Keywords: Balancing selection; Biomphalaria choanomphala; Biomphalaria sudanica; De novo genome assembly; Gene family evolution; Immunogenetics; Pathogen recognition; Polymorphism; Schistosomiasis; Snail vector.
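
The Results report assembly contiguity as an N50 of 1.067 Mb across 6,728 fragments. As a reminder of what that statistic measures, the following is a minimal sketch of the standard N50 definition: the length of the fragment at which the running total of fragments, sorted from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions, not data or code from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of fragment lengths.

    Standard definition: sort fragments from longest to shortest and
    return the length of the fragment at which the cumulative sum
    first reaches at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one fragment length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy fragment lengths (hypothetical, arbitrary units), not the B. sudanica assembly.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))
```

In a real workflow the lengths would be read from the assembly FASTA or its index; that step is omitted here to keep the sketch self-contained.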

Conflict of interest statement

Not applicable.

Figures

Fig. 1
Species tree generated in Orthofinder using the Species Tree of All Genes (STAG) algorithm [68, 75]. Root is time calibrated to be 20 million years ago based on the appearance of Bulinus in the fossil record [70]. Node support values represent the bipartition proportions in each of the individual species tree estimates. Branch lengths represent the average number of substitutions per site across all the individual trees inferred from each gene family. The number of (significant/total) gene families expanded (blue) and contracted (red) in the ancestral populations of the Biomphalaria species, and outgroups Elysia marginata and Bulinus truncatus as determined in CAFE 5 [69] are shown for each internal and terminal node (<0> to <13>)
Fig. 2
Maximum likelihood tree of C-type lectin-related proteins (CREPs) identified from Biomphalaria sudanica in the current study (see Supplementary Table 14) and four CREPs identified previously from B. glabrata (see Supplementary Table 15) organized within hierarchical orthogroups (HOGs) as determined by Orthofinder [68]. Branch lengths represent the number of substitutions per site. Nodes with bootstrap values > 75 (estimated with 1000 replicates of non-parametric bootstrap) are signified by a red dot on the branch before bipartition. Red stars indicate three CREPs with unusual features, namely that they include weak hits for secondary immunoglobulin domains, which may overlap with C-lectin domains as well as containing a large interdomain region (see Supplementary Table 14)
Fig. 3
Maximum likelihood tree of the 57 fibrinogen-related proteins (FREPs) identified from Biomphalaria sudanica (bold) in the current study (see Supplementary Table 14) amongst reference sequences for FREPs identified previously from B. glabrata (see Supplementary Table 15) organized within hierarchical orthogroups (HOGs) as determined by Orthofinder [68]. Branch lengths represent the number of substitutions per site. Nodes with bootstrap values > 75 (estimated with 1000 replicates of non-parametric bootstrap) are signified by a red dot on the branch before bipartition. Red stars indicate FREPs with unusual features according to our annotation summarized in Supplementary Table 14, including those containing weak hits for additional immunoglobulin (IgSF) domains (e.g. BSUD.16968), IgSF rearrangements (e.g. BSUD.21927) or containing partial fibrinogen domains (e.g. BSUD.19120)
Fig. 4
Mean nucleotide diversity of five Biomphalaria sudanica inbred line genomes in windows of 100 kb (0 kb stagger). Linkage groups (LG1-LG18) for B. sudanica are inferred from B. glabrata [6], showing the hypothesized chromosomal position of contigs in B. sudanica. Notable clusters of highly diverse genomic regions and genes were seen in B. sudanica linkage groups LG6, LG10 and LG16 (highlighted in turquoise and red peaks). Peaks in red represent regions containing candidate immune loci that are orthologous to some of those previously associated with Schistosoma mansoni resistance in B. glabrata (PTC1, PTC2, Catalase, BgTLR, RADres, see Supplementary Table 2)
Fig. 5
A Patterns of nucleotide diversity across genomes of five inbred lines of Biomphalaria sudanica, highlighting two polymorphic transmembrane gene clusters that are orthologous to those previously associated with Schistosoma mansoni resistance in B. glabrata (PTC1 [41] and PTC2 [42]). B Genome-wide nucleotide diversity across overlapping 100-kb genomic windows (starting at 0-kb and 50-kb intervals) with windows on contigs 359 (PTC1) and 208 (PTC2) that occur in the top 1% of genome-wide nucleotide diversity between inbred lines being colored blue and red, respectively. C Genome-wide nucleotide diversity (purple line) and pairwise divergence for each haplotype pair (grey lines) across contigs 359 (PTC1) and 208 (PTC2). PTC1 and PTC2 regions are indicated by the blue and red bars, respectively. Similarly diverse regions span across several megabases of contig 208 in B. sudanica. Even in diverse regions, pairwise divergence can be near zero, indicating shared haplotypes
Fig. 6
Orthology of Biomphalaria sudanica contigs/scaffolds (grey boxes, with different shades of grey representing alternating contigs/scaffolds) to B. glabrata scaffolds (blue boxes, [6]) pertaining to linkage groups (LG) inferred from three B. glabrata linkage groups, highlighted here because they are notably enriched for both diverse regions/genes and orthologous candidate immune genes from B. glabrata (PTC1, cat, BgTLR, sod1, prx4, RADres and PTC2), positions of which are shown. Contigs/scaffolds with candidate genes and/or diverse regions (100 kb windows in top 1%, light red boxes, or top 0.1%, dark red boxes) are labeled. Synteny is relatively high between species, except for a large rearrangement on LG16 (see Supplementary Fig. 3C)
Fig. 7
Allelic phylogenies of Biomphalaria sudanica, B. pfeifferi (Bpfe), and B. glabrata (Bgla), rooted with B. straminea (Bstr). (A) Polymorphic transmembrane cluster 1 (PTC1) gene 2 (i.e. grctm2 [41]) shows a tandem duplication (see A and B duplicates for relevant taxa indicated in bold) in the African species (A = BSUD.8884 and B = BSUD.8885 for B. sudanica) that is clearly independent of similar duplications seen in B. glabrata haplotype 1 (Bgla1) and B. straminea. Bgla1, Bgla2 and Bgla3 refer to PTC1 haplotypes sequenced from B. glabrata [41]. (B) PTC2 gene 4 (BSUD.3979 in B. sudanica) shows distinct haplotypes in each Biomphalaria species. Bgla1, Bgla2 and Bgla3 refer to PTC2 haplotypes sequenced from B. glabrata [42]
Fig. 8
Phylogenetic trees generated using RAxML [82] of exemplar genes showing unusually high diversity in Biomphalaria sudanica, alongside orthologous alleles in B. pfeifferi (Bpfe) and B. glabrata (Bgla), and rooted with B. straminea (Bstr). Bgla1 and Bgla2 refer to alleles sequenced from B. glabrata inbred lines [42], whilst BglaBS90 and BglaM represent alleles from two other B. glabrata inbred lines [6]. (A) BSUD.4885 (contig 2266, linkage group (LG) 10) is a gene with exceptionally high diversity, has a protein structure that is similar to genes previously inferred as pathogen recognition receptors, and shows an apparent trans-species polymorphism of divergent haplotypes in African and South American snails. (B) BSUD.12903 (contig 499, LG16) is another pathogen recognition receptor candidate, in one of the most polymorphic contigs in the B. sudanica genome, predicted to contain C-type lectin, immunoglobulin, TMEM154 and alternatively expressed fibronectin III domains (see Fig. 10)
Fig. 9
Mean nucleotide diversity (purple line) and pairwise divergence for each haplotype pair (grey lines) across portions of four contigs in LG16 (contig 499 300–850 kb, contig 550 0–250 kb, scaffold 3064 2300–3000 kb) and LG5 (contig 676 150–850 kb) showing high diversity and clusters of transmembrane genes. All plotted genes (shown in red, blue and black) encode single-pass transmembrane proteins (TM1); other genes in these regions are not shown. Key functional domains potentially involved in pathogen recognition include C-type lectin (CTL), fibronectin type III (FN3) and immunoglobulin (Ig), and TMEM154, which is a membrane-spanning domain also found in several polymorphic transmembrane cluster 1 (PTC1) genes [41]. Genes shown in red, including BSUD.12903 (Fig. 10), all have at least three of these four functional domains, while genes shown in blue have only Ig and genes shown in black represent other genes encoding TM1 and various other protein domains
Fig. 10
Detailed view of protein BSUD.12903 containing the alternatively expressed exon of 741 aa, which demonstrates that in the extracellular region nonsynonymous diversity greatly exceeds synonymous diversity (shown here in 50 aa sliding windows) in regions where functional domains potentially involved in pathogen recognition are present (C-type lectin, immunoglobulin, and fibronectin III (FN3) domains). Two inbred lines of Biomphalaria sudanica (Bs111 and Bs5-2) have multiple nonsense variants (stop codon or frameshift) in a single exon containing a FN3 domain, which is expressed in the B. pfeifferi ortholog (KAK0057508 [8]), suggesting that FN3 may not be expressed in all B. sudanica lines. The FN3 exon also occurs in the B. glabrata ortholog BGLB024560, where it is also variably either expressed (NCBI Accession XM_056010427) or excluded (NCBI Accession SRX8534561). Both FN3 and/or TMEM154 domains are present in B. sudanica genes BSUD.8884, BSUD.8874 and BSUD.8876 that are orthologous to B. glabrata PTC1 region genes associated with schistosome resistance: grctm2, grctm3 and grctm4 [41]
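
Figs. 4 and 5 summarize nucleotide diversity among the five inbred lines in 100-kb windows, including overlapping windows offset by 50 kb, and highlight windows in the top 1% genome-wide. The sketch below illustrates that kind of calculation on toy data. It is not the authors' pipeline: the function names, the use of pre-aligned haplotype strings, the treatment of `N` as missing data, and the toy window sizes are all assumptions made for illustration.

```python
from itertools import combinations


def pairwise_divergence(seq_a, seq_b, start, end):
    """Proportion of differing sites between two aligned sequences in [start, end).

    Sites where either sequence has 'N' are skipped (assumed missing data).
    Returns None if no sites can be compared.
    """
    compared = [
        (x, y)
        for x, y in zip(seq_a[start:end], seq_b[start:end])
        if x != "N" and y != "N"
    ]
    if not compared:
        return None
    return sum(x != y for x, y in compared) / len(compared)


def windowed_diversity(haplotypes, window, step):
    """Mean pairwise divergence across all haplotype pairs, per window.

    Returns a list of (start, end, diversity) tuples. A step smaller than
    the window gives overlapping windows.
    """
    length = len(haplotypes[0])
    if any(len(h) != length for h in haplotypes):
        raise ValueError("haplotypes must be aligned to the same length")
    results = []
    for start in range(0, length - window + 1, step):
        end = start + window
        values = [
            pairwise_divergence(a, b, start, end)
            for a, b in combinations(haplotypes, 2)
        ]
        values = [v for v in values if v is not None]
        diversity = sum(values) / len(values) if values else None
        results.append((start, end, diversity))
    return results


def top_fraction(windows, fraction):
    """Return the most diverse windows, keeping at least one."""
    scored = [w for w in windows if w[2] is not None]
    keep = max(1, int(len(scored) * fraction))
    return sorted(scored, key=lambda w: w[2], reverse=True)[:keep]


# Toy aligned haplotypes (hypothetical); real windows would be 100 kb with a 50-kb step.
toy_haplotypes = ["AAAAAAAA", "AAAAAATA", "AAAAAATT"]
toy_windows = windowed_diversity(toy_haplotypes, window=4, step=2)
for start, end, diversity in toy_windows:
    print(start, end, diversity)
print(top_fraction(toy_windows, 0.01))
```

Keeping the per-pair values rather than only their mean would also reproduce the grey pairwise-divergence lines of Figs. 5C and 9, where a pair sharing a haplotype shows divergence near zero even inside a diverse region.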

References

    1. Hotez PJ, Daar AS. The CNCDs and the NTDs: Blurring the Lines Dividing Noncommunicable and Communicable Chronic Diseases. PLoS Negl Trop Dis. 2008;2(10):e312. doi: 10.1371/journal.pntd.0000312.
    2. WHO. Schistosomiasis: Key facts. 2023 [cited 2023 Apr 14]. Available from: https://www.who.int/news-room/fact-sheets/detail/schistosomiasis.
    3. Castillo MG, Humphries JE, Mourão MM, Marquez J, Gonzalez A, Montelongo CE. Biomphalaria glabrata immunity: Post-genome advances. Dev Comp Immunol. 2020;104:103557. doi: 10.1016/j.dci.2019.103557.
    4. Mitta G, Gourbal B, Grunau C, Knight M, Bridger JM, Théron A. The Compatibility Between Biomphalaria glabrata Snails and Schistosoma mansoni: An Increasingly Complex Puzzle. Adv Parasitol. 2017;97:111–145. doi: 10.1016/bs.apar.2016.08.006.
    5. Adema CM, Hillier LW, Jones CS, Loker ES, Knight M, Minx P, et al. Whole genome analysis of a schistosomiasis-transmitting freshwater snail. Nat Commun. 2017;8:15451. doi: 10.1038/ncomms15451.