BMC Genomics. 2025 Jun 3;26(1):558.
doi: 10.1186/s12864-025-11690-y.

Chromosome-level genome assembly and anticoagulant protein annotation of the buffalo leech Hirudinaria bpling (Hirudinea: Hirudinidae)

Muhammad Salabat Khan et al. BMC Genomics. 2025.

Abstract

This study aimed to obtain and analyze a chromosome-level genome assembly of Hirudinaria bpling, a species important for aquatic ecosystem health and medical research. Understanding its genomic information is crucial for advancing its medical applications and elucidating its ecological role. We assembled the genome of H. bpling using a combination of PacBio HiFi long reads, Illumina sequencing, and Hi-C chromosome conformation capture. This approach yielded a high-resolution genome assembly with detailed chromosomal organization. The final genome assembly of H. bpling is 144.08 Mb, with an N50 size of 11.27 Mb, anchored onto thirteen pseudo-chromosomes. BUSCO analysis indicated a genome completeness of 96.20%. We annotated a total of 20,126 protein-coding genes and identified 18.80% repetitive elements within the genome. Phylogenetic analysis included nine other leech species and positioned H. bpling as a sister taxon to Hirudinaria manillensis. A comparative analysis focused on putative anticoagulant proteins (e.g., Hirudin, Antistasin, Hirustasin, Therostasin, Bdellastasin, Guamerin/Piguamerin, Gelin, Bplins, Saratin, Eglin C, Bdellin B-3, LDTI, Hyaluronidase, Destabilase, Apyrase, Leech carboxypeptidase inhibitor, Gamma-glutamyl transpeptidase, Lefaxin, Progranulin) and identified conserved regions and evolutionary relationships among these proteins across different leech species. As a medically significant species, H. bpling offers promising opportunities for research into anticoagulant therapies. This study provides a comprehensive genomic and phylogenetic analysis of H. bpling, offering new insights into leech genomics and the evolution of anticoagulant genes. The findings enhance our understanding of the genetic and evolutionary mechanisms underlying anticoagulant production in leeches.

Keywords: Hirudinaria bpling; Anticoagulant proteins; Genomic sequencing; Phylogenetic analysis.
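
For readers unfamiliar with the N50 statistic quoted in the abstract: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The sketch below is a generic illustration of that definition, not the authors' pipeline; the function name `n50` and the toy lengths in the usage note are assumptions made for the example.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Generic definition: sort lengths from longest to shortest and
    accumulate them until at least half of the total length is covered;
    the length that crosses that threshold is the N50.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length
```

With the toy lengths `[2, 3, 4, 5, 6]` (total 20), the two longest sequences (6 and 5) already cover 11 of 20 units, so `n50` returns 5.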

Conflict of interest statement

Declarations. Competing interests: The authors declare no competing interests.

Figures

Fig. 1
Circos plot visualizing genomic features across the chromosomes of H. bpling. The outermost layer (a) represents the chromosomes, labelled Chr1 to Chr13, each uniquely colored for differentiation. Moving inward, layer (b) shows the GC skew across these chromosomes, highlighting regions of nucleotide imbalance, which may indicate replication origins and termini. Layer (c) shows the distribution of Long Interspersed Nuclear Elements (LINEs), a type of transposable element, with bars indicating their density along the chromosomes. Layer (d) shows Long Terminal Repeats (LTRs), another category of transposable elements, with peaks denoting their abundance. The innermost layer (e) marks the locations of small nuclear RNA (snRNA) genes, which are crucial for pre-mRNA splicing. The central image provides context by linking the genomic data to the organism under study
Fig. 2
Schematic representation of the gene composition of Hirudinaria bpling (H. bpling) compared to Hirudinaria manillensis (H. manillensis), Whitmania pigra (W. pigra), Helobdella robusta (H. robusta), Lamellibrachia satsuma (L. satsuma), Metaphire vulgaris (M. vulgaris), Aedes albopictus (A. albopictus), Haemaphysalis longicornis (H. longicornis), Necator americanus (N. americanus), and Toxocara canis (T. canis). The x-axis lists the species and the y-axis shows the total number of genes. Colors indicate gene categories: single-copy (red), multi-copy (blue), species-specific orthogroups (yellow), unassigned genes (green), and other genes (purple)
Fig. 3
Schematic representation of a segment on genome scaffold 4 of H. bpling comprising three putative hirudin/HLF genes (HV1–V3) and five tandem hirudin genes (TH1 and TH2a-d). Red arrows indicate the position, size and orientation of the respective genes. The upper boxes illustrate the structures of two genes (left: putative hirudin Hbpl_HV1; right: tandem hirudin Hbpl_TH2d). Exons (E) are labelled in green, and introns (I) are labelled in red
Fig. 4
Schematic representation of the gene composition of Hirudinaria bpling (H. bpling) compared to Hirudinaria manillensis (H. manillensis), Whitmania pigra (W. pigra), Helobdella robusta (H. robusta), Lamellibrachia satsuma (L. satsuma), Metaphire vulgaris (M. vulgaris), Aedes albopictus (A. albopictus), Haemaphysalis longicornis (H. longicornis), Necator americanus (N. americanus), and Toxocara canis (T. canis). The x-axis lists the species and the y-axis shows the total number of genes. Colors indicate gene categories: single-copy (red), multi-copy (blue), species-specific orthogroups (yellow), unassigned genes (green), and other genes (purple)
Fig. 5
Schematic representation of a segment on genome scaffold 4 of H. bpling comprising three putative hirudin/HLF genes (HV1–V3) and five tandem hirudin genes (TH1 and TH2a-d). Red arrows indicate the position, size and orientation of the respective genes. The upper boxes illustrate the structures of two genes (left: putative hirudin Hbpl_HV1; right: tandem hirudin Hbpl_TH2d). Exons (E) are labelled in green, and introns (I) are labelled in red
Fig. 6
Multiple amino acid sequence alignment of putative hirudins/hirudin-like factors/tandem-hirudins of H. bpling and archetype hirudins of H. medicinalis (Hmed_HV1), H. manillensis (Hman_HM1), H. nipponia (Hnip), M. decora (Mdec), H. orientalis (Hori_HV3), W. pigra (Wpig_HV4a) and archetype tandem-hirudins of P. manillensis (PmanTH) and H. manillensis (HmanTH). Black background indicates fully conserved residues; gray background indicates partially conserved residues. The signal peptide sequence is underlined and the conserved aromatic amino acid residue at position 3 of the mature protein is marked with an arrow. The six conserved cysteine residues giving rise to the three-dimensional structure are marked in bold and red and the central globular domain is highlighted in green or cyan. Abbreviations are used according to the IUPAC code
Fig. 7
Multiple amino acid sequence alignment of putative antistasins of H. bpling, the archetype antistasin of H. officinalis and a putative antistasin of H. medicinalis (GenBank Acc. No. WPM04214). Black background indicates fully conserved residues; gray background indicates partially conserved residues. The signal peptide sequence region is underlined, the 10 conserved cysteine residues of the antistasin domain are marked in bold and red and the antistasin domain is highlighted in green or cyan. Abbreviations are used according to the IUPAC code
Fig. 8
Multiple amino acid sequence alignment of putative gelins of H. bpling, H. medicinalis, H. verbana and the archetype gelin of H. manillensis. Black background indicates fully conserved residues; gray background indicates partially conserved residues. The signal peptide sequence region is underlined, the 10 conserved cysteine residues of the antistasin domain are marked in bold and red and the antistasin domain is highlighted in green. Abbreviations are used according to the IUPAC code
Fig. 9
Multiple amino acid sequence alignment of putative saratins of H. bpling (Hbpl_Sar1-3 and Hbpl_TSar), the archetype saratin of H. medicinalis (Hmed_Sar) and a putative saratin of H. nipponia (Hnip_Sar). Black background indicates fully conserved residues; gray background indicates partially conserved residues. The signal peptide sequence region is underlined, the six conserved cysteine residues of the saratin domain are marked in bold and red and the saratin domain is highlighted in green or cyan. Abbreviations are used according to the IUPAC code
Fig. 10
Multiple amino acid sequence alignment of putative LDTI variants of H. bpling (Hbpl_LDTI and Hbpl_mLDTI1-3), the LDTI variants of H. medicinalis (Hmed_LDTI and Hmed_mLDTI1-2) and a putative LDTI of H. manillensis (Hman_LDTI). Black background indicates fully conserved residues; gray background indicates partially conserved residues. The signal peptide sequence region is underlined, the six conserved cysteine residues of the bdellin domain are marked in bold and red and the bdellin domain is highlighted in green or cyan. The part of Hmed_LDTI that represents the sequence of the archetype LDTI (GenBank Acc. No. AAB33769.1) is boxed in red. Abbreviations are used according to the IUPAC code
Fig. 11
Schematic drawing of the progranulin gene structure of H. bpling. Green colour and N indicate an N-exon; yellow colour and C indicate a C-exon; olive colour indicates the signal peptide sequence; cyan colour indicates the 4-cysteine-element; the numbers 1–14 indicate the putative granulin modules
Fig. 12
Phylogenetic trees based on gene (a), cDNA (b) and amino acid (c) sequences of hirudins, HLFs and tandem hirudins of H. bpling (Hbpl), H. medicinalis (Hmed), H. verbana (Hver), H. orientalis (Hori), H. nipponia (Hnip), W. pigra (Wpig), H. manillensis (Hman), P. manillensis (Pman) and M. decora (Mdec), respectively. Colors indicate the different clades and their distribution within and among the trees. The bars indicate the distances
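
The GC skew track in Fig. 1 (layer b) reflects the imbalance between G and C on one strand, commonly computed per window as (G − C) / (G + C). The following is a minimal sketch of that standard formula using non-overlapping windows; the function name `gc_skew`, the window and step sizes, and the toy sequence in the usage note are assumptions for illustration and are not taken from the paper.

```python
def gc_skew(seq, window=1000, step=1000):
    """Return a list of (window_start, skew) pairs for a DNA sequence.

    Skew is (G - C) / (G + C) within each full window; a window that
    contains no G or C is reported as 0.0 to avoid division by zero.
    Trailing bases that do not fill a complete window are ignored.
    """
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        skew = (g - c) / (g + c) if (g + c) else 0.0
        skews.append((start, skew))
    return skews
```

For the toy sequence `"GGGGCCCC"` with `window=4, step=4`, the first window is all G (skew 1.0) and the second is all C (skew -1.0); a sign change of this kind along a chromosome is what the caption refers to as a possible replication origin or terminus.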
