Sci Data. 2019 Dec 3;6(1):302.
doi: 10.1038/s41597-019-0311-3.

Hybrid de novo whole-genome assembly and annotation of the model tapeworm Hymenolepis diminuta

Robert M Nowak et al. Sci Data. 2019.

Erratum in

Abstract

Despite the use of Hymenolepis diminuta as a model organism in experimental parasitology, a full genome description has not yet been published. Here we present a hybrid de novo genome assembly based on complementary sequencing technologies and methods. The combination of Illumina paired-end, Illumina mate-pair and Oxford Nanopore Technology reads greatly improved the assembly of the H. diminuta genome. Our results indicate that the hybrid sequencing approach is the method of choice for obtaining high-quality data. The final genome assembly is 177 Mbp, with a contig N50 size of 75 kbp and a scaffold N50 size of 2.3 Mbp. We obtained one of the most complete cestode genome assemblies and annotated 15,169 potential protein-coding genes. The obtained data may help explain cestode gene function and clarify the evolution of cestode gene families, and thus the adaptive features that evolved during millennia of co-evolution with their hosts.
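For readers unfamiliar with the N50 statistic quoted for contigs and scaffolds, the following is a minimal sketch of the common definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions, not taken from the paper, and assembly tools may differ in edge-case conventions.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (hypothetical helper)."""
    if not lengths:
        raise ValueError("at least one sequence length is required")
    # Walk the sequences from longest to shortest, accumulating length.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of the
        # total assembly length is the N50.
        if running >= half_total:
            return length


# Toy example with made-up contig lengths (not data from the study).
toy_contigs = [100, 80, 60, 40, 20]
print(n50(toy_contigs))
```

In the toy example the total length is 300, so the running sum first reaches 150 at the second-longest contig.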

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Insert size histograms of the MP datasets after NxTrim trimming. The left and right graphs show the histograms for the MP1 and MP2 datasets, respectively.
Fig. 2
Read length histograms of the raw ONT datasets. The left and right graphs show the histograms for the ONT1 and ONT2 datasets, respectively.
Fig. 3
Quality diagrams of the raw ONT datasets. The left and right graphs show the diagrams for the ONT1 and ONT2 datasets, respectively.
Fig. 4
Results obtained with the GenomeScope application. Abbreviations on the diagram: len – inferred total genome length; uniq – percent of the genome that is unique (not repetitive); het – overall rate of heterozygosity; kcov – mean k-mer coverage for heterozygous bases; err – error rate of the reads; dup – average rate of read duplications; k – k-mer size; observed – the observed k-mer profile; full model – estimated GenomeScope model; unique sequence – line representing unique sequences (k-mers below the line are treated as unique); errors – line representing sequencing errors (k-mers below the line are treated as incorrect); k-mer peaks – increased number of k-mers compared to the number of k-mers with lower and higher coverage.
Fig. 5
Results obtained with the Circoletto application. The diagram compares the HMN_01_pilon sequence (the subsequence from 18 Mbp to 24 Mbp) from the Hymenolepis microstoma genome (from WormBase ParaSite) to two scaffolds from the present study: scaffold26 and scaffold28. Colors indicate identity level: blue ≤ 0.25, green ≤ 0.50, orange ≤ 0.75, red > 0.75.
Fig. 6
Organization of the mitochondrial genome of Hymenolepis diminuta (WMS-il1 strain). All genes are transcribed in the same direction. The two leucine tRNA genes are designated tRNA-Leu(CUN) and tRNA-Leu(UUR), and the two serine tRNA genes tRNA-Ser(UCN) and tRNA-Ser(AGN), respectively. Gene scaling is only approximate.
Fig. 7
Results of bidirectional BLAST of the predicted protein-coding genes (proteins) against four reference proteomes. (a) The distribution of the de novo assembled protein-coding sequences across four closely related cestode species. (b) Venn diagram of the 15,169 predicted proteins. The four included cestode species shared a core set of 5,416 proteins; a total of 8,543 proteins were included with reference to the H. diminuta proteome, and 1,152 were unique to this tapeworm across all analyzed species.
Fig. 8
Schematic diagram showing the types of improvements in the annotation of the H. diminuta genome. (a) Additions to the UTR annotations; (b) improvement of the CDS regions; (c) new gene annotations; (d) merging of two reference annotations. A more detailed diagram, including examples of improvements, is presented in the Supplementary Figure (A–D).
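The bidirectional BLAST comparison mentioned in the Fig. 7 caption is often summarized as reciprocal best hits: protein A in one proteome and protein B in another are paired when each is the other's top-scoring match. The sketch below illustrates that idea under stated assumptions: the hit tuples `(query, subject, bitscore)`, the function names and the toy identifiers are all hypothetical, and the criteria actually used by the authors are not given in this text.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query, subject, bitscore) tuples, for example
    parsed from tabular BLAST output (assumed format, for illustration only).
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward, reverse):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    fwd = best_hits(forward)   # proteome 1 searched against proteome 2
    rev = best_hits(reverse)   # proteome 2 searched against proteome 1
    return sorted((a, b) for a, b in fwd.items() if rev.get(b) == a)


# Toy hits with made-up identifiers and scores (not data from the study).
forward = [("hd1", "hm1", 250.0), ("hd1", "hm2", 90.0), ("hd2", "hm2", 120.0)]
reverse = [("hm1", "hd1", 245.0), ("hm2", "hd1", 95.0)]
print(reciprocal_best_hits(forward, reverse))
```

In the toy data only `hd1` and `hm1` choose each other, so `hd2` is left unpaired even though it has a forward hit.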
