BMC Genomics. 2021 Mar 29;22(1):222.
doi: 10.1186/s12864-021-07512-6.

Telomere-to-telomere assembly of the genome of an individual Oikopleura dioica from Okinawa using Nanopore-based sequencing

Aleksandra Bliznina et al. BMC Genomics.

Abstract

Background: The larvacean Oikopleura dioica is an abundant tunicate plankton with the smallest (65-70 Mbp) non-parasitic, non-extremophile animal genome identified to date. Currently, there are two genomes available for the Bergen (OdB3) and Osaka (OSKA2016) O. dioica laboratory strains. Both assemblies have full genome coverage and high sequence accuracy. However, a chromosome-scale assembly has not yet been achieved.

Results: Here, we present a chromosome-scale genome assembly (OKI2018_I69) of the Okinawan O. dioica produced using long-read Nanopore and short-read Illumina sequencing data from a single male, combined with Hi-C chromosomal conformation capture data for scaffolding. The OKI2018_I69 assembly has a total length of 64.3 Mbp distributed among 19 scaffolds. 99% of the assembly is contained within five megabase-scale scaffolds. We found telomeres on both ends of the two largest scaffolds, which represent assemblies of two fully contiguous autosomal chromosomes. Each of the other three large scaffolds has telomeres at one end only, and we propose that they correspond to sex chromosomes split into a pseudo-autosomal region and X-specific or Y-specific regions. Indeed, these five scaffolds mostly correspond to equivalent linkage groups in OdB3, suggesting overall agreement in chromosomal organization between the two populations. At a more detailed level, the OKI2018_I69 assembly possesses genomic features in gene content and repetitive elements similar to those reported for OdB3. The Hi-C map suggests few reciprocal interactions between chromosome arms. At the sequence level, multiple genomic features such as GC content and repetitive elements are distributed differently along the short and long arms of the same chromosome.

Conclusions: We show that a hybrid approach of integrating multiple sequencing technologies with chromosome conformation information results in an accurate de novo chromosome-scale assembly of O. dioica's highly polymorphic genome. This genome assembly opens up the possibility of cross-genome comparison between O. dioica populations, as well as of studies of chromosomal evolution in this lineage.

Keywords: Chromosome-scale assembly; Hi-C; Oikopleura dioica; Oxford Nanopore sequencing; Single individual; Telomere-to-telomere.
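
The statement in the Results that 99% of the assembly lies within five megabase-scale scaffolds is a simple length-fraction summary. A minimal sketch of how such a figure could be computed is given below; the function name and the scaffold lengths are hypothetical placeholders chosen only to total roughly 64.3 Mbp across 19 scaffolds, and they are not the real OKI2018_I69 scaffold lengths.

```python
def fraction_in_largest(lengths, k):
    """Return the fraction of total assembly length held by the k longest scaffolds."""
    if k < 0:
        raise ValueError("k must be non-negative")
    total = sum(lengths)
    if total == 0:
        return 0.0
    largest = sorted(lengths, reverse=True)[:k]
    return sum(largest) / total


# Hypothetical scaffold lengths in bp (illustrative only, NOT the published values):
# five megabase-scale scaffolds plus fourteen small ones, 19 in total.
toy_lengths = [17_000_000, 16_000_000, 13_000_000, 12_000_000, 6_000_000] + [20_000] * 14

if __name__ == "__main__":
    share = fraction_in_largest(toy_lengths, 5)
    print(f"{len(toy_lengths)} scaffolds, {sum(toy_lengths) / 1e6:.2f} Mbp total")
    print(f"share in five largest scaffolds: {share:.1%}")
```

In practice the lengths would be read from the assembly FASTA index rather than typed in by hand.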

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Genome assembly and annotation workflow used to generate the OKI2018_I69 genome assembly. a Live images of adult male (top) and female (bottom) O. dioica. b The assembly was generated using Nanopore and Illumina data, followed by scaffolding using Hi-C chromosomal conformation capture data
Fig. 2
Quality control checks implemented at different steps of genome sequencing and assembly. a Graph showing length distribution of raw Nanopore reads used to generate the OKI2018_I69 assembly. b Estimated total and repetitive genome size based on k-mer counting of the Illumina paired-end reads used for polishing the OKI2018_I69 assembly. c Pairwise genome alignment of the contig assemblies of I69 and I28 O. dioica individuals
Fig. 3
OKI2018_I69 assembly of the Okinawan O. dioica. a Treemap comparison between the contig (left) and scaffold (right) assemblies of the O. dioica genome. Each rectangle represents a contig or a scaffold in the assembly with the area proportional to its length. b Comparison between the OKI2018_I69 (left) and OdB3 (right) linkage groups. The Sankey plot shows what proportion of each chromosome in the OKI2018_I69 genome is aligned to the OdB3 linkage groups. c Contact matrix generated by aligning the Hi-C data set to the OKI2018_I69 assembly with the Juicer and 3D-DNA pipelines. Pixel intensity in the contact matrix indicates how often a pair of loci collocate in the nucleus
Fig. 4
Chromosome-level features of the Okinawan O. dioica genome. a Visualization of sequence properties across chromosomes in the OKI2018_I69 assembly. For each chromosome, 50 kbp windows of GC content (orange), Nanopore sequence coverage (blue), the percent of nucleotides masked by RepeatMasker (purple), and the number of genes (yellow) are indicated. Differences in these sequence properties occur near predicted sites of centromeres and telomeres, as well as between the short and long arms of each non-sex-specific chromosome. Telomeres and gaps in the assembly are indicated with black and grey rectangles, respectively. b Long and short chromosome arms exhibit significant differences in sequence properties, including GC content, repetitive sequence content, and the number of restriction sites recognized by the DpnII enzyme used to generate the Hi-C library
Fig. 5
Quality assessment of the OKI2018_I69 genome assembly. a Proportion of BUSCO genes detected or missed in Oikopleura genomes and transcriptomes. The search on the OKI2018_I69 assembly was repeated with default parameters (“no training”) to display the effect of AUGUSTUS training. b Venn diagram showing the number of BUSCO genes missing in OKI2018_I69, OdB3 and/or OSKA2016 genomes
Fig. 6
Analysis of repetitive elements. The repeat landscape and proportions of various repeat classes in the genome are indicated and color-coded according to the classes shown on the right side of the figure. The non-repetitive fraction of the genome is shown in black
Fig. 7
Draft scaffold of the mitochondrial genome in the OKI2018_I69 assembly. a Predicted gene annotation of the draft mitochondrial genome sequence. b Self-similarity plot of the draft mitochondrial genome sequence. A tandem repeat can be seen, which complicates the complete assembly of the mitochondrial genome from whole-genome sequencing data
Fig. 8
Genomic locations of various oikopleurid gene homologs in the OKI2018_I69 assembly. The genes are searchable by name and PubMed identifiers in the ZENBU genome browser. Colours indicate genes from the same family
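
The Fig. 4 caption describes sequence properties summarised in 50 kbp windows along each chromosome. As an illustration of the windowing idea only, the sketch below computes GC fraction in consecutive non-overlapping windows of a sequence string. The function name and the toy sequence are hypothetical, and this is not presented as the authors' pipeline, which the text here does not describe.

```python
def gc_by_window(seq, window=50_000):
    """Return a list of (start, gc_fraction) for consecutive non-overlapping windows.

    Bases other than A, C, G, T (for example N in assembly gaps) are ignored;
    a window with no A/C/G/T bases yields NaN.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    seq = seq.upper()
    results = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        acgt = gc + chunk.count("A") + chunk.count("T")
        results.append((start, gc / acgt if acgt else float("nan")))
    return results


if __name__ == "__main__":
    # Toy sequence (illustrative only): a GC-rich half followed by an AT-rich half.
    toy = "GGCC" * 5 + "AATT" * 5
    for start, frac in gc_by_window(toy, window=20):
        print(start, f"{frac:.2f}")
```

The same loop structure would apply to the other per-window tracks in the caption (coverage, masked fraction, gene count), with a different quantity tallied per window.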

