DNA Res. 2022 Dec 1;29(6):dsac035.
doi: 10.1093/dnares/dsac035.

A high-quality, haplotype-phased genome reconstruction reveals unexpected haplotype diversity in a pearl oyster

Takeshi Takeuchi et al. DNA Res. 2022.

Abstract

Homologous chromosomes in the diploid genome are thought to contain equivalent genetic information, but this common concept has not been fully verified in animal genomes with high heterozygosity. Here we report a near-complete, haplotype-phased genome assembly of the pearl oyster, Pinctada fucata, using high-fidelity (HiFi) long reads and chromosome conformation capture data. This assembly includes 14 pairs of long scaffolds (>38 Mb) corresponding to chromosomes (2n = 28). The accuracy of the assembly, as measured by an analysis of k-mers, is estimated to be 99.99997%. Moreover, the two haplotypes contain 95.2% and 95.9% complete and single-copy BUSCO genes, respectively, demonstrating the high quality of the assembly. Transposons comprise 53.3% of the assembly and are a major contributor to structural variations. Despite overall collinearity between haplotypes, one of the chromosomal scaffolds contains megabase-scale non-syntenic regions, which by their nature could not be detected or resolved in conventional haplotype-merged assemblies. These regions encode expanded gene families of NACHT, DZIP3/hRUL138-like HEPN, and immunoglobulin domains, multiplying the immunity gene repertoire, which we hypothesize is important for the innate immune capability of pearl oysters. The pearl oyster genome provides insight into remarkable haplotype diversity in animals.
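
The k-mer-based accuracy quoted in the abstract can be restated as a Phred-scaled quality value (QV), which is how assembly accuracy is often reported. The sketch below assumes the standard Phred definition, QV = -10 log10(error rate); the abstract itself does not give a QV, and the function name is hypothetical.

```python
import math


def accuracy_to_qv(accuracy_percent: float) -> float:
    """Convert a per-base accuracy in percent to a Phred-scaled QV.

    Assumes QV = -10 * log10(error rate), the usual Phred convention.
    """
    error_rate = 1.0 - accuracy_percent / 100.0
    if error_rate <= 0.0:
        raise ValueError("accuracy must be below 100% to give a finite QV")
    return -10.0 * math.log10(error_rate)


# Accuracy reported in the abstract.
qv = accuracy_to_qv(99.99997)
print(f"QV is approximately {qv:.1f}")
```

Under this convention, 99.99997% accuracy corresponds to an error rate of about 3 in 10 million bases, or a QV of roughly 65.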

Keywords: Mollusca; aquaculture; haplotype-phased genome assembly; immunity; pearl oyster.

Figures

Figure 1.
A haplotype-phased genome assembly of Pinctada fucata. (A) The sequencing and assembly pipeline to produce the haplotype-phased AI genome assembly. (B) An N(x) plot of four P. fucata genome assembly versions shows their relative contiguity. Versions 1.0 and 2.0 correspond to those reported in Takeuchi et al. (2012) and Takeuchi et al. (2016), respectively. The AI (haplotype-phased) and MK (haplotype-merged) assemblies are reported in this study. (C) A whole genome contact map showing 28 clusters representing chromosomal scaffolds. The colour scale is based on the relative interaction value from highest (1, red) to lowest value (<0.001, white).
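
As an illustration of what an N(x) plot such as the one in panel (B) displays, the following minimal sketch computes N(x) for a list of scaffold lengths. The lengths are invented toy values, not data from the P. fucata assemblies, and the function name is hypothetical.

```python
def nx(lengths, x):
    """Return N(x) for a set of sequence lengths.

    N(x) is the length L such that sequences of length >= L together
    cover at least x percent of the total assembly size.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < x <= 100:
        raise ValueError("x must be in the interval (0, 100]")
    ordered = sorted(lengths, reverse=True)
    threshold = sum(ordered) * x / 100.0
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    return ordered[-1]


# Toy scaffold lengths (arbitrary units), for illustration only.
toy_lengths = [50, 40, 30, 20, 10]
curve = {x: nx(toy_lengths, x) for x in range(10, 101, 10)}
print(curve[50], curve[90])
```

Plotting N(x) against x for each assembly version gives one curve per assembly; a more contiguous assembly stays higher for longer along the x axis.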
Figure 2.
Collinearity between the 14 chromosomal scaffold pairs of haplotypes A and B, characterized by syntenic gene arrangement, gene density, GC%, and density of transposable elements including LTR, LINE, SINE, Helitron, and DNA transposons. Black arrowheads indicate putative centromere regions characterized by high GC% and the absence of protein-coding genes. Red arrowheads indicate examples of gap positions where repetitive sequences were missing in the scaffolds. See also supplementary Fig. S4.
Figure 3.
The recent expansion of transposable elements (TEs) shapes the pearl oyster genome. (A) A pie chart summarizes the TE content of the P. fucata genome assembly. A histogram shows the distribution of Kimura substitution levels among the TEs. The Kimura substitution level (%) for each copy compared with its consensus sequence is used as a proxy for the expansion history of the TEs. In general, TEs show low substitution levels, indicating their recent expansion. (B) Examples of domain architectures of TE-related gene products found in the Iso-Seq data. (C) The size distribution of insertions in haplotype A. TE copy insertions of similar size indicate that they are less degraded under neutral selection and that they have been inserted recently.
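
To make the Kimura substitution level concrete, the sketch below computes a Kimura two-parameter distance, expressed in percent, between a TE copy and its consensus. It assumes the standard two-parameter formula, K = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitions and transversions; the caption does not say which tool or corrections the authors used, and the sequences are invented.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def kimura_2p_percent(copy_seq: str, consensus_seq: str) -> float:
    """Kimura two-parameter distance (%) between two aligned sequences.

    Alignment columns containing gaps or ambiguous bases are skipped.
    """
    if len(copy_seq) != len(consensus_seq):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(copy_seq.upper(), consensus_seq.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    p = transitions / compared
    q = transversions / compared
    if 1 - 2 * q <= 0 or 1 - 2 * p - q <= 0:
        raise ValueError("sequences too divergent for the Kimura model")
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q)) * 100


# Invented 20-bp example: one transition (A->G) relative to the consensus.
consensus = "ACGTACGTACGTACGTACGT"
te_copy = "GCGTACGTACGTACGTACGT"
level = kimura_2p_percent(te_copy, consensus)
print(f"{level:.2f}%")
```

A histogram of this value over all copies of a TE family is what the caption uses as a proxy for expansion history: copies that still closely match their consensus have low values, consistent with recent insertion.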
Figure 4.
Non-syntenic regions in scaffold 9 contain an expanded repertoire of innate immunity genes. (A) Pairwise alignment of haplotypes A and B of scaffold 9. Aligned segments are represented as red (forward alignment) or blue (reverse alignment) dots. (B) Chromosomal positions of NACHT (PF05729) and DZIP3/hRUL138-like HEPN (PF18738) domain-containing protein (DCP) genes. Loci of NACHT DCP genes (red arrowheads), DZIP3/hRUL138-like HEPN DCP genes (blue arrowheads), and genes encoding both domains (purple arrowheads) are enriched in non-syntenic regions of scaffold 9. (C–H) The gene order of HEPN and DZIP3/hRUL138-like HEPN DCP genes is marked by grey lines in panel (B). Gene numbers are shown in white boxes if more than 9 genes intercalate in the tandem array of NACHT and DZIP3/hRUL138-like HEPN DCP genes. (I) Approximate numbers of NACHT and DZIP3/hRUL138-like DCP genes in Protostome species. In contrast to Ecdysozoa, there is wide gene copy number variation in Lophotrochozoa, ranging from 0 (Schistosoma mansoni) to 218 (P. fucata). The lineage-specific expansion of NACHT DCP genes in P. fucata is evident because the copy number is typically 5–20 in other molluscan species. Abbreviations of species names: Pfu, Pinctada fucata; Cgi, Crassostrea gigas; Bpl, Bathymodiolus platifrons; Mph, Modiolus philippinarum; Mye, Mizuhopecten yessoensis; Pma, Pecten maximus; Lgi, Lottia gigantea; Hdi, Haliotis discus; Gae, Gigantopelta aegis; Pca, Pomacea canaliculata; Aca, Aplysia californica; Bgl, Biomphalaria glabrata; Obi, Octopus bimaculoides; Osi, Octopus sinensis; Sph, Sepia pharaonis; Nge, Notospermus geniculatus; Pau, Phoronis australis; Lan, Lingula anatina; Cte, Capitella teleta; Hro, Helobdella robusta; Sma, Schistosoma mansoni; Sme, Schmidtea mediterranea; Cel, Caenorhabditis elegans; Dme, Drosophila melanogaster; Tca, Tribolium castaneum. (J–N) Examples of the diverse domain architectures of NACHT DCPs. The number of amino acids is shown at the right. (J, K) P. fucata NACHT DCPs with the typical tripartite domain architecture of NLRs, including a C-terminal ligand-sensing leucine-rich repeat (LRR) domain, a central nucleotide-binding NACHT domain, and an N-terminal effector domain. (O) Sequence logos for the consensus Rx4-6H motif in DZIP3/hRUL138-like HEPN domains. In the diploid P. fucata genome, 194 of 202 DZIP3/hRUL138-like HEPN DCPs include the typical Rx4-6H motif. Rx4H was the most common motif, found in 131 proteins. The presence of typical Rx4-6H motifs indicates RNase activity of the protein.
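
The Rx4-6H notation in panel (O) denotes an arginine, then four to six residues of any kind, then a histidine. A minimal sketch of scanning a protein sequence for that pattern is shown below. The sequences are invented, the function name is hypothetical, and a bare pattern match is only an assumption-laden stand-in: it ignores the HEPN domain context (PF18738) that a real annotation would require.

```python
import re

# Lookahead so overlapping candidates starting at different R residues are
# all reported; the lazy quantifier reports the shortest spacer for each R.
RX_H_PATTERN = re.compile(r"(?=(R[A-Z]{4,6}?H))")


def find_rx_h_motifs(protein: str):
    """Return (start_index, spacer_length) for each Rx4-6H match.

    start_index is 0-based; spacer_length is the number of residues
    between the R and the H (4, 5, or 6).
    """
    hits = []
    for match in RX_H_PATTERN.finditer(protein.upper()):
        motif = match.group(1)
        hits.append((match.start(), len(motif) - 2))
    return hits


# Invented toy sequences, for illustration only.
print(find_rx_h_motifs("MKRAAAAHLL"))  # Rx4H starting at index 2
print(find_rx_h_motifs("RAAAH"))       # spacer of 3: no match
```

Tallying the spacer lengths over a set of proteins would give counts of the kind quoted in the caption, such as how many carry the Rx4H form.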
Figure 5.
Reduced heterozygosity after successive inbreeding of the MK line. To calculate the heterozygosity rate, the number of SNPs per 10-kb, non-overlapping window was counted. (A) The heterozygosity rate in each chromosomal scaffold. Box plots for the original individual used for inbreeding are shown in green, and those for the third-generation individual in purple. (B) The extremely reduced heterozygosity in the third-generation individual is exemplified by scaffolds 9 and 13. Megabase-scale ROH regions were observed from 49 to 58 Mb in scaffold 9 and from 0 to 18 Mb in scaffold 13, possibly due to autozygosity caused by inbreeding.
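
The windowed count described in this caption can be sketched as follows. This is a minimal illustration assuming 1-based SNP coordinates on a single scaffold; the positions and scaffold length are invented, and the function name is hypothetical.

```python
def snps_per_window(snp_positions, scaffold_length, window=10_000):
    """Count SNPs in non-overlapping windows along one scaffold.

    Positions are 1-based. The last window may be shorter than `window`
    if the scaffold length is not a multiple of the window size.
    """
    if scaffold_length <= 0 or window <= 0:
        raise ValueError("scaffold_length and window must be positive")
    n_windows = -(-scaffold_length // window)  # ceiling division
    counts = [0] * n_windows
    for pos in snp_positions:
        if not 1 <= pos <= scaffold_length:
            raise ValueError(f"position {pos} lies outside the scaffold")
        counts[(pos - 1) // window] += 1
    return counts


# Invented SNP positions on a hypothetical 30-kb scaffold.
toy_counts = snps_per_window([5, 9_999, 10_000, 10_001, 25_000], 30_000)
print(toy_counts)  # [3, 1, 1]
```

Long consecutive stretches of windows with zero or near-zero counts are what would show up as the megabase-scale ROH regions mentioned in panel (B).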
