Gigascience. 2020 Apr 1;9(4):giaa029.
doi: 10.1093/gigascience/giaa029.

Continuous chromosome-scale haplotypes assembled from a single interspecies F1 hybrid of yak and cattle

Edward S Rice et al. Gigascience. 2020.

Abstract

Background: The development of trio binning as an approach for assembling diploid genomes has enabled the creation of fully haplotype-resolved reference genomes. Unlike other methods of assembly for diploid genomes, this approach is enhanced, rather than hindered, by the heterozygosity of the individual sequenced. To maximize heterozygosity and simultaneously assemble reference genomes for 2 species, we applied trio binning to an interspecies F1 hybrid of yak (Bos grunniens) and cattle (Bos taurus), 2 species that diverged nearly 5 million years ago. The genomes of both of these species are composed of acrocentric autosomes.

Results: We produced the most continuous haplotype-resolved assemblies for a diploid animal yet reported. Both the maternal (yak) and paternal (cattle) assemblies have the largest 2 chromosomes in single haplotigs, and more than one-third of the autosomes similarly lack gaps. The maximum length haplotig produced was 153 Mb without any scaffolding or gap-filling steps and represents the longest haplotig reported for any species. The assemblies are also more complete and accurate than those reported for most other vertebrates, with 97% of mammalian universal single-copy orthologs present.

Conclusions: The high heterozygosity inherent to interspecies crosses maximizes the effectiveness of the trio binning method. The interspecies trio binning approach we describe is likely to provide the highest-quality assemblies for any pair of species that can interbreed to produce hybrid offspring that develop to sufficient cell numbers for DNA extraction.

Keywords: Bos grunniens; Bos taurus; Highland cattle; genome assembly; phasing.
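
The binning idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names (`kmers`, `parent_specific_kmers`, `bin_read`) are hypothetical, and the code ignores reverse complements, sequencing errors, k-mer count thresholds, and any normalization that a production tool would apply. It only shows the core step: collect k-mers unique to each parent, then assign each long read from the hybrid to the parent whose unique k-mers it contains more of. The study uses 21-mers; the toy example below uses k = 5 so the sequences stay short.

```python
def kmers(seq, k):
    """Return the set of all k-mers (length-k substrings) in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def parent_specific_kmers(maternal_reads, paternal_reads, k):
    """Return (maternal_only, paternal_only) k-mer sets.

    K-mers seen in both parents are discarded, since they carry
    no information about which haplotype a read came from.
    """
    maternal = set().union(*(kmers(r, k) for r in maternal_reads))
    paternal = set().union(*(kmers(r, k) for r in paternal_reads))
    return maternal - paternal, paternal - maternal


def bin_read(read, maternal_only, paternal_only, k):
    """Assign one offspring read to 'maternal', 'paternal', or 'unassigned'."""
    read_kmers = kmers(read, k)
    maternal_hits = len(read_kmers & maternal_only)
    paternal_hits = len(read_kmers & paternal_only)
    if maternal_hits > paternal_hits:
        return "maternal"
    if paternal_hits > maternal_hits:
        return "paternal"
    return "unassigned"  # no parent-specific k-mers, or a tie


# Toy data (invented for illustration): the parents differ at two bases.
K = 5
maternal_only, paternal_only = parent_specific_kmers(
    ["ACGTACGTTTGACC"], ["ACGTACGTAAGACC"], K
)
for read in ["CGTTTGAC", "GTAAGACC", "ACGTACGT"]:
    print(read, bin_read(read, maternal_only, paternal_only, K))
```

The more the two parental genomes differ, the more parent-specific k-mers each read contains, which is why higher heterozygosity helps this approach rather than hindering it.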

Figures

Figure 1:
Trio binning of a yak/cattle hybrid. (a–c) We collected short reads from a yak cow (Molly) and a Highland cattle bull (Duke), and long reads from their F1 hybrid offspring (Esperanza). (d) Counts of 21-mers shared by Molly and Duke and those unique to a single parent. (e) Long-read coverage of the maternal and paternal haplotypes after binning reads from Esperanza using 21-mers from (d). (f–h) Ideograms of contigs on chromosomes for (f) ARS-UCD1.2, (g) Esperanza's maternal (yak) haplotype assembly, and (h) Esperanza's paternal (cattle) haplotype assembly, with contigs represented as solid blocks of a single color and full chromosome arms in single contigs noted with an asterisk.
Figure 2:
Alignment of 6 cattle and 6 yaks to chr29 of our (a) maternal and (b) paternal assemblies shows that the maternal haplotype assembly is more similar to yak genomes than to cattle genomes and the paternal haplotype assembly is more similar to cattle genomes than to yak genomes, demonstrating that they are phased correctly.
Figure 3:
Comparison of trio Highland cattle and yak assemblies to current cattle, chicken, goat, and human reference assemblies, based on ratio of largest contig size to largest chromosome arm size (a), ratio of contig N50 to chromosome arm N50 (b), and number of gaps in autosomes and the major sex chromosome, i.e., X in cattle, yak, goat, and human and Z in chicken (c). We note that the number of gaps in hg38 is somewhat inflated owing to its gapped assembly of centromeres.
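
For readers unfamiliar with the N50 statistic used in panel (b) of Figure 3, the following is a minimal sketch under the common definition: the length L such that sequences of length L or longer contain at least half of the total bases. The helper names `n50` and `n50_ratio` are hypothetical, the input lengths are invented, and the authors' exact computation may differ in detail.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Walk the lengths from longest to shortest and return the length at
    which the running total first reaches half of the overall total.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


def n50_ratio(contig_lengths, arm_lengths):
    """Ratio of contig N50 to chromosome arm N50.

    A value near 1 means typical contigs span whole chromosome arms.
    """
    return n50(contig_lengths) / n50(arm_lengths)


# Invented example: two 100 Mb arms, each assembled as two 50 Mb contigs.
print(n50([50, 50, 50, 50]))                # contig N50
print(n50_ratio([50, 50, 50, 50], [100, 100]))
```

Normalizing by chromosome arm size makes the comparison fairer across species whose chromosomes differ in length, because an unscaffolded contig cannot be expected to span a centromere.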
