Comparative Study
Gigascience. 2020 Dec 15;9(12):giaa123.
doi: 10.1093/gigascience/giaa123.

Comparison of the two up-to-date sequencing technologies for genome assembly: HiFi reads of Pacific Biosciences Sequel II system and ultralong reads of Oxford Nanopore

Dandan Lang et al. Gigascience.

Abstract

Background: The availability of reference genomes has revolutionized the study of biology. Multiple competing technologies have been developed over the past decade to improve the quality and robustness of genome assemblies. The 2 widely used long-read sequencing providers, Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT), have recently updated their platforms: PacBio enables high-throughput HiFi reads with base-level resolution of >99%, and ONT generates reads as long as 2 Mb. We applied the 2 up-to-date platforms to a single rice individual and then compared the 2 assemblies to investigate the advantages and limitations of each.

Results: ONT ultralong reads delivered higher contiguity, producing a total of 18 contigs, of which 10 were chromosome-level (a single contig per chromosome), compared with 394 contigs and 3 chromosome-level contigs for the PacBio assembly. The ONT ultralong reads also prevented assembly errors caused by long repetitive regions: in the PacBio assembly we observed 44 falsely redundant genes and 10 falsely lost genes, leading to over- or underestimation of the gene families in those long repetitive regions. Conversely, the PacBio HiFi reads generated an assembly with considerably fewer errors at the level of single nucleotides and small insertions and deletions than the ONT assembly, which contained an average of 1.06 errors per kb and ultimately gave rise to 1,475 incorrect gene annotations via altered or truncated protein predictions.
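As a rough illustration of what an average of 1.06 errors per kb implies, the sketch below converts that rate to the per-100-kb unit used in Figure 3 and estimates how many errors would be expected in a coding sequence of a given length. The 1,500 bp gene length and the assumption that errors occur independently and uniformly (a Poisson model) are illustrative assumptions, not values or methods from the study, and the function names are hypothetical.

```python
import math

# Average rate reported in the abstract for the ONT assembly.
ONT_ERRORS_PER_KB = 1.06


def errors_per_kb(n_errors: int, n_bases: int) -> float:
    """Error rate per 1,000 bases for a given error count and sequence length."""
    if n_bases <= 0:
        raise ValueError("n_bases must be positive")
    return n_errors / n_bases * 1000


def per_100kb(rate_per_kb: float) -> float:
    """Convert a per-kb rate to a per-100-kb rate."""
    return rate_per_kb * 100


def expected_errors(rate_per_kb: float, length_bp: int) -> float:
    """Expected number of errors in a region of length_bp bases."""
    return rate_per_kb * length_bp / 1000


def prob_at_least_one_error(rate_per_kb: float, length_bp: int) -> float:
    """P(>= 1 error) under an ASSUMED Poisson model with a uniform error rate."""
    return 1 - math.exp(-expected_errors(rate_per_kb, length_bp))


if __name__ == "__main__":
    gene_length_bp = 1500  # hypothetical coding-sequence length, for illustration only
    print(per_100kb(ONT_ERRORS_PER_KB))
    print(expected_errors(ONT_ERRORS_PER_KB, gene_length_bp))
    print(prob_at_least_one_error(ONT_ERRORS_PER_KB, gene_length_bp))
```

Real assembly errors tend to cluster (for example in homopolymer runs), so the uniform-rate model should be read only as an order-of-magnitude guide.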

Conclusions: These results show that PacBio HiFi reads and ONT ultralong reads each have their own merits. Future reference genome construction could leverage both techniques to lessen the impact of assembly errors and the subsequent annotation mistakes rooted in each.

Keywords: CCS; ONT ultralong; PacBio HiFi; assembly comparison; contiguity; single-molecular sequencer.

Conflict of interest statement

D.L., P.R., F.L., Z.S., G.M., Y.T., X.L., Q.L., L.H., D.W. and S.L. are employees of Grandomics Biosciences, a company that provides bioinformatics and genomics services.

Figures

Figure 1:
Contiguity of the ONT and PacBio assemblies. (a) Treemaps for contig length difference between the ONT (left) and PacBio (right) assemblies; (b) the 6 PacBio contigs mapped to 1 ONT contig corresponding to Chr. 6; (c) details of the 3 PacBio gaps. Red rectangles indicate repeat elements.
Figure 2:
Assembly errors in regions where genes are annotated. (a) An example of gene gains caused by assembly redundancies, in which PB-R1 and PB-R2 had similarity levels of 99.67% and 99.51%, respectively, to the corresponding region on PB-L2. D: depth. (b) Gene redundancies caused by gaps that the PacBio assembly failed to connect correctly. (c) An example of a 1-base deletion leading to a frameshift in protein translation. (d) An example of a single-base error leading to a stop codon gain and truncated protein translation.
Figure 3:
Assembly comparisons using the same methods. Left: number of contigs that were mapped onto Chr. 6; right: number of mismatches (including SNVs and InDels) per 100 kb.
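
The two error types illustrated in Figure 2 (c) and (d) can be reproduced on a toy sequence. The sketch below is a minimal illustration, not the annotation pipeline used in the study: the 21-base reference sequence is hypothetical (not taken from the rice genome), the helper names are invented for this example, and translation uses the standard genetic code, stopping at the first stop codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate from position 0 in frame, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def delete_base(seq: str, pos: int) -> str:
    """Return seq with the base at 0-based index pos removed."""
    return seq[:pos] + seq[pos + 1:]


def substitute_base(seq: str, pos: int, base: str) -> str:
    """Return seq with the base at 0-based index pos replaced by base."""
    return seq[:pos] + base + seq[pos + 1:]


# Hypothetical toy coding sequence: ATG GCC AAG TGG CTG ACC TAA
REFERENCE = "ATGGCCAAGTGGCTGACCTAA"

if __name__ == "__main__":
    print(translate(REFERENCE))                            # MAKWLT
    # 1-base deletion shifts the reading frame: altered and truncated protein.
    print(translate(delete_base(REFERENCE, 4)))            # MASG
    # Single-base substitution TGG -> TGA creates a premature stop codon.
    print(translate(substitute_base(REFERENCE, 11, "A")))  # MAK
```

In both cases a single-base assembly error changes the predicted protein, which is the mechanism behind the incorrect gene annotations reported for the ONT assembly.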
