Comput Struct Biotechnol J. 2023 Mar 24:21:2352-2364.
doi: 10.1016/j.csbj.2023.03.038. eCollection 2023.

Benchmarking of Nanopore R10.4 and R9.4.1 flow cells in single-cell whole-genome amplification and whole-genome shotgun sequencing

Ying Ni et al. Comput Struct Biotechnol J. 2023.

Abstract

Third-generation sequencing can be used in human cancer genomic and epigenomic research. Oxford Nanopore Technologies (ONT) recently released the R10.4 flow cell, which is claimed to offer improved read accuracy over the R9.4.1 flow cell. To evaluate the benefits and drawbacks of the R10.4 flow cell for cancer cell profiling on MinION devices, we used the human non-small-cell lung carcinoma cell line HCC78 to construct libraries for both single-cell whole-genome amplification (scWGA) and whole-genome shotgun sequencing. The R10.4 and R9.4.1 reads were benchmarked in terms of read accuracy, variant detection, modification calling, and genome recovery rate, and were compared with next-generation sequencing (NGS) reads. The results showed that R10.4 reads outperform R9.4.1 reads, achieving a higher modal read accuracy of over 99.1%, superior variant detection, a lower false-discovery rate (FDR) in methylation calling, and a comparable genome recovery rate. To achieve scWGA sequencing yields on the ONT platform as high as those of NGS, we recommend multiple displacement amplification with a modified T7 endonuclease Ⅰ cutting procedure as a promising method. In addition, we provide a possible solution for filtering likely false-positive sites across the whole genome with R10.4, using scWGA sequencing results as a negative control. Our study is the first benchmark of whole-genome single-cell sequencing using ONT R10.4 and R9.4.1 MinION flow cells, and it clarifies the capacity for genomic and epigenomic profiling within a single flow cell. A promising method for scWGA sequencing, together with the methylation calling results, can benefit researchers who work on cancer cell genomic and epigenomic profiling using third-generation sequencing.

Keywords: Long read; Methylation; Nanopore DNA sequencing; Single-cell whole genome amplification sequencing; Whole genome shotgun sequencing.

Conflict of interest statement

The authors declare that there is no conflict of interest associated with this study.

Figures

Fig. 1
The quality of nanopore reads from whole-genome shotgun (WGS) and single-cell whole-genome amplification (scWGA) sequencing using R9.4.1 and R10.4 flow cells. A R10.4 reads from both WGS and scWGA sequencing libraries outperformed R9.4.1 reads in terms of accuracy estimated by the original Guppy basecaller (grey) and accuracy observed from read mapping (white), shown with median and quartiles. The boxplots also show that observed read accuracy is higher than the corresponding estimated accuracy. B The density distribution plot indicates that R10.4 reads had a higher modal read accuracy than R9.4.1 reads for both estimated (top) and observed (bottom) accuracy. C R10.4 had a higher average detection accuracy than R9.4.1 on homopolymers ranging from 4 to 9 bp for both WGS and scWGA libraries. The plot also reveals a preference for adenine (A) and thymine (T) over cytosine (C) and guanine (G) in homopolymer detection of nanopore reads.
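The distinction between basecaller-estimated and mapping-observed accuracy in panel A can be illustrated with a minimal Python sketch. It assumes the common conventions that estimated accuracy is derived from the mean per-base error probability implied by Phred scores, and that observed accuracy is the fraction of matching bases among all aligned columns; the caption does not state the exact formulas used in the study, and the function names here are hypothetical.

```python
import math


def estimated_accuracy(phred_scores):
    """Accuracy implied by per-base Phred qualities.

    Assumes the standard Phred definition Q = -10 * log10(p_error) and
    averages error probabilities (not Q values) across the read.
    """
    if not phred_scores:
        raise ValueError("read has no quality scores")
    error_probs = [10 ** (-q / 10) for q in phred_scores]
    return 1.0 - sum(error_probs) / len(error_probs)


def observed_accuracy(matches, mismatches, insertions, deletions):
    """Alignment identity: matching bases over all aligned columns.

    The counts are assumed to come from a read-to-reference alignment.
    """
    aligned_columns = matches + mismatches + insertions + deletions
    if aligned_columns == 0:
        raise ValueError("alignment is empty")
    return matches / aligned_columns


# Example: a read whose bases are all Q20 has an estimated accuracy of 99%.
assert math.isclose(estimated_accuracy([20, 20, 20]), 0.99)
```

Averaging error probabilities rather than Q values matters because Q is logarithmic; a few low-quality bases dominate the mean error.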
Fig. 2
DNA methylation detected in WGS sequencing reads from R9.4.1 and R10.4 flow cells. The 5-methylcytosine (5mC) levels in R10.4 and R9.4.1 data showed high consistency at A the whole-genome level with 500 kb windows and B the promoter level. C The distribution of 5mC proportion at the whole-genome level, divided into 500 kb bins, and in promoter regions, detected from R10.4 and R9.4.1 reads. D The distribution of 5mC across the 3000 bp before and after the transcription start sites (TSSs) of the associated genes and E across gene bodies.
Fig. 3
Methylation profiling of 5mC using R10.4 and R9.4.1 reads and false-positive site filtering in the mitochondrial region. A The density plot and B boxplot showing the distribution of the methylated CpG (meCpG) proportion in each read. C IGV snapshot of a 500 bp region with reads carrying predicted methylated CpG sites. Methylated CpGs are highlighted in red. Note that the scWGA libraries should be free of methylation, so positive sites in them are likely to be false positives.
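The negative-control idea in this caption can be sketched as a simple set operation: any site called methylated in the scWGA library, which should carry no methylation, is treated as a likely false positive and removed from the WGS calls. This is a minimal illustration, not the authors' pipeline; the function names and the `(chromosome, position)` site representation are assumptions.

```python
def filter_with_negative_control(wgs_sites, control_sites):
    """Split WGS methylation calls using a methylation-free control.

    wgs_sites and control_sites are sets of (chromosome, position) tuples
    called as methylated. Sites also called in the control are flagged as
    likely false positives.
    """
    flagged = wgs_sites & control_sites
    kept = wgs_sites - flagged
    return kept, flagged


def control_positive_fraction(control_methylated, control_total):
    """Fraction of control CpG sites called methylated.

    Because the control should be unmethylated, this fraction serves as a
    rough indicator of the false-positive call rate (an assumption of this
    sketch, not the study's FDR definition).
    """
    if control_total <= 0:
        raise ValueError("control_total must be positive")
    return control_methylated / control_total


wgs = {("chrM", 100), ("chrM", 250), ("chr1", 5000)}
control = {("chrM", 250)}
kept, flagged = filter_with_negative_control(wgs, control)
```

In this toy input, the site at `chrM:250` is flagged because it also appears in the control, and the other two sites are kept.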
Fig. 4
Representative copy number variation (CNV), structural variation (SV) and single nucleotide variation (SNV) patterns detected from HCC78 using WGS and scWGA sequencing reads from R9.4.1 and R10.4 flow cells. A Genome-wide CNV distribution with 1 Mb bin size. B A zoomed-in view of a CNV gain event on chromosome 14. C and D The intersection number (C) and proportion (D) of SV calls from R10.4 and R9.4.1 reads with the minimum number of supporting reads (n) ranging from 3 to 5. E The IGV snapshot showing reads spanning the breakpoints of the ROS1 and SLC34A2 genes in the HCC78 cell line from four libraries. The DNA strand broke at 25,655,005 bp of chromosome 4 and attached to the ROS1 gene at 117,337,144 bp of chromosome 6, while chromosome 6 also broke at 117,337,162 bp and joined SLC34A2 on chromosome 4 at 25,665,008 bp. The unaligned nucleotides are color-coded (A: green, T: red, C: blue, G: gold), while the aligned ones are colored grey. F Schematic of the translocation between the SLC34A2 (top) and ROS1 (bottom) genes. Breakpoints were detected in the DNA sequence between ROS1 exons 32 and 33 and between SLC34A2 exons 4 and 5 in both WGS and scWGA data, which would result in the RNA fusion ROS1-SLC34A2. G The number of somatic point mutations and indels of the HCC78 cell line recorded in the DepMap database, together with the number of those records found in each of the four libraries. H The Venn diagram of mutations intersected with DepMap records among the four libraries suggests considerable reproducibility of R10.4 and R9.4.1 data in SNV detection. WGS, whole-genome shotgun; scWGA, single-cell whole-genome amplification.
Fig. 5
Genome recovery rate and repeat types of uncovered regions from different sequencing methods. A The genome recovery rate of HCC78 DNA sequencing using different sequencing methods. One R9.4.1 flow cell can achieve a higher genome recovery rate, with more data yield, than one R10.4 flow cell in both WGS and scWGA sequencing. The single-cell genome recovery rate increased as the T7 endonuclease Ⅰ incubation time for MDA products was extended. Each grey dot indicates a sequencing library with a different treatment, while the colored lines and dots indicate subsampled reads from WGS or scWGA sequencing. B The total length of uncovered genomic regions that fall into different repeat types in five libraries. C The total length of uncovered repeat regions and the proportion (upper right) of repetitive sequence in uncovered regions at the same read coverage. All libraries were subsampled to the same yield as the WGS R10.4 data. MDA, multiple displacement amplification. MALBAC, multiple annealing and looping based amplification cycles. T7E1, T7 endonuclease Ⅰ.
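As a rough illustration of the metric in panel A, genome recovery rate can be computed as the fraction of reference bases covered by at least one aligned read. The sketch below assumes that definition (the caption does not give a depth threshold), uses half-open `(start, end)` intervals on a single coordinate axis for simplicity, and uses a hypothetical function name.

```python
def genome_recovery_rate(alignments, genome_length):
    """Fraction of the genome covered by at least one alignment.

    alignments: iterable of half-open (start, end) intervals.
    Overlapping or touching intervals are merged before counting so that
    no base is counted twice.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    covered = 0
    current_start = current_end = None
    for start, end in sorted(alignments):
        if current_end is None or start > current_end:
            # Close the previous merged interval and start a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Extend the current merged interval.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / genome_length


# Two overlapping reads and one separate read on a 1,000 bp toy genome.
rate = genome_recovery_rate([(50, 150), (0, 100), (300, 400)], 1000)
```

Here the merged intervals span 0 to 150 and 300 to 400, so 250 of 1,000 bases are covered.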

