Gigascience. 2018 May 1;7(5):giy037.
doi: 10.1093/gigascience/giy037.

Chiron: translating nanopore raw signal directly into nucleotide sequence using deep learning

Haotian Teng et al. Gigascience.

Erratum in

Abstract

Sequencing by translocating DNA fragments through an array of nanopores is a rapidly maturing technology that offers faster and cheaper sequencing than other approaches. However, accurately deciphering the DNA sequence from the noisy and complex electrical signal is challenging. Here, we report Chiron, the first deep learning model to achieve end-to-end basecalling and directly translate the raw signal to DNA sequence without the error-prone segmentation step. Trained with only a small set of 4,000 reads, we show that our model provides state-of-the-art basecalling accuracy, even on previously unseen species. Chiron achieves basecalling speeds of more than 2,000 bases per second using desktop computer graphics processing units.

Figures

Figure 1:
(A) An unrolled sketch of the neural network (NN) architecture. The circles at the bottom represent the time series of raw signal input data. A convolutional neural network (CNN) first extracts local pattern information from this input. The output of the CNN is then fed into a recurrent neural network (RNN) to capture long-range interactions. A fully connected (FC) layer converts the RNN output into base probabilities, which a connectionist temporal classification (CTC) decoder then uses to create the nucleotide sequence. The repeated component is omitted. (B) Final architecture of the Chiron model. Variants of this architecture were explored by varying the number of convolutional layers from 3 to 10 and recurrent layers from 3 to 5. We also explored networks with only convolutional layers or only recurrent layers. "1×3 conv, 256, no bias" denotes a convolution operation with a 1×3 filter and a 256-channel output with no bias added. LSTM = long short-term memory.
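The last step in panel (A), turning per-timestep base probabilities into a sequence, can be illustrated with a minimal best-path (greedy) CTC decoder. This is only a sketch of the general CTC collapse rule, not Chiron's own decoder, which may use a different strategy such as beam search. The class order `ACGT`, the blank index `4`, and the function name `ctc_greedy_decode` are assumptions made for illustration.

```python
from itertools import groupby

ALPHABET = "ACGT"  # assumed class order, for illustration only
BLANK = 4          # assumed index of the CTC blank symbol


def ctc_greedy_decode(probs):
    """Best-path CTC decoding.

    probs: list of rows, one per timestep, each holding the probabilities
    for A, C, G, T and blank (in that assumed order).
    """
    # 1. Pick the most likely class at every timestep.
    best_path = [max(range(len(row)), key=row.__getitem__) for row in probs]
    # 2. Collapse consecutive repeats of the same class.
    collapsed = [label for label, _ in groupby(best_path)]
    # 3. Drop blanks; a blank between two identical bases keeps both.
    return "".join(ALPHABET[i] for i in collapsed if i != BLANK)


example = [
    [0.90, 0.02, 0.03, 0.02, 0.03],  # A
    [0.80, 0.05, 0.05, 0.05, 0.05],  # A (repeat, collapsed)
    [0.05, 0.05, 0.05, 0.05, 0.80],  # blank
    [0.70, 0.10, 0.10, 0.05, 0.05],  # A (new base after the blank)
    [0.10, 0.70, 0.10, 0.05, 0.05],  # C
    [0.10, 0.60, 0.10, 0.10, 0.10],  # C (repeat, collapsed)
]
decoded = ctc_greedy_decode(example)
```

The blank symbol is what lets a CTC model emit homopolymers such as "AA": without a blank between them, two adjacent "A" predictions collapse into one base.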
Figure 2:
Visualization of the predicted probability of bases and the readout sequence. The upper panel shows a raw signal from the MinION nanopore sequencer, normalized by subtracting the mean of the whole signal and then dividing by the standard deviation. The bottom panel shows the predicted probability of each base at each position from Chiron. The final output DNA sequence is annotated on the x-axis of the bottom panel.
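The normalization described in this caption is a standard z-score over the whole read. A minimal sketch follows, assuming the population standard deviation (the caption does not say which estimator is used); the function name `normalize_signal` and the sample values are hypothetical.

```python
from statistics import fmean, pstdev


def normalize_signal(raw):
    """Z-score a raw signal: subtract the whole-signal mean, divide by its std."""
    mean = fmean(raw)
    std = pstdev(raw)  # population std; an assumption, not stated in the caption
    if std == 0:
        raise ValueError("constant signal cannot be normalized")
    return [(x - mean) / std for x in raw]


# Hypothetical raw current samples, for illustration only.
raw_signal = [512.0, 498.0, 530.0, 475.0, 505.0]
normalized = normalize_signal(raw_signal)
```

After this step the signal has zero mean and unit standard deviation, so reads with different baseline currents are placed on a common scale.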
Figure 3:
(A) Assembly error rate (%) for each polishing round using Racon. Two individually sequenced E. coli samples are included (S10, S18). All basecallers perform similarly on the M. tuberculosis dataset due to its high sequencing depth (130X). (B) Relative assembly length (%) after each round of polishing. Relative length is defined as the length of the assembly divided by the length of the reference genome.
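The relative length defined in panel (B) is a simple ratio expressed as a percentage. A minimal sketch, with a hypothetical function name and made-up lengths rather than values from the paper:

```python
def relative_length_percent(assembly_len, reference_len):
    """Assembly length divided by reference genome length, as a percentage."""
    if reference_len <= 0:
        raise ValueError("reference length must be positive")
    return 100.0 * assembly_len / reference_len


# Made-up lengths in bases, for illustration only.
rel = relative_length_percent(990, 1000)
```

A value below 100% means the assembly is shorter than the reference, and a value above 100% means it is longer.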
