Sci Rep. 2018 Sep 26;8(1):14408.
doi: 10.1038/s41598-018-32615-8.

Direct RNA Sequencing of the Coding Complete Influenza A Virus Genome

Matthew W Keller et al. Sci Rep. 2018.

Abstract

For the first time, a coding complete genome of an RNA virus has been sequenced in its original form. Previously, RNA was sequenced by the chemical degradation of radiolabeled RNA, a difficult method that produced only short sequences. Instead, RNA has usually been sequenced indirectly by copying it into cDNA, which is often amplified to dsDNA by PCR and subsequently analyzed using a variety of DNA sequencing methods. We designed an adapter to short highly conserved termini of the influenza A virus genome to target the (-) sense RNA into a protein nanopore on the Oxford Nanopore MinION sequencing platform. Utilizing this method with total RNA extracted from the allantoic fluid of influenza rA/Puerto Rico/8/1934 (H1N1) virus infected chicken eggs (EID50 6.8 × 10^9), we demonstrate successful sequencing of the coding complete influenza A virus genome with 100% nucleotide coverage, 99% consensus identity, and 99% of reads mapped to influenza A virus. By utilizing the same methodology one can redesign the adapter in order to expand the targets to include viral mRNA and (+) sense cRNA, which are essential to the viral life cycle, or other pathogens. This approach also has the potential to identify and quantify splice variants and base modifications, which are not practically measurable with current methods.
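The adapter design described above comes down to choosing an oligo overhang that base-pairs with the conserved 3′ terminus of the target RNA. The following is a minimal sketch of that complementarity step only. The 12 nt terminal sequence is an assumption taken from the widely used Uni-12 primer, not from this abstract, and the function name is hypothetical. The authors' actual adapter oligos are not given here.

```python
# Watson-Crick partners for an RNA template paired with a DNA oligo.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}


def complementary_dna(rna_5_to_3: str) -> str:
    """Return the DNA oligo (written 5'->3') that pairs with an RNA given 5'->3'."""
    # Reverse the RNA so the result reads 5'->3', then complement each base.
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna_5_to_3.upper()))


# ASSUMPTION: conserved 3'-terminal 12 nt of an influenza A vRNA segment,
# written 5'->3'. The degenerate +4 position is shown as U for simplicity.
VRNA_3PRIME_END = "CCUGCUUUUGCU"

overhang = complementary_dna(VRNA_3PRIME_END)
print(overhang)
```

Retargeting the adapter to viral mRNA, cRNA, or another pathogen would, under the same assumptions, only require swapping the terminal sequence passed to the function.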

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
(A) Influenza A viruses contain highly conserved 12 and 13 nt sequences at the 3′ and 5′ termini. (B) The key component of Oxford Nanopore direct RNA sequencing is a Reverse Transcriptase Adapter (RTA) which targets poly(A) mRNA and is ligated to the 3′ end of the mRNA. A sequencing adapter is then ligated to the RTA which directs the RNA strand into the pore for sequencing. (C) The RTA was modified to target the 3′ conserved 12 nt of the influenza A virus genome. (D) The modified RTA hybridizes and is ligated to vRNA in the first step of direct RNA sequencing.
Figure 2
MinION direct RNA and MiSeq M-RTPCR sequencing covered the coding regions of the PB2, PB1, PA, HA, NP, NA, M, and NS genome segments of the influenza A virus genome from the influenza rA/Puerto Rico/8/1934 (H1N1) crude viral samples to an average depth of 2,789 and 1,478 respectively. Negative-sense slope coverages in the MinION results confirm the directionality of the sequencing and capture method.
Figure 3
The extreme 3′ termini (Uni-12) of all segments were fully sequenced and matched the expected sequence with the exception of the degeneracy at the +4 position which was not resolved. The sequences for the extreme 5′ termini (Uni-13) that were obtained match the expected sequences with the exception of a C to G substitution at the −9 position in the segments PB1 and PB2. The loss of coverage at the extreme 5′ end of the molecule is most likely due to unreliable processivity as the last of the molecule passes and resulted in the final nine nucleotides not being sequenced in some of the segments. These missing bases in the extreme termini represent the difference between a coding complete genome, which is claimed here, and a complete genome.
Figure 4
The aligned read length distributions correspond to the expected lengths (dashed lines) of the respective segments (NS 890 nt; M 1,027 nt; NA 1,413 nt; NP 1,565 nt; HA 1,778 nt; PA 2,233 nt; PB1 and PB2 2,341 nt) from the influenza rA/Puerto Rico/8/1934 (H1N1) crude viral samples. As the segment length increases, the read length distribution falls further short of the expected length, presumably due to RNA degradation. Aligned read lengths include insertion errors, accounting for the presence of reads larger than the expected value. Due to cases of large insertion errors, 14 total reads longer than 2,500 nucleotides were observed.
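The comparison in Figure 4 between aligned read lengths and expected segment lengths can be sketched as follows. The expected lengths are those listed in the caption. The read lengths in the usage lines, the 5% tolerance, and the function name are hypothetical and chosen only for illustration. This is not the authors' analysis code.

```python
# Expected segment lengths (nt) as listed in the Figure 4 caption.
EXPECTED_LENGTH_NT = {
    "NS": 890, "M": 1027, "NA": 1413, "NP": 1565,
    "HA": 1778, "PA": 2233, "PB1": 2341, "PB2": 2341,
}


def near_full_length_fraction(read_lengths, expected_nt, tolerance=0.05):
    """Fraction of reads at least (1 - tolerance) * expected_nt long.

    Reads longer than expected (e.g. from insertion errors) also count.
    Returns 0.0 for an empty list of reads.
    """
    if not read_lengths:
        return 0.0
    threshold = (1.0 - tolerance) * expected_nt
    return sum(1 for n in read_lengths if n >= threshold) / len(read_lengths)


# HYPOTHETICAL aligned read lengths, not data from the paper.
ns_reads = [880, 890, 600, 895]
pb2_reads = [2341, 1200, 1800, 2300]

print(near_full_length_fraction(ns_reads, EXPECTED_LENGTH_NT["NS"]))
print(near_full_length_fraction(pb2_reads, EXPECTED_LENGTH_NT["PB2"]))
print(sum(EXPECTED_LENGTH_NT.values()))  # total length of the eight segments
```

A lower fraction for the longer segments would be consistent with the caption's observation that read lengths fall further short of the expected length as segment length increases.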
