Nat Protoc. 2014 Jul;9(7):1760-9.
doi: 10.1038/nprot.2014.118. Epub 2014 Jun 26.

Library preparation for highly accurate population sequencing of RNA viruses

Ashley Acevedo et al. Nat Protoc. 2014 Jul.

Abstract

Circular resequencing (CirSeq) is a novel technique for efficient and highly accurate next-generation sequencing (NGS) of RNA virus populations. The foundation of this approach is the circularization of fragmented viral RNAs, which are then redundantly encoded into tandem repeats by 'rolling-circle' reverse transcription. When sequenced, the redundant copies within each read are aligned to derive a consensus sequence of their initial RNA template. This process yields sequencing data with error rates far below the variant frequencies observed for RNA viruses, facilitating ultra-rare variant detection and accurate measurement of low-frequency variants. Although library preparation takes ∼5 d, the high-quality data generated by CirSeq simplifies downstream data analysis, making this approach substantially more tractable for experimentalists.
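
The consensus step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `tandem_consensus`, the assumption that the repeat length is already known, the assumption that copies are in register (no insertions or deletions) and the strict-majority rule are all simplifications chosen here for illustration.

```python
from collections import Counter


def tandem_consensus(read: str, repeat_len: int) -> str:
    """Majority-vote consensus across the tandem copies within one read.

    Illustrative assumptions: repeat_len is known in advance and the
    copies are perfectly in register (no indels).
    """
    if repeat_len <= 0:
        raise ValueError("repeat_len must be positive")
    # Split the read into complete copies; a partial trailing copy is dropped.
    n_copies = len(read) // repeat_len
    if n_copies < 3:
        raise ValueError("need at least three complete copies for a majority vote")
    copies = [read[i * repeat_len:(i + 1) * repeat_len] for i in range(n_copies)]

    consensus = []
    for column in zip(*copies):
        base, count = Counter(column).most_common(1)[0]
        # Keep the base only if a strict majority of copies agree; else mark it 'N'.
        consensus.append(base if 2 * count > n_copies else "N")
    return "".join(consensus)
```

For example, `tandem_consensus("ACGTACACGTACACTTAC", 6)` treats the read as three 6-nt copies; the single mismatch in the third copy is outvoted by the other two, so an error confined to one copy does not reach the consensus.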

Conflict of interest statement

COMPETING FINANCIAL INTEREST

The authors declare no competing financial interests.

Figures

Figure 1
Schematic of CirSeq. True genetic variants are represented as orange and green circles. Other colors represent enzymatic and sequencing errors. (Steps 1–18) Full-length viral genomic RNA is processed into short (85–100 nt) circular RNAs. No mutations are introduced during this process. (Steps 19–24) Rolling-circle reverse transcription yields tandem copies of the circular RNA template. Reverse transcriptase introduces nontemplated mutations into the tandem copies. (Steps 25–53) Tandem-copy cDNAs are cloned to generate a library of dsDNA molecules containing sequencing platform–specific adapter sequences. Additional nontemplated mutations are accumulated by enzymatic error during cloning. (Analysis) Sequenced reads are computationally processed using an algorithm that identifies and aligns tandem repeats within each sequencing read. A consensus of the aligned reads, which excludes sequencing and enzymatic errors accumulated in this process, can be used for experiment-specific analysis.
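
The analysis step in this caption relies on identifying tandem repeats within each read. A rough sketch of one way to find the repeat period is given below; it is an assumption-laden illustration, not the published algorithm. The function name `find_repeat_period`, the self-comparison approach and the default thresholds `min_period` and `min_identity` are all hypothetical choices made here.

```python
from typing import Optional


def find_repeat_period(read: str, min_period: int = 3,
                       min_identity: float = 0.9) -> Optional[int]:
    """Return the smallest shift at which the read matches itself.

    For each candidate period, compare every base with the base one
    period downstream and compute the fraction of matching pairs.
    The first period reaching min_identity is returned; None means
    no period passed the threshold. Indels are not handled.
    """
    for period in range(min_period, len(read) // 2 + 1):
        pairs = len(read) - period
        matches = sum(read[i] == read[i + period] for i in range(pairs))
        if matches / pairs >= min_identity:
            return period
    return None
```

Searching from the smallest period upward avoids reporting a multiple of the true repeat length, since every multiple of the period also matches well. A tolerance below 1.0 is needed because the copies differ wherever an enzymatic or sequencing error occurred.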
Figure 2
Analysis of coverage from libraries produced with different-sized RNA fragments. Blue, black and orange points denote the coverage depth at each genome position for 30-nt fragments, 90-nt fragments and partially degraded fragments (as shown in Figure 4b), respectively. Short (30 nt) and partially degraded RNA fragments reduce the uniformity of coverage as compared with longer (90 nt) RNA fragments. 30- and 90-nt coverage data were obtained from Acevedo et al.
Figure 3
Analysis of variant frequency error. (a,b) Correlation of variant frequencies between two sets of technical replicates, with 10 million reads each, is plotted. Color represents the level of measurement error estimated using a binomial model (a) or the total coverage observed at the genome position corresponding to each variant (b). Estimation of error using a binomial model accurately corresponds to the extent of correlation observed for variant frequencies in technical replicates. This error model is a function of the variant frequency and the coverage depth obtained for each position. Data were obtained from Acevedo et al.
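
As a hedged illustration of how a binomial model ties measurement error to variant frequency and coverage depth, the sketch below computes the textbook binomial standard error of an observed frequency. The caption does not give the exact formula used in the paper, so the function name `binomial_standard_error` and this particular form are assumptions.

```python
import math


def binomial_standard_error(variant_count: int, coverage: int) -> float:
    """Standard error of a variant frequency under binomial sampling.

    With frequency p = variant_count / coverage, the standard error is
    sqrt(p * (1 - p) / coverage). This is the standard textbook form and
    is assumed here, not taken from the paper.
    """
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    if not 0 <= variant_count <= coverage:
        raise ValueError("variant_count must lie between 0 and coverage")
    p = variant_count / coverage
    return math.sqrt(p * (1 - p) / coverage)


def relative_error(variant_count: int, coverage: int) -> float:
    """Standard error divided by the observed frequency."""
    if variant_count == 0:
        raise ValueError("relative error is undefined when no variants are observed")
    return binomial_standard_error(variant_count, coverage) / (variant_count / coverage)
```

Under this model, a variant seen 100 times at 10,000-fold coverage (frequency 1%) has a relative error of roughly 10%, whereas the same frequency at 1,000,000-fold coverage has a relative error of roughly 1%. This is consistent with the caption's point that error depends on both frequency and depth.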
Figure 4
Bioanalysis of size-selected fragmented RNA. (a,b) Digital gels (left) and fluorescence traces (right) of typical (a) and poor (b) purifications of fragmented RNA analyzed using a Bioanalyzer 2100. Size-selected RNA should migrate in a tight band with an average size of no less than 85 nt. Degradation of size-selected RNA fragments below this range (b) can result in poor yield of tandem repeat cDNA, thus reducing the number of unique molecules in the library, and it can distort coverage depth across the viral genome (Fig. 2).

References

    1. Shendure J, Ji H. Next-generation DNA sequencing. Nat Biotechnol. 2008;26:1135–1145.
    2. Dohm JC, Lottaz C, Borodina T, Himmelbauer H. Substantial biases in ultra-short-read data sets from high-throughput DNA sequencing. Nucleic Acids Res. 2008;36:e105.
    3. Hiatt JB, Patwardhan RP, Turner EH, Lee C, Shendure J. Parallel, tag-directed assembly of locally derived short sequence reads. Nat Methods. 2010;7:119–122.
    4. Schmitt MW, et al. Detection of ultra-rare mutations by next-generation sequencing. Proc Natl Acad Sci USA. 2012;109:14508–14513.
    5. Jabara CB, Jones CD, Roach J, Anderson JA, Swanstrom R. Accurate sampling and deep sequencing of the HIV-1 protease gene using a Primer ID. Proc Natl Acad Sci USA. 2011;108:20166–20171.
