Review
Biotechniques. 2014 Feb 1;56(2):61-4, 66, 68, passim.
doi: 10.2144/000114133. eCollection 2014.

Library construction for next-generation sequencing: overviews and challenges

Steven R Head et al. Biotechniques.

Abstract

High-throughput sequencing, also known as next-generation sequencing (NGS), has revolutionized genomic research. In recent years, NGS technology has steadily improved, with costs dropping and the number and range of sequencing applications increasing exponentially. Here, we examine the critical role of sequencing library quality and consider important challenges when preparing NGS libraries from DNA and RNA sources. Factors such as the quantity and physical characteristics of the RNA or DNA source material, as well as the desired application (e.g., genome sequencing, targeted sequencing, RNA-seq, ChIP-seq, RIP-seq, and methylation analysis), are addressed in the context of preparing high-quality sequencing libraries. Current methods for preparing NGS libraries from single cells are also discussed.

Keywords: ChIP-seq; DNA; DNA-seq; RIP-seq; RNA; RNA-seq; deep sequencing; library preparation; next-generation sequencing.

Conflict of interest statement

Competing Interests

The authors DRS, SRH, and PO are founding scientists and consultants to Transplant Genomics Inc.

Figures

Figure 1. Basic workflow for NGS library preparation
RNA or DNA is extracted from sample tissue/cells and fragmented. RNA is converted to cDNA by reverse transcription. DNA fragments are converted into the library by ligation to sequencing adapters containing specific sequences designed to interact with the NGS platform, either the surface of the flow cell (Illumina) or beads (Ion Torrent). The next step involves clonal amplification of the library, by either cluster generation for Illumina or microemulsion PCR for Ion Torrent. The final step generates the actual sequence via the chemistries for each technology. One difference between the two technologies is that Illumina allows sequencing from both ends of the library insert (i.e., paired-end sequencing). Cell photograph courtesy of Annie Cavanagh, Wellcome Images.
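
The paired-end idea mentioned in the Figure 1 legend can be illustrated with a minimal sketch. It assumes the usual convention that read 1 starts at one end of the insert and read 2 starts at the other end on the opposite strand, so read 2 appears as the reverse complement of the insert's far end. The function names, the insert sequence, and the read length are hypothetical and chosen only for illustration; adapters, sequencing errors, and quality scores are ignored.

```python
# Toy model of paired-end sequencing of one library insert.
# All names and sequences here are illustrative assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def paired_end_reads(insert, read_length):
    """Return (read1, read2) for an insert sequenced from both ends.

    read1 is the first `read_length` bases of the insert; read2 is the
    first `read_length` bases of the opposite strand, read from the far end.
    """
    read1 = insert[:read_length]
    read2 = reverse_complement(insert)[:read_length]
    return read1, read2


insert = "AACCGGTTAC"  # hypothetical 10-base insert
read1, read2 = paired_end_reads(insert, read_length=4)
print(read1, read2)  # AACC GTAA
```

In single-end sequencing only `read1` would be available; the second read adds information about the other end of the same fragment.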
Figure 2. DNA library preparation using a transposase-based method (Nextera) developed by Illumina
The transpososome complex comprises an engineered transposase pre-loaded with two double-stranded sequencing adapters. The transpososome simultaneously fragments the DNA and inserts the adapters. The full Illumina adapter sequences are completed during subsequent PCR cycling, after which the library is ready for quantitation and loading onto the flow cell.
Figure 3. Library preparation workflow for miRNA-seq
A) The Illumina workflow ligates a 3′ adenylated DNA adapter to the 3′ end of miRNA in a total RNA sample. Then, an RNA adapter is ligated to the 5′ end of the miRNA. The doubly ligated products are RT-PCR amplified to introduce barcodes for multiplex applications and generate sequencing libraries. The first read sequences the insert miRNA; a second and separate sequencing read is necessary to sequence the barcode. B) Ion Torrent’s workflow uses an RNA ligase to attach 5′ and 3′ adapters composed of hybrid RNA-DNA duplexes. An RT-PCR reaction amplifies the sample and introduces the barcodes to the library construct. In this method, the barcode and the miRNA insert are sequenced in a single read.
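
When the barcode and the miRNA insert are sequenced in a single read, as in panel B, each read has to be split into its sample barcode and its insert before analysis. The following is a minimal sketch of that step, assuming a read layout of barcode, then insert, then 3′ adapter. The layout, the barcode table, the adapter sequence, and the function name are all hypothetical; they are not taken from any vendor protocol, and real demultiplexers also tolerate mismatches.

```python
# Toy demultiplexer for reads that carry barcode and insert in one read.
# Barcodes, adapter sequence, and read layout are illustrative assumptions.
BARCODES = {"ACGT": "sample_1", "TGCA": "sample_2"}  # hypothetical barcodes
ADAPTER_3P = "GGCCAAGG"  # hypothetical 3' adapter sequence


def split_read(read, barcodes=BARCODES, adapter=ADAPTER_3P):
    """Return (sample, insert) for a read, or None if no barcode matches.

    The barcode must match exactly at the start of the read. Everything
    from the first occurrence of the adapter onward is trimmed away.
    """
    for barcode, sample in barcodes.items():
        if read.startswith(barcode):
            rest = read[len(barcode):]
            cut = rest.find(adapter)
            insert = rest if cut == -1 else rest[:cut]
            return sample, insert
    return None


insert = "TGAGGTAGTAGGTTGTATAGTT"  # arbitrary 22-nt insert for the demo
read = "ACGT" + insert + ADAPTER_3P + "AAAA"
print(split_read(read))  # ('sample_1', 'TGAGGTAGTAGGTTGTATAGTT')
```

In the workflow of panel A the barcode comes from a separate read, so the lookup would use that second read instead of a prefix of the first.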
Figure 4. Approaches for preparing RNA-seq libraries from single cells
A) Poly-adenylated RNA is reverse transcribed with an anchored oligo-dT primer carrying a universal primer sequence at its 5′ end. Next, polynucleotide tailing is used to add a poly(A) tail to the 3′ end of the cDNA. This cDNA can now be amplified with universal PCR primers containing an oligo-dT sequence at the 3′ end. Amplified cDNA can then be used in a standard DNA library construction protocol. B) An anchored oligo-dT primer initiates cDNA synthesis and adds a universal primer sequence. Next, the cDNA is polynucleotide tailed by the RT, producing a 3′ overhanging tail. Template switching is initiated on the 3′ end of the cDNA by hybridization of a second universal primer sequence containing complementary bases at its 3′ end. The template switching oligonucleotide is 3′ blocked (*) to prevent extension by the polymerase, whereas the 3′ end of the cDNA is extended to copy the second universal primer sequence onto the end of the cDNA. The cDNA can now be amplified by PCR. The PCR products are then taken into a standard library protocol. C) cDNA synthesis is initiated using a barcoded (orange) and anchored oligo-dT primer containing an Illumina adapter sequence (green) and T7 promoter sequence (purple) at the 5′ end. After second strand cDNA synthesis, the fully duplex T7 promoter element is used to initiate in vitro transcription and generate cRNA copies of the cDNA with the 5′ Illumina adapter and barcode. Finally, a second Illumina adapter is ligated to the 3′ end of the cRNA. A final RT-PCR amplification completes the construction of the library.
Figure 5. ChIP-seq procedure for detecting sequences at the sites of histone modifications or the recognition sequences of DNA binding proteins
Chromatin is crosslinked, fragmented either by micrococcal nuclease digestion or by sonication, and then incubated with antibodies for either the histone modification or protein of interest. Immunoprecipitation is performed using either Protein A or Protein G beads. After washing, the crosslinks are reversed and the DNA is eluted from the beads and purified, at which point the DNA can be taken into standard DNA library construction protocols.
Figure 6. RNA immunoprecipitation (RIP-seq) done by targeting RNA binding proteins (RBPs)
The basic principle of RIP-seq is immunoprecipitation of RBPs that are bound to target RNA molecules. The RNA molecules are then purified and a sequencing library is created. In some protocols, the RBP complex is chemically crosslinked to the target RNA; that crosslinking must be reversed after immunoprecipitation. We have found that crosslinking is not necessary for simple RIP-seq where the objective is to identify the RNA molecules bound by RBP, but it is required for CLIP-seq protocols that are used to identify the specific sequence motifs for RBP binding. The immunoprecipitation step can be done with antibodies directed at the specific RBP of interest, or the RBP can be tagged and expressed in the cells under study.
Figure 7. Approaches for the study of CpG methylation epigenetics (Methylseq)
A) A combination of methyl-sensitive and methyl-insensitive restriction enzymes can be used to selectively identify and compare the CpG methylation status of specific regions of sequence. B) Antibodies that specifically recognize methylated cytosines can be used to immunoprecipitate DNA fragments, followed by deep sequencing. C) Chemical treatment of DNA with sodium bisulfite results in the conversion of unmethylated cytosines to uracils. In contrast, methylated cytosines are protected. Subsequently, deep sequencing of these libraries reveals the methylation status of individual nucleotides.
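
The bisulfite logic in panel C can be sketched in a few lines. This is a simplified model, assuming a single strand, complete conversion, and no sequencing errors: unmethylated cytosines become uracil and are then read as T after amplification, while methylated cytosines are still read as C. The function names, the reference sequence, and the methylated position are hypothetical and serve only as an illustration.

```python
# Toy model of bisulfite conversion and per-base methylation calling.
# Single strand, complete conversion, no errors: illustrative assumptions.


def bisulfite_convert(seq, methylated_positions):
    """Return the sequence as it would be read after bisulfite treatment.

    Unmethylated C is converted to U and read as T; C at any index in
    `methylated_positions` is protected and stays C.
    """
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def call_methylation(reference, converted_read):
    """Map each reference C index to True (methylated) or False."""
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference, converted_read)):
        if ref_base == "C":
            calls[i] = read_base == "C"
    return calls


reference = "ACGTCGAC"  # hypothetical reference with C at indices 1, 4, 7
read = bisulfite_convert(reference, methylated_positions=[1])
print(read)                               # ACGTTGAT
print(call_methylation(reference, read))  # {1: True, 4: False, 7: False}
```

Comparing the converted read against the unconverted reference is what allows methylation status to be resolved at individual nucleotides; real pipelines must also handle the complementary strand and incomplete conversion.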
