Front Genet. 2015 Dec 22:6:352.
doi: 10.3389/fgene.2015.00352. eCollection 2015.

Addressing Bias in Small RNA Library Preparation for Sequencing: A New Protocol Recovers MicroRNAs that Evade Capture by Current Methods

Jeanette Baran-Gale et al. Front Genet. 2015.

Abstract

Recent advances in sequencing technology have helped unveil the unexpected complexity and diversity of small RNAs. A critical step in small RNA library preparation for sequencing is the ligation of adapter sequences to both the 5' and 3' ends of small RNAs. Studies have shown that adapter ligation introduces a significant but widely unappreciated bias in the results of high-throughput small RNA sequencing. We show that, because of this bias, the two widely used Illumina library preparation protocols produce strikingly different microRNA (miRNA) expression profiles in the same batch of cells. There are 102 highly expressed miRNAs that are >5-fold differentially detected, and some miRNAs, such as miR-24-3p, are over 30-fold differentially detected. While some level of bias in library preparation is not surprising, the massive differential bias between these two widely used adapter sets is not well appreciated. To mitigate this bias, the new Bioo Scientific NEXTflex V2 protocol uses a pool of adapters with random nucleotides at the ligation boundary. We show that this protocol robustly detects several miRNAs that evade capture by the Illumina-based methods. While these analyses do not establish a definitive gold standard for small RNA library preparation, the results of the NEXTflex protocol correlate best with RT-qPCR. As more laboratories seek to study small RNAs, researchers should be aware of how much results may differ between protocols, and should make an informed decision about the protocol that best fits their study.

Keywords: adapter dimers; adapter ligation bias; microRNA; sequencing; small RNA library preparation.

Figures

FIGURE 1
Key differences among different commercially available library preparation kits for small RNA sequencing. Some of the innovations in small RNA library preparation are highlighted here. First, current kits either use fixed adapter sequences or they introduce degenerate bases to both the 3′ and 5′ ligation boundary to improve adapter ligation efficiency. Second, adapter dimers can be generated causing a portion of sequenced reads to contain no insert. These dimers can be blocked or removed, thus increasing effective sequencing depth. Note: orange boxes indicating degenerate bases are not depicted in the adapter dimer graphic for the sake of simplicity.
FIGURE 2
Comparison of miRNA expression profiles among different library preparation protocols reveals massive differential bias. A comparison of the following four methods is shown: Illumina v1.5 library preparation sequenced on the GAIIx platform (v1.5_GAIIx), Illumina TruSeq library preparation sequenced on the GAIIx platform (TS_GAIIx), Illumina TruSeq library preparation sequenced on the HiSeq platform (TS_HiSeq), and Bioo Scientific NEXTflex V2 library preparation sequenced on the HiSeq platform (NF-HiSeq). Three biological replicate small RNA libraries were generated for each of the first three methods and one replicate was generated for the NF-HiSeq method. (A) Correlation of miRNA profiles between each pair of datasets (correlation values were calculated with Pearson's correlation coefficient). Similar results were obtained with Spearman's correlation coefficient, rho (data not shown). White and blue colors indicate strongest and weakest correlation, respectively. (B) miRNA expression profiles across all ten libraries. Hierarchical clustering was used to identify samples with closely related expression profiles. Expression is represented as z-score, indicating the number of standard deviations below (purple) or above (orange) the mean across all ten libraries. Both (A,B) used only the set of miRNAs identified as "highly expressed" (n = 358).
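The z-score described for panel (B) can be illustrated with a minimal sketch. The caption does not state whether the sample or population standard deviation was used, or whether values were log-transformed first; the population standard deviation and the function name `row_z_scores` are assumptions made here for illustration only.

```python
from statistics import mean, pstdev


def row_z_scores(values):
    """Z-score one miRNA's expression values across libraries.

    Each result is the number of standard deviations a library lies
    below (negative) or above (positive) the mean across all libraries.
    Assumption: population standard deviation (pstdev).
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # An invariant miRNA has no spread; report zeros rather than divide by zero.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical expression values for one miRNA in three libraries.
z = row_z_scores([1.0, 2.0, 3.0])
```

Applied row by row to a miRNA-by-library matrix, this would yield the kind of values a heatmap such as panel (B) displays.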
FIGURE 3
Fifty of the most abundant miRNAs are greater than ten-fold differentially detected between Illumina v1.5 and TruSeq. (A) Comparison of relative expression levels of miRNAs in MIN6 (n = 358) between the GAIIx and HiSeq sequencing platforms with libraries prepared by TruSeq (TS) is shown. Each data point represents the average relative expression level for an individual miRNA across three biological replicates. (B) Comparison of relative expression levels of miRNAs in MIN6 (n = 358) between the v1.5 and TruSeq (TS) library preparation methods is shown. Each data point represents the average relative expression level for an individual miRNA across three biological replicates. (C,D) Comparison of relative expression levels of miRNAs in MIN6 (n = 358, C) and mouse liver (n = 178, D) between the TruSeq (TS) and NEXTflex (NF) library preparation methods is shown. Each data point represents the average relative expression level for an individual miRNA across three biological replicates (A,B), or one biological replicate (C,D). Relative miRNA expression levels were calculated according to the following: log10 (mean(miRNA RPMM)), where RPMM is reads per million mapped reads. Pearson correlation values are displayed in red text within each panel, and gray dashed lines denote 10-fold differential expression.
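The relative expression formula in this caption, log10(mean(miRNA RPMM)), and the 10-fold dashed lines can be sketched as follows. The function names and the choice of denominator (all mapped reads in a library) are assumptions for illustration; the caption does not specify how mapped reads were counted or how zero counts were handled.

```python
import math
from statistics import mean


def rpmm(read_count, total_mapped_reads):
    """Reads per million mapped reads for one miRNA in one library."""
    return read_count * 1_000_000 / total_mapped_reads


def relative_expression(rpmm_replicates):
    """log10 of the mean RPMM across replicates, as stated in the caption.

    Raises ValueError if the mean is zero, since log10(0) is undefined.
    """
    return math.log10(mean(rpmm_replicates))


def exceeds_fold(expr_a, expr_b, fold=10.0):
    """True if two log10 expression values differ by more than `fold`.

    On a log10 scale, a 10-fold difference is a distance of 1 unit,
    which is what the dashed lines in the scatter plots mark.
    """
    return abs(expr_a - expr_b) > math.log10(fold)


# Hypothetical replicate values for one miRNA under two protocols.
protocol_a = relative_expression([1200.0, 1000.0, 800.0])
protocol_b = relative_expression([40.0, 50.0, 60.0])
flagged = exceeds_fold(protocol_a, protocol_b)
```

Working on the log10 scale turns a fold-change threshold into a fixed offset from the diagonal, which is why a pair of parallel dashed lines is enough to mark it.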
FIGURE 4
Measurements by quantitative PCR are best correlated with NEXTflex V2. (A) Comparison of relative expression levels of four miRNAs (miR-24-3p, miR-27b-3p, miR-29a-3p, and miR-375-3p) across five different methods of miRNA detection is shown. (B) Regression analysis of the relative expression of four miRNAs for each pair of detection methods is shown. The linear regression line is shown below the diagonal and the linear model parameters are shown above the diagonal. miRNA expression levels were normalized to miR-30e-5p, which represents a housekeeping miRNA due to its invariance and robust expression across most datasets. Linear model parameters: α = intercept, β = coefficient, σ² = squared residual error, R² = fraction of variance explained by model.
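The normalization and the linear model parameters named in this caption can be illustrated with a minimal ordinary least squares sketch. Several details are assumptions, not taken from the source: normalization is shown as a simple ratio to miR-30e-5p, σ² is computed as the residual sum of squares divided by n − 2, and the function names and input values are hypothetical.

```python
from statistics import mean


def normalize_to_reference(levels, reference="miR-30e-5p"):
    """Divide each miRNA's level by the reference miRNA's level.

    Assumption: ratio-based normalization; the reference itself is dropped.
    """
    ref = levels[reference]
    return {name: value / ref for name, value in levels.items() if name != reference}


def fit_line(x, y):
    """Ordinary least squares fit of y = alpha + beta * x.

    Returns alpha (intercept), beta (coefficient), sigma2 (residual
    variance, assumed here to be RSS / (n - 2)), and r2 (fraction of
    variance explained). Needs at least three points and non-constant x and y.
    """
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    rss = sum((yi - (alpha + beta * xi)) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - y_bar) ** 2 for yi in y)
    return {"alpha": alpha, "beta": beta, "sigma2": rss / (n - 2), "r2": 1 - rss / tss}


# Hypothetical levels of four miRNAs measured by two detection methods.
method_x = [1.0, 2.0, 3.0, 4.0]
method_y = [3.0, 5.0, 7.0, 9.0]
params = fit_line(method_x, method_y)
```

With only four miRNAs per comparison, each fit rests on four points, so the parameters are best read as descriptive rather than as precise estimates.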
