Nucleic Acids Res. 2016 Apr 7;44(6):e53.
doi: 10.1093/nar/gkv1335. Epub 2015 Dec 3.

Prioritizing and selecting likely novel miRNAs from NGS data

Christina Backes et al. Nucleic Acids Res.

Abstract

Small non-coding RNAs play a key role in many physiological and pathological processes. Since 2004, miRNA sequences have been catalogued in miRBase, which is currently in its 21st version. We investigated sequence and structural features of miRNAs annotated in miRBase and compared them between different versions of this reference database. We found that the two most recent releases (v20 and v21) are influenced by next-generation sequencing based miRNA predictions and deviate significantly from miRNAs discovered prior to the high-throughput profiling period. From the analysis of miRBase, we derived a set of key characteristics to predict new miRNAs and applied the implemented algorithm to evaluate novel blood-borne miRNA candidates. We carried out 705 individual whole miRNA sequencings of blood cells and collected a total of 9.7 billion reads. Using miRDeep2, we initially predicted 1452 potentially novel miRNAs. After excluding false positives, 518 candidates remained. These novel candidates were ranked according to their distance to the features in the early miRBase versions, allowing for an easier selection of a subset of putative miRNAs for validation. Selected candidates were successfully validated by qRT-PCR and northern blotting. In addition, we implemented a web server for ranking potential miRNA candidates, which is available at www.ccb.uni-saarland.de/novomirank.
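
The ranking idea in the abstract (ordering candidates by their distance to features of early miRBase entries) can be illustrated with a small sketch. This is not the authors' implementation: the feature names (`precursor_length`, `gc_content`), the toy values, and the choice of the mean absolute z-score as the distance are all assumptions made for illustration. The mean absolute z-score is only one possible reading of the "absolute value of average z-scores" mentioned in the Figure 3 caption.

```python
from statistics import mean, pstdev


def fit_reference(reference_rows):
    """Compute (mean, standard deviation) per feature from reference precursors."""
    names = reference_rows[0].keys()
    stats = {}
    for name in names:
        values = [row[name] for row in reference_rows]
        stats[name] = (mean(values), pstdev(values))
    return stats


def distance_score(candidate, stats):
    """Mean absolute z-score of a candidate relative to the reference (assumed metric)."""
    z_scores = []
    for name, (mu, sigma) in stats.items():
        if sigma == 0:
            continue  # a constant reference feature cannot be standardized
        z_scores.append(abs((candidate[name] - mu) / sigma))
    return mean(z_scores)


def rank_candidates(candidates, stats):
    """Return (name, score) pairs, smallest distance (most reference-like) first."""
    scored = [(name, distance_score(feats, stats)) for name, feats in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Toy reference set standing in for early miRBase precursors (hypothetical values).
reference = [
    {"precursor_length": 80, "gc_content": 0.45},
    {"precursor_length": 90, "gc_content": 0.50},
    {"precursor_length": 100, "gc_content": 0.55},
]

# Hypothetical candidates: one close to the reference, one far from it.
candidates = {
    "cand_A": {"precursor_length": 90, "gc_content": 0.50},
    "cand_B": {"precursor_length": 180, "gc_content": 0.80},
}

stats = fit_reference(reference)
ranking = rank_candidates(candidates, stats)
for name, score in ranking:
    print(f"{name}\t{score:.2f}")
```

With this ordering, candidates at the top of the list resemble the reference set most closely. Under the premise stated in the abstract, those are the more promising ones to select for validation.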


Figures

Figure 1.
For different features, the distribution across different miRBase versions is presented as Box-Whisker plots. The novel miRNAs discovered in our study are included in grey. The miRBase version sets are numbered according to Table 1.
Figure 2.
Principal Component Analysis. The early miRBase versions in red (sets I + II, see also Table 1) fit well to each other and show a central cluster. The middle versions of the miRBase in blue (III + IV) still fit nicely to these initial miRNA precursors, while the newer versions (V + VI) and the miRDeep predicted precursors in green and grey scatter at the edge of the distribution. POV: proportion of variance.
Figure 3.
Histogram plot of the absolute value of average z-scores from early versions of miRBase. With increasing version, the distance from the initial miRNA precursors increases significantly.
Figure 4.
Selected examples of secondary structures for miRNA precursors with a good score in our ranked list. Each panel presents one miRNA precursor, with the 5p- and 3p-miRNA in orange and blue. Each panel also gives the mature sequences, the overall distance (score) from the reference distribution (miRBase v1–7), and the summarized normalized base counts (per 10 million reads) over all samples for the precursor.
Figure 5.
Selected examples of secondary structures for miRNA precursors with a bad score in our ranked list. Each panel presents one miRNA precursor, with the 5p- and 3p-miRNA in orange and blue. Each panel also gives the mature sequences, the overall distance (score) from the reference distribution (miRBase v1–7), and the summarized normalized base counts (per 10 million reads) over all samples for the precursor. Panel A shows the most divergent miRNA precursor according to our score. Panel B shows the longest miRNA precursor overall.
Figure 6.
Validation of novel miRNAs by qRT-PCR and northern blots. Panel A shows amplification products of qRT-PCR in three RNA pools (P1-P3) on a Bioanalyzer DNA 1000 chip, Panel B on conventional 3% agarose gels. Negative controls included a no-template control for reverse transcription (NTRT), an RT reaction without enzyme (RT-) and a no-template PCR control for each specific primer (NTC). Because the qRT-PCR system used depends on polyadenylation at the 3′ end of mature miRNAs, followed by reverse transcription using an oligo-dT primer that includes a universal tag sequence for the qPCR, amplification products of mature miRNAs are ≈80–95 bp, depending on the number of A residues added to the miRNA sequence. The ladder bands shown represent 50 and 100 bp. For 11 miRNAs, specific bands at 80–90 bp could be detected. All PCR products were subcloned into pGEM and Sanger sequenced (see Supplementary Table S4) to verify specific amplification of novel miRNAs. Panels C and D show northern blots detecting mature miR-1005-5p (C) and -3p (D) with sequence-specific radio-labelled probes (left side) in HEK293T cells transfected with pSG5 vector carrying the inserted mir-1005 precursor sequence. The correct size of the novel mature miRNAs was confirmed by stripping and rehybridizing both nylon membranes with specific radio-labelled probes for the high-confidence miR-20a-5p (right side). The loading control demonstrates equal RNA amounts in all lanes.
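
The captions of Figures 4 and 5 report base counts normalized per 10 million reads and summarized over all samples. A minimal sketch of such a normalization follows. It assumes that "summarized" means a plain sum over samples and that each sample is scaled by its own total read count. The function names and the example numbers are hypothetical and are not taken from the study.

```python
READS_PER_UNIT = 10_000_000  # normalization target: counts per 10 million reads


def normalize_per_10_million(raw_count, total_reads):
    """Scale a raw count to the number expected in 10 million reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return raw_count * READS_PER_UNIT / total_reads


def summed_normalized_count(samples):
    """Sum normalized counts over (raw_count, total_reads) pairs, one per sample."""
    return sum(normalize_per_10_million(raw, total) for raw, total in samples)


# Hypothetical example: one precursor observed in two samples of different depth.
example_samples = [(50, 20_000_000), (10, 5_000_000)]
example_total = summed_normalized_count(example_samples)
print(example_total)
```

Normalizing each sample before summing keeps deeply sequenced samples from dominating the total.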

