BMC Genomics. 2011 Dec 29;12:636.
doi: 10.1186/1471-2164-12-636.

Employing machine learning for reliable miRNA target identification in plants

Ashwani Jha et al. BMC Genomics. 2011.

Abstract

Background: miRNAs are small noncoding RNA molecules, ~21 nucleotides long, formed endogenously in most eukaryotes. They mainly control their target genes post-transcriptionally by interacting with and silencing them. While many tools have been developed for animal miRNA target identification, plant miRNA target identification has seen limited development. Most plant tools center on exact complementarity matching, and very few consider other factors such as multiple target sites and the role of flanking regions.

Result: In the present work, a Support Vector Regression (SVR) approach was implemented for plant miRNA target identification, using position-specific dinucleotide density variation around the target sites to yield highly reliable results. The tool is named p-TAREF (plant-Target Refiner). Its performance was rigorously compared with that of other plant target prediction tools, and p-TAREF performed better in several respects. p-TAREF was also run on experimentally validated miRNA targets from species such as Arabidopsis, Medicago, rice and tomato and detected them accurately, suggesting broad usability across plant species. Using p-TAREF, targets were identified for the complete rice transcriptome, supported by expression and degradome data. miR156 emerged as an important component of the rice regulatory system, predominantly controlling genes associated with growth and transcription. The entire methodology is implemented in Java with a multi-threaded parallel architecture, enabling fast processing in both the web-server and standalone versions and allowing it to run concurrently even on a simple desktop computer. The web-server version also offers on-the-spot expression data analysis to gather experimental support for the predictions made.

Conclusion: A machine learning tool based on multivariate features has been implemented in a parallel, locally installable form for plant miRNA target identification. Comprehensive testing and benchmarking suggest reliable performance and broad usability for transcriptome-wide plant miRNA target identification.
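
The abstract describes features built from position-specific dinucleotide density around target sites but does not give the exact encoding. The following is a minimal Python sketch of one plausible encoding, not the authors' implementation (p-TAREF itself is written in Java). The function names, the flank length and the window size are hypothetical choices made for illustration only.

```python
from itertools import product

# All 16 dinucleotides over the RNA alphabet, in a fixed order.
DINUCLEOTIDES = ["".join(p) for p in product("ACGU", repeat=2)]


def dinucleotide_density(window):
    """Return the fraction of each of the 16 dinucleotides in `window`."""
    total = len(window) - 1  # number of overlapping dinucleotide positions
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    for i in range(total):
        pair = window[i:i + 2]
        if pair in counts:  # skip pairs containing ambiguous bases such as N
            counts[pair] += 1
    return [counts[d] / total if total > 0 else 0.0 for d in DINUCLEOTIDES]


def site_features(transcript, site_start, site_end, flank=10, window=5):
    """Concatenate dinucleotide densities of consecutive windows covering
    the target site plus `flank` nucleotides on each side.

    `flank` and `window` are illustrative assumptions, not values
    reported in the paper.
    """
    seq = transcript.upper().replace("T", "U")
    lo = max(0, site_start - flank)
    hi = min(len(seq), site_end + flank)
    region = seq[lo:hi]
    features = []
    for start in range(0, len(region) - window + 1, window):
        features.extend(dinucleotide_density(region[start:start + window]))
    return features
```

A vector produced this way could be passed to any SVR implementation as one training or query instance. Keeping the windows in positional order is what makes the densities position-specific in this sketch.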

Figures

Figure 1
p-TAREF workflow. The figure illustrates the various working stages involved in p-TAREF along with concurrency.
Figure 2
The p-TAREF web-server. The web-server provides a friendly interface for loading query sequences, with parameter settings that include the energy cut-off, the mismatch level allowed, the SVR kernel to be used and the number of processors to be used. Its performance tab details all the performance measures used for benchmarking p-TAREF and comparing it with other tools.
Figure 3
Snapshot of the standalone GUI version of p-TAREF. Like its web-server counterpart, the standalone GUI version provides concurrency and most of the features, enabling quick scanning of batches and large amounts of sequence data. It also shows a progress bar indicating the status of the analysis.
Figure 4
Impact of concurrency on execution speed. p-TAREF was run on a set of genes for target identification with different numbers of processors used concurrently. Concurrency drastically reduced processing time, which is highly beneficial for accurate transcriptome-wide analysis.
Figure 5
The ROC plots for classifier models of p-TAREF with 10-fold cross validation. The classifier performed robustly, with high AUC values; the highest was observed for the polynomial kernel model. For cases A-F, two major experimentally validated data sources, Beauclair et al. (2010) and ASRP, were used to prepare the datasets. For cases G and H, tests were performed using the reference test set and protocol used by TAPIR and Target-align. The curves represent the following tests: A) Linear kernel/ASRP; B) Gaussian kernel/ASRP; C) Polynomial kernel/ASRP; D) Linear kernel/Beauclair; E) Gaussian kernel/Beauclair; F) Polynomial kernel/Beauclair; G) Target-align (TAPIR/Target-align dataset); H) p-TAREF (TAPIR/Target-align dataset).
Figure 6
miRNA target distribution in Oryza sativa. The major miRNA families found to target genes in the rice transcriptome.
Figure 7
Graphical representation of targets of miR156 in the rice transcriptome. All the targets shown here have an inverse expression correlation with miR156, with an absolute correlation value of 0.8 or higher.
Figure 8
Hypergeometric tests for enrichment of GO molecular function terms. Enrichment was assessed for the molecular functions associated with targets of miR156. The colored nodes are functional categories whose genes were significantly enriched in the pool of miR156 targets. The darker the color, the more significant the enrichment.
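
The hypergeometric enrichment test named in the Figure 8 caption can be sketched as an upper-tail probability. This is a generic textbook formulation in Python, not code from p-TAREF; the function name and argument names are hypothetical, and no multiple-testing correction is shown.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """Return P(X >= hits) for a hypergeometric draw.

    population: total number of genes in the background set
    annotated:  background genes carrying the GO term
    sample:     number of genes in the target pool (e.g. miRNA targets)
    hits:       target-pool genes carrying the GO term
    """
    upper = min(annotated, sample)
    denom = comb(population, sample)
    # Sum the probability mass for hits, hits + 1, ..., upper.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / denom
```

A small p-value means the term appears among the targets more often than expected from random draws without replacement. For a toy background of 10 genes with 5 annotated, drawing 5 targets that are all annotated gives `hypergeom_enrichment_p(10, 5, 5, 5)`, which equals 1/252.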

References

    1. Rhoades MW, Reinhart BJ, Lim LP, Burge CB, Bartel B, Bartel DP. Prediction of plant microRNA targets. Cell. 2002;110:513–520. doi: 10.1016/S0092-8674(02)00863-2.
    2. Dugas DV, Bartel B. Sucrose induction of Arabidopsis miR398 represses two Cu/Zn superoxide dismutases. Plant Mol Biol. 2008;67:403–417. doi: 10.1007/s11103-008-9329-1.
    3. Brodersen P, Sakvarelidze-Achard L, Bruun-Rasmussen M, Dunoyer P, Yamamoto YY, Sieburth L, Voinnet O. Widespread Translational Inhibition by Plant miRNAs and siRNAs. Science. 2008;320:1185–1190. doi: 10.1126/science.1159151.
    4. Lanet E, Delannoy E, Sormani R, Floris M, Brodersen P, Crété P, Voinnet O, Robaglia C. Biochemical Evidence for Translational Repression by Arabidopsis MicroRNAs. Plant Cell. 2009;21:1762–1768. doi: 10.1105/tpc.108.063412.
    5. Beauclair L, Yu A, Bouché N. microRNA-directed cleavage and translational repression of the copper chaperone for superoxide dismutase mRNA in Arabidopsis. Plant J. 2010;62:454–462. doi: 10.1111/j.1365-313X.2010.04162.x.
