2010 Jan 18;11 Suppl 1:S29.
doi: 10.1186/1471-2105-11-S1-S29.

Prediction of novel precursor miRNAs using a context-sensitive hidden Markov model (CSHMM)

Sumeet Agarwal et al. BMC Bioinformatics.

Abstract

Background: It has become apparent in the last few years that small non-coding RNAs (ncRNAs) play a very significant role in biological regulation. Among these, microRNAs (miRNAs), small regulatory RNAs of 22-23 nucleotides, have been a major object of study because they have been found to be involved in some basic biological processes. So far, about 706 miRNAs have been identified in humans alone. However, many more miRNAs are expected to be encoded in the human genome. In this report, we propose and extensively test a "context-sensitive" Hidden Markov Model (CSHMM) to represent miRNA structures. We also demonstrate how this model can be used in conjunction with filters as an ab initio method for miRNA identification.

Results: The probabilities of the CSHMM were estimated using known human miRNA sequences. A classifier for miRNAs based on the likelihood score of this "trained" CSHMM was evaluated by: (a) cross-validation estimates using known human sequences, (b) predictions on a dataset of known miRNAs, and (c) predictions on a dataset of non-coding RNAs. The CSHMM was compared with two recently developed methods, miPred and CID-miRNA. The results suggest that the CSHMM performs better than these methods. In addition, the CSHMM was used in a pipeline that includes filters that check for the presence of EST matches and the presence of Drosha cutting sites. This pipeline was used to scan human chromosome 19 and identify potential miRNAs. It was also used to identify novel miRNAs from small RNA sequences of normal human leukocytes obtained by deep sequencing (Solexa). A total of 49 and 308 novel miRNAs were predicted from chromosome 19 and from the small RNA sequences, respectively.

Conclusion: The results suggest that the CSHMM is likely to be a useful tool for miRNA discovery, either for analysis of individual sequences or for genome scans. Our pipeline, consisting of a CSHMM and filters to reduce false positives, shows promise as an approach for ab initio identification of novel miRNAs.

Figures

Figure 1
The context-sensitive HMM proposed to represent miRNA precursors, with estimated transition probabilities. State P1 emits the upper halves of the stem and symmetric bulges. States S1 and S3 emit the asymmetric bulges in the upper and lower sections, respectively. State S2 emits the loop. States C11 and C12 emit the lower halves of the stem and symmetric bulges, respectively (~ refers to probabilities averaged over the four possible top-of-stack symbols).
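
The stack mechanism mentioned in the caption can be illustrated with a toy sketch. It assumes the usual context-sensitive HMM idea: a pairwise-emission state pushes each base it emits onto a stack, and a context-sensitive state later pops a base and emits a symbol whose probability depends on the popped one. The function name `stem_log_likelihood` and the value `p_match=0.9` are hypothetical, chosen only for illustration. The sketch ignores bulges, the loop, G-U wobble pairs, and transition probabilities, so it is not the trained model from the paper.

```python
import math

# Watson-Crick partners only; G-U wobble pairs are deliberately left out of this toy.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def stem_log_likelihood(upper, lower, p_match=0.9):
    """Log-probability of the lower stem strand given the upper strand.

    Both strands are read 5'->3', so the last base of `upper` pairs with
    the first base of `lower`. `p_match` is an assumed, illustrative
    probability of emitting the complementary base.
    """
    if len(upper) != len(lower):
        raise ValueError("toy model assumes a bulge-free stem of equal strand lengths")

    # Pairwise-emission step: every emitted upper-stem base is pushed on the stack.
    stack = list(upper)

    # The remaining probability mass is split evenly over the three other bases.
    p_other = (1.0 - p_match) / 3.0

    log_p = 0.0
    for base in lower:
        # Context-sensitive step: the emission depends on the popped symbol.
        top = stack.pop()
        log_p += math.log(p_match if base == COMPLEMENT[top] else p_other)
    return log_p
```

A fully complementary pair such as `stem_log_likelihood("GGAU", "AUCC")` scores higher than one with a mismatch such as `stem_log_likelihood("GGAU", "AUCA")`, which is the kind of long-range pairing dependence an ordinary HMM cannot express.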
Figure 2
Receiver-Operating Characteristic (ROC) curve for the CSHMM classifier on the test set. Classification was done for a range of thresholds on the likelihood score, and the true and false positive rates were computed for each case. The point in red shows the result for the 'optimal' threshold, as determined by entropy minimization, and corresponds to the results reported in Table 2(a).
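
The threshold sweep described in this caption can be sketched as follows. The abstract does not spell out the entropy criterion, so the sketch assumes one common reading: choose the threshold that minimizes the size-weighted class entropy of the two groups it creates. The function names `roc_points` and `entropy_threshold` and the example scores are hypothetical.

```python
import math


def roc_points(scores, labels):
    """Return (threshold, false positive rate, true positive rate) triples.

    A sequence is called positive when its score is >= the threshold;
    `labels` holds 1 for a true miRNA and 0 for a negative.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative example")
    points = []
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((t, fp / neg, tp / pos))
    return points


def _entropy(labels):
    """Binary class entropy in bits; an empty or pure group has entropy 0."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def entropy_threshold(scores, labels):
    """Threshold minimizing the size-weighted entropy of the two groups (assumed criterion)."""
    n = len(scores)
    best_t, best_h = None, float("inf")
    for t in sorted(set(scores)):
        above = [y for s, y in zip(scores, labels) if s >= t]
        below = [y for s, y in zip(scores, labels) if s < t]
        h = (len(above) * _entropy(above) + len(below) * _entropy(below)) / n
        if h < best_h:
            best_t, best_h = t, h
    return best_t
```

With made-up scores `[-5.0, -4.0, -3.0, -1.0, -0.5, -0.2]` and labels `[0, 0, 0, 1, 1, 1]`, the two classes separate cleanly, so `entropy_threshold` picks `-1.0`, and the matching ROC point has a false positive rate of 0 and a true positive rate of 1.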
