Nucleic Acids Res. 2005 Jun 24;33(11):3570-81.
doi: 10.1093/nar/gki668. Print 2005.

Human microRNA prediction through a probabilistic co-learning model of sequence and structure

Jin-Wu Nam et al. Nucleic Acids Res.

Abstract

MicroRNAs (miRNAs) are small regulatory RNAs of approximately 22 nt. Although hundreds of miRNAs have been identified through experimental complementary DNA cloning methods and computational efforts, previous approaches could detect only abundantly expressed miRNAs or close homologs of previously identified miRNAs. Here, we introduce ProMiR, a probabilistic co-learning model for miRNA gene finding that simultaneously considers the structure and sequence of miRNA precursors (pre-miRNAs). On 5-fold cross-validation with 136 referenced human datasets, the classifier achieves 73% sensitivity and 96% specificity. When applied to genome screening for novel miRNAs on human chromosomes 16, 17, 18 and 19, ProMiR effectively searches for distantly homologous patterns over diverse pre-miRNAs, detecting at least 23 novel miRNA gene candidates. Importantly, the miRNA gene candidates do not demonstrate clear sequence similarity to the known miRNA genes. By quantitative PCR following RNA interference against Drosha, we experimentally confirmed that 9 of the 23 representative candidate genes express transcripts that are processed by the miRNA biogenesis enzyme Drosha in HeLa cells, indicating that ProMiR may successfully predict miRNA genes with at least 40% accuracy. Our study suggests that the miRNA gene family may be more abundant than previously anticipated, and may confer highly extensive regulatory networks on eukaryotic cells.

Figures

Figure 1
Pairwise representation of stem–loop structures and state sequences of pre-miRNAs, where the state of each pair includes structural information and mature miRNA region information (hidden states). (a) The structure of the pre-miRNA. (b) The transition and emission scheme of the structural states and the hidden states for pairwise sequence in the dotted rectangle shown in (a). T0M, TDM, TMN and TMI are transition probabilities. EM(GU), ED(−C), EM(GC), EN(UU), EM(GU) and EI(U−) are emission probabilities. (c) The four-state finite state automaton. Finally, the probability of the pairwise sequence is assigned by multiplication of the transition probabilities and the emission probabilities.
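The final sentence of the Figure 1 caption (path probability as a product of transition and emission probabilities) can be sketched as follows. This is a minimal illustration, not the ProMiR implementation: the state labels 0, M, D, N and I and the emitted pairs come from the caption, but every numeric probability, the example path, and the function names are assumptions made up for this sketch. The caption's "−" gap symbol is written as an ASCII hyphen here.

```python
import math

# Placeholder probabilities for illustration only; NOT values from the paper.
# Keys are (previous_state, next_state); "0" is the start state.
TRANSITION = {
    ("0", "M"): 0.9,
    ("M", "M"): 0.7, ("M", "D"): 0.1, ("M", "N"): 0.1, ("M", "I"): 0.1,
    ("D", "M"): 0.8, ("N", "M"): 0.8, ("I", "M"): 0.8,
}

# Keys are (state, emitted pair); "-" marks a gap on one strand.
EMISSION = {
    ("M", "GU"): 0.05, ("M", "GC"): 0.30,
    ("D", "-C"): 0.25, ("N", "UU"): 0.10, ("I", "U-"): 0.25,
}


def path_log_probability(states, pairs, start="0"):
    """Sum log transition and log emission probabilities along one state path."""
    if len(states) != len(pairs):
        raise ValueError("states and pairs must have the same length")
    log_p = 0.0
    prev = start
    for state, pair in zip(states, pairs):
        log_p += math.log(TRANSITION[(prev, state)])
        log_p += math.log(EMISSION[(state, pair)])
        prev = state
    return log_p


def path_probability(states, pairs, start="0"):
    """Product of the transition and emission probabilities along the path."""
    return math.exp(path_log_probability(states, pairs, start))


if __name__ == "__main__":
    # Hypothetical path over the pairs named in the caption.
    states = ["M", "D", "M", "N", "M", "I"]
    pairs = ["GU", "-C", "GC", "UU", "GU", "U-"]
    print(path_probability(states, pairs))
```

Working in log space avoids numerical underflow when the pairwise sequence is long; the product form in the caption is recovered by exponentiating the sum.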
Figure 2
The ROC curve, a plot of sensitivity (y-coordinate) versus the false positive rate (FPR; 1 − specificity; x-coordinate). The area under the ROC curve is 0.936 by non-parametric estimation. The arrow indicates the threshold point, P = 0.033, where specificity is 96% and sensitivity is 73%.
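The quantities in the Figure 2 caption can be illustrated with a short sketch. It assumes that a candidate is called positive when its score is at or above the threshold, and it uses the Mann-Whitney statistic as one common non-parametric estimate of the area under the ROC curve; the source does not say which estimator was used. The function names and the toy scores are hypothetical.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Return (sensitivity, specificity), calling score >= threshold positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc_mann_whitney(scores, labels):
    """Non-parametric AUC: the fraction of (positive, negative) pairs in which
    the positive example scores higher, counting ties as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data for illustration only; not results from the paper.
    scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1]
    labels = [1, 1, 0, 1, 0, 0]
    print(sensitivity_specificity(scores, labels, threshold=0.35))
    print(auc_mann_whitney(scores, labels))
```

Each threshold yields one (FPR, sensitivity) point on the curve, so sweeping the threshold over all observed scores traces the full ROC curve.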
Figure 3
Flow chart for human miRNA gene finding. (a) The program HMmiRNApairwise, using the RNAfold algorithm, extracts extended stem–loops according to several criteria described in the Supplementary Material; (b) human EST database search; (c) ProMiR predicts pre-miRNA candidates, the region of the mature miRNA and the location of the functional strand; (d) screening by additional evidence: minimum free energy (MFE) values, vertebrate conservation and negative evidence; (e) experimental verification.
Figure 4
Experimental verification of the candidates predicted by ProMiR. (a) Comparison of the expected PCR results for the candidates in control and Drosha knockdown HeLa cells. (b) The left gel image shows the PCR intensity of glyceraldehyde-3-phosphate dehydrogenase (GAPDH) for each cDNA. The first lane is the PCR result following small interfering RNA (siRNA) treatment against luciferase in HeLa cells for 3 days, as the control; the second lane is the PCR result following siRNA treatment against Drosha in HeLa cells for 3 days. (c) NC16-1 and NC16-2, and likewise NC19-5 and NC19-6, each lie on a single transcript that contains two pre-miRNAs.
Figure 5
Phylogenetic tree for the nine new pre-miRNAs, several published miRNAs and other ncRNAs. The tree was drawn using the neighbor-joining method. The gray boxes indicate closely homologous (orthologous and paralogous) members. The dotted box contains other ncRNA sequences, i.e. tRNAs and scAlu RNA. The new pre-miRNAs show distantly homologous patterns to published pre-miRNAs: homologous group ‘a’ is distantly homologous to hsa-mir-193, and homologous group ‘b’ is a distant homolog of hsa-mir-108; however, the new and published pre-miRNAs lie closer to each other than to the other ncRNAs.
Figure 6
Differences in threshold cycle (CT) between control and Drosha-knockdown cells.
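A CT difference of the kind plotted in Figure 6 can be converted into a relative abundance under the usual idealization that each PCR cycle doubles the product. The sketch below is only an illustration of that arithmetic: the sign convention (control minus knockdown), the default efficiency of 2.0 and the function names are assumptions, and the example CT values are invented. It also omits normalization to a reference gene such as GAPDH.

```python
def delta_ct(ct_control, ct_knockdown):
    """CT difference, control minus knockdown (sign convention assumed here)."""
    return ct_control - ct_knockdown


def fold_change(ct_control, ct_knockdown, efficiency=2.0):
    """Relative abundance in knockdown versus control cells, assuming the
    product is multiplied by `efficiency` at every cycle."""
    return efficiency ** delta_ct(ct_control, ct_knockdown)


if __name__ == "__main__":
    # Invented CT values: the transcript crosses the threshold two cycles
    # earlier in knockdown cells than in control cells.
    print(delta_ct(25.0, 23.0))
    print(fold_change(25.0, 23.0))
```

Under this convention a positive difference means the transcript is more abundant in the knockdown cells, which is the pattern expected for a primary transcript that accumulates when Drosha processing is reduced.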
Figure 7
Permutation test for the structure and sequence of mature miRNA. The solid line indicates the change of probability P according to base permutation. The dotted line indicates the change of probability according to base pair permutation.
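The two permutation schemes named in the Figure 7 caption can be sketched as below, under one plausible reading: a base permutation shuffles the bases on one strand and so disrupts pairing, while a base pair permutation shuffles whole pairs and so keeps the pairing but changes the sequence order. This reading, the helper names and the toy stem are assumptions; the caption does not spell out the procedure, and the model probability P itself is not computed here.

```python
import random

# Watson-Crick pairs plus the G-U wobble pair.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def permute_bases(pairs, rng):
    """Shuffle the 5' strand bases while the 3' strand stays fixed.
    Base composition is kept, but the pairing is generally disrupted."""
    five_prime = [p[0] for p in pairs]
    rng.shuffle(five_prime)
    return [(base, p[1]) for base, p in zip(five_prime, pairs)]


def permute_base_pairs(pairs, rng):
    """Shuffle whole base pairs. Every pair stays intact, so the amount of
    pairing is kept while the sequence order changes."""
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    return shuffled


def paired_fraction(pairs):
    """Fraction of positions that form a canonical or wobble pair."""
    return sum(1 for p in pairs if p in CANONICAL) / len(pairs)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    stem = [("G", "C"), ("A", "U"), ("U", "A"), ("G", "U"), ("C", "G"), ("U", "U")]
    print(paired_fraction(stem))
    print(paired_fraction(permute_base_pairs(stem, rng)))
    print(paired_fraction(permute_bases(stem, rng)))
```

Comparing a model's score on the original stem with its score under each scheme separates the contribution of sequence from that of structure, which is what the solid and dotted lines in the figure contrast.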
