PLoS Comput Biol. 2007 Nov;3(11):e238.
doi: 10.1371/journal.pcbi.0030238.

Intragenomic matching reveals a huge potential for miRNA-mediated regulation in plants

Morten Lindow et al. PLoS Comput Biol. 2007 Nov.

Abstract

microRNAs (miRNAs) are important post-transcriptional regulators, but the extent of this regulation is uncertain, both with regard to the number of miRNA genes and their targets. Using an algorithm based on intragenomic matching of potential miRNAs and their targets coupled with support vector machine classification of miRNA precursors, we explore the potential for regulation by miRNAs in three plant genomes: Arabidopsis thaliana, Populus trichocarpa, and Oryza sativa. We find that the intragenomic matching in conjunction with a supervised learning approach contains enough information to allow reliable computational prediction of miRNA candidates without requiring conservation across species. Using this method, we identify approximately 1,200, 2,500, and 2,100 miRNA candidate genes capable of extensive base-pairing to potential target mRNAs in A. thaliana, P. trichocarpa, and O. sativa, respectively. This is more than five times the number of currently annotated miRNAs in these plants. Many of these candidates are derived from repeat regions, yet they seem to contain the features necessary for correct processing by the miRNA machinery. Conservation analysis indicates that only a few of the candidates are conserved between the species. We conclude that there is a large potential for miRNA-mediated regulatory interactions encoded in the genomes of the investigated plants. We hypothesize that some of these interactions may be realized under special environmental conditions, while others can readily be recruited when organisms diverge and adapt to new niches.

Conflict of interest statement

Competing interests. The authors have declared that no competing interests exist.

Figures

Figure 1. Conceptual Model of Intragenomic Matching
mRNA sequences are matched against the genome and matches are prefiltered. Matches with miRNA precursor potential are selected for further processing.
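The matching idea in this caption can be illustrated with a toy sketch. It is not the authors' algorithm: it assumes exact reverse-complement matches of a fixed length `k` on the forward genome strand only, whereas the paper describes extensive (not necessarily perfect) base-pairing plus prefiltering. The function names and the value of `k` are hypothetical.

```python
from collections import defaultdict

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def index_kmers(genome, k):
    """Map every k-mer on the forward genome strand to its start positions."""
    index = defaultdict(list)
    for i in range(len(genome) - k + 1):
        index[genome[i:i + k]].append(i)
    return index


def micromatches(genome, mrna, k=20):
    """Pair genome segments with the mRNA segments they could base-pair with.

    Simplifying assumption: only perfect complementarity is reported.
    Returns (genome_start, mrna_start) tuples.
    """
    index = index_kmers(genome, k)
    hits = []
    for j in range(len(mrna) - k + 1):
        site = mrna[j:j + k]
        # A genome segment pairs with the site if it equals its reverse complement.
        for i in index.get(reverse_complement(site), []):
            hits.append((i, j))
    return hits


if __name__ == "__main__":
    genome = "TTTTACATGGTTTT"
    mrna = "GGGACCATGTTCAGGG"
    print(micromatches(genome, mrna, k=6))
```

A realistic version would tolerate mismatches and G:U pairs and would also scan the reverse strand; the sketch only shows how a genome segment and an mRNA segment end up paired.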
Figure 2. Overview of the Number of miRNA Candidates at Successive Steps of the Procedure
A genome assembly and a set of annotated mRNA transcripts are input to the intragenomic matching. Intragenomic matching. The intragenomic matching generates “micromatches,” each consisting of a pair of a genome segment and an mRNA segment. Also shown is the recovery of miRBase 8.2 loci and families. miSVM. The remaining number of miRNA loci and families after miSVM classification is shown (numbers in green). The numbers of miRNA candidate loci and families not overlapping repeat/CDS regions are shown in blue. miHomology. Conservation filters were applied to detect the subset of miRNA candidates that have at least one homolog in one of the other two organisms. miSquare. The conserved miRNA candidates with the additional requirement of orthologous targets.
Figure 3. The Structural Feature Space of miSVM
Distribution of structural features in the positive (blue) and negative (red) examples used to train miSVM. Arrows illustrate the feature on an example miRNA precursor, with the mature miRNA sequence highlighted in red.
Figure 4. Performance of miSVM
Density of the miSVM score of positive (blue) and negative examples (red).
Figure 5. The Principle of the miSquare Conservation Criteria
When two orthologous miRNAs have at least one instance of orthologous targets in the two organisms, we call this a miSquare.
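The miSquare criterion can be read as a simple set test. The sketch below assumes that the two miRNAs are already known to be orthologous and that target sets and a gene ortholog map are available as plain Python sets and dictionaries; these data structures and all identifiers are hypothetical, not taken from the paper.

```python
def is_misquare(targets_a, targets_b, gene_orthologs):
    """Check the miSquare condition for a pair of orthologous miRNAs.

    targets_a: set of gene IDs targeted by the miRNA in organism A.
    targets_b: set of gene IDs targeted by its ortholog in organism B.
    gene_orthologs: dict mapping a gene ID in A to the set of its
        ortholog gene IDs in B (hypothetical input format).

    Returns True if at least one target in A has an ortholog among
    the targets in B.
    """
    return any(gene_orthologs.get(gene, set()) & targets_b for gene in targets_a)


if __name__ == "__main__":
    orthologs = {"geneA1": {"geneB7"}, "geneA2": {"geneB3", "geneB4"}}
    print(is_misquare({"geneA1", "geneA2"}, {"geneB4"}, orthologs))
    print(is_misquare({"geneA1"}, {"geneB4"}, orthologs))
```

The first call satisfies the criterion through the `geneA2`/`geneB4` pair; the second does not, because the only target in A has no ortholog among the targets in B.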
Figure 6. Distribution of Family Sizes and Target Numbers
miRNA candidates outside coding sequences and repeat regions are counted and density plots constructed. Top row: Distribution of the number of targets per miRNA family. Bottom row: Distributions of family sizes. The conserved candidates generally have larger family sizes.
Figure 7. Conservation of miRNA Candidates and miRBase miRNAs
(A) Species conservation (miHomology) of all candidate miRNA families predicted with miSVM and not overlapping repeat or coding sequence. The Venn diagram shows the number of families that are species specific and those that are conserved within another species (see Materials and Methods). (B) Species conservation (miHomology) of only miRBase (version 8.2) miRNA families (repeat/CDS overlapping families). We only include miRBase miRNAs that can be mapped exactly to the genome according to the reported precursor sequence and where we can predict at least one target.
Figure 8. Distribution of miRNAs in the Genomic Landscape
A histogram for each of the three organisms showing the genomic origin of the miRNAs. The first histogram group in each plot shows the relative abundance of coding (CDS), untranslated (UTR), intron, repeat, and intergenic (IGR) regions in the genome. The second histogram group shows the relative abundance of miRBase miRNAs among these regions, with different colors for sense and antisense overlap. The last three histogram groups capture the same measurements for predicted miSVM, miHomology, and miSquare miRNAs. Novel predicted miRNAs (not found in miRBase) in these groups are illustrated with darker colors, whereas miRBase miRNAs found among our candidates have lighter colors (see legend).
Figure 9. miRNA Candidates Targeting TFs in Arabidopsis
Enrichment of Arabidopsis TF targets in different sets of miRNAs, comparing the relative abundance of TFs among the miRNA targets with the relative abundance of TFs in the Arabidopsis genome (∼5.9%). For the nonfiltered miRNA sets (red), the TF target counts are miRBase, 59 of 440; miSVM, 87 of 782; miHomology, 60 of 429; and miSquare, 59 of 408. For the repeat/CDS filtered miRNA sets (green), the counts are miRBase, 42 of 133; miSVM, 73 of 442; miHomology, 43 of 116; and miSquare, 42 of 103.
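The enrichment comparison in this caption is plain arithmetic and can be reproduced from the listed counts. The sketch below assumes each "x of y" pair means x TF targets out of y targets in total, and it uses 5.9% as the genome background although the caption gives that value only approximately. The fold-enrichment ratio is an illustrative summary, not a statistic reported in the caption.

```python
# Approximate fraction of TFs in the Arabidopsis genome, as stated in the caption.
GENOME_TF_FRACTION = 0.059

# (TF targets, all targets) per miRNA set, copied from the caption.
COUNTS = {
    "nonfiltered": {
        "miRBase": (59, 440),
        "miSVM": (87, 782),
        "miHomology": (60, 429),
        "miSquare": (59, 408),
    },
    "repeat/CDS filtered": {
        "miRBase": (42, 133),
        "miSVM": (73, 442),
        "miHomology": (43, 116),
        "miSquare": (42, 103),
    },
}


def tf_enrichment(tf_targets, all_targets, background=GENOME_TF_FRACTION):
    """Return (fraction of targets that are TFs, fold over the genome background)."""
    fraction = tf_targets / all_targets
    return fraction, fraction / background


if __name__ == "__main__":
    for group, sets in COUNTS.items():
        for name, (tf, total) in sets.items():
            fraction, fold = tf_enrichment(tf, total)
            print(f"{group:20s} {name:11s} {fraction:6.1%}  {fold:4.1f}x background")
```

Under these assumptions every set shows a TF fraction above the genome background, and the filtered sets show the larger ratios.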
Figure 10. miRNA Overlap with Sequenced Small RNAs
Percentage of Arabidopsis miRNAs with 20–23 nt coordinate overlap with sequenced and genome-mapped small RNAs from [36]. Three different sets are shown (all filtered for CDS/repeat overlap). (A) Random 22mers, 21,549 loci sampled randomly from the genome. (B) A set of 1,886 miRNA loci classified as non-miRNAs with miSVM. (C) A set of 334 miRNA loci classified as miRNAs by miSVM.

References

    1. Mallory AC, Vaucheret H. Functions of microRNAs and related small RNAs in plants. Nat Genet. 2006;38(Supplement 1):S31–S36.
    2. Ambros V, Bartel B, Bartel DP, Burge CB, Carrington JC, et al. A uniform system for microRNA annotation. RNA. 2003;9:277–279.
    3. Lim LP, Lau NC, Weinstein EG, Abdelhakim A, Yekta S, et al. The microRNAs of Caenorhabditis elegans. Genes Dev. 2003;17:991–1008.
    4. Lim LP, Glasner ME, Yekta S, Burge CB, Bartel DP. Vertebrate microRNA genes. Science. 2003;299:1540.
    5. Bonnet E, Wuyts J, Rouze P, Van de Peer Y. Detection of 91 potential conserved plant microRNAs in Arabidopsis thaliana and Oryza sativa identifies important target genes. Proc Natl Acad Sci U S A. 2004;101:11511–11516.
