G3 (Bethesda). 2019 Aug 8;9(8):2511-2520.
doi: 10.1534/g3.119.400201.

Molecular Traits of Long Non-protein Coding RNAs from Diverse Plant Species Show Little Evidence of Phylogenetic Relationships

Caitlin M A Simopoulos et al. G3 (Bethesda).

Abstract

Long non-coding RNAs (lncRNAs) represent a diverse class of regulatory loci with roles in development and stress responses throughout all kingdoms of life. LncRNAs, however, remain under-studied in plants compared to animal systems. To address this deficiency, we applied a machine learning prediction tool, Classifying RNA by Ensemble Machine learning Algorithm (CREMA), to analyze RNAseq data from 11 plant species chosen to represent a wide range of evolutionary histories. Transcript sequences of all expressed and/or annotated loci from plants grown in unstressed (control) conditions were assembled and input into CREMA for comparative analyses. On average, 6.4% of the plant transcripts were identified by CREMA as encoding lncRNAs. Gene annotation associated with the transcripts showed that up to 99% of all predicted lncRNAs for Solanum tuberosum and Amborella trichopoda were missing from their reference annotations whereas the reference annotation for the genetic model plant Arabidopsis thaliana contains 96% of all predicted lncRNAs for this species. Thus a reliance on reference annotations for use in lncRNA research in less well-studied plants can be impeded by the near absence of annotations associated with these regulatory transcripts. Moreover, our work using phylogenetic signal analyses suggests that molecular traits of plant lncRNAs display different evolutionary patterns than all other transcripts in plants and have molecular traits that do not follow a classic evolutionary pattern. Specifically, GC content was the only tested trait of lncRNAs with consistently significant and high phylogenetic signal, contrary to high signal in all tested molecular traits for the other transcripts in our tested plant species.

Keywords: CREMA; RNASeq; evolution; lncRNA; phylogenetic signal.

Figures

Figure 1
Total predicted lncRNAs from 10 plant species. The counts of putative lncRNAs are categorized by transcripts that appear in the reference annotation of each species (purple) and novel transcripts, or those that did not appear in transcriptome annotation (coral). The percentages of novel transcripts (coral) predicted as lncRNAs appear above each bar.
Figure 2
Mean trait values of transcripts predicted as lncRNAs (coral, circle) and all other assembled transcripts (purple, triangle). Species are ordered as per phylogenetic relationships.
Figure 3
Moran’s I local correlogram of mean trait values in lncRNAs and all other transcripts. Coral points indicate significant phylogenetic signal at a particular phylogenetic distance. The horizontal line marks the value expected under the null hypothesis of no phylogenetic signal: -1/(n-1) = -0.111, where n = 10 is the number of tested species. The 95% confidence intervals, computed by bootstrapping, are also plotted and were used to identify significant values.
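
The null value quoted in the Figure 3 caption can be checked with a short calculation. The Python sketch below is illustrative only and is not the authors' analysis pipeline. It computes the expected Moran's I under no autocorrelation, -1/(n-1), and a plain global Moran's I for a toy trait vector. The function names, the trait values, and the neighbour weight matrix are all hypothetical. A real correlogram would derive the weights from phylogenetic distances and evaluate the statistic per distance class.

```python
def null_expectation(n):
    """Expected Moran's I under the null of no autocorrelation: -1/(n-1)."""
    if n < 2:
        raise ValueError("need at least two observations")
    return -1.0 / (n - 1)


def morans_i(values, weights):
    """Global Moran's I for trait values and a square weight matrix.

    weights[i][j] is the (assumed) proximity weight between taxa i and j,
    with zeros on the diagonal.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]                      # centred trait values
    w_total = sum(sum(row) for row in weights)          # sum of all weights
    cross = sum(weights[i][j] * z[i] * z[j]
                for i in range(n) for j in range(n))    # weighted cross-products
    var = sum(d * d for d in z)                         # sum of squared deviations
    return (n / w_total) * (cross / var)


# Null expectation for the 10 species in Figure 3
print(round(null_expectation(10), 3))

# Toy example (hypothetical values): four taxa, neighbours along a chain
toy_trait = [1.0, 2.0, 3.0, 4.0]
toy_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(morans_i(toy_trait, toy_weights))
```

Observed values above the null expectation indicate that closely related taxa have more similar trait values than expected by chance. Values below it indicate the opposite.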
