Theory Biosci. 2020 Dec;139(4):349-359.
doi: 10.1007/s12064-020-00330-6. Epub 2020 Nov 21.

Are spliced ncRNA host genes distinct classes of lncRNAs?

Rituparno Sen et al. Theory Biosci. 2020 Dec.

Abstract

Many small nucleolar RNAs and many of the hairpin precursors of miRNAs are processed from long non-protein-coding host genes. In contrast to their highly conserved and heavily structured payload, the host genes feature poorly conserved sequences. Nevertheless, there is mounting evidence that the host genes have biological functions beyond their primary task of carrying a ncRNA as payload. So far, no connections between the function of the host genes and the function of their payloads have been reported. Here we investigate whether there is evidence for an association of host gene function or mechanisms with the type of payload. To assess this hypothesis, we test whether the miRNA host genes (MIRHGs), snoRNA host genes (SNHGs), and other lncRNA host genes can be distinguished based on sequence and/or structure features unrelated to their payload. A positive answer would imply a functional and mechanistic correlation between host genes and their payload, provided the classification does not depend on the presence and type of the payload. A negative answer would indicate that, to the extent that secondary functions are acquired, they are not strongly constrained by the prior, primary function of the payload. We find that the three classes can be distinguished reliably when the classifier is allowed to extract features from the payloads. They become virtually indistinguishable, however, as soon as only the sequence and structure of parts of the host gene distal from the snoRNA or miRNA payload are used for classification. This indicates that the functions of MIRHGs and SNHGs are largely independent of the functions of their payloads. Furthermore, there is no evidence that the MIRHGs and SNHGs form coherent classes of long non-coding RNAs distinguished by features other than their payloads.
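
The keywords below list k-mers among the features. As a minimal sketch of what a payload-independent sequence feature could look like, the following computes a normalized k-mer frequency vector for one RNA sequence. The function name `kmer_frequencies`, the default `k=3`, and the restriction to the alphabet `ACGU` are assumptions made for illustration; the abstract does not state which k or which normalization the authors used.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=3, alphabet="ACGU"):
    """Return relative k-mer frequencies in a fixed, lexicographic k-mer order.

    Hypothetical helper for illustration only; k and the alphabet are assumptions.
    """
    # Treat DNA and RNA spellings alike by mapping T to U.
    seq = seq.upper().replace("T", "U")
    # Enumerate all 4**k possible k-mers so every sequence yields a vector of equal length.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    # Count every overlapping k-mer in the sequence.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Ignore k-mers containing symbols outside the alphabet (e.g. N).
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]
```

Vectors of this kind, computed separately for payload-proximal and payload-distal windows, could be passed to any standard classifier; comparing the accuracy obtained on the two kinds of windows mirrors the question the abstract poses.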

Keywords: Host gene; LncRNA; Machine learning; MiRNA; Random forest; Secondary structure; SnoRNA; k-mers.

Figures

Fig. 1
A schematic of the datasets curated for this study and their distribution over the gene body of a generic host lncRNA. DS I (green) consists of the payload and 200 nt of flanking sequence. DS II (red) flanks DS I by 100 nt. DS III consists of the first 100 nt of the exon closest to the annotated payload. DS IV consists of non-overlapping 100 nt windows taken from random exons of the host lncRNA. More details can be found in the “Sequence retrieval and curation” section.
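
The window definitions in the caption can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' curation pipeline: the function names are hypothetical, coordinates are taken to be 0-based and half-open, and the 200 nt flank of DS I is assumed to apply on each side of the payload, which the caption does not specify.

```python
def non_overlapping_windows(seq, size=100):
    """Split seq into consecutive, non-overlapping windows of exactly `size` nt (DS IV style).

    A trailing remainder shorter than `size` is dropped.
    """
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, size)]


def payload_regions(seq, start, end, flank=200, outer=100):
    """Cut DS I and DS II style regions around a payload at seq[start:end].

    Assumption: `flank` nt are added on both sides for DS I, and DS II is the
    `outer` nt immediately outside DS I on each side. Regions are clipped at
    the sequence boundaries.
    """
    ds1_start = max(0, start - flank)
    ds1_end = min(len(seq), end + flank)
    ds1 = seq[ds1_start:ds1_end]
    ds2_left = seq[max(0, ds1_start - outer):ds1_start]
    ds2_right = seq[ds1_end:min(len(seq), ds1_end + outer)]
    return ds1, ds2_left, ds2_right
```

DS III would then be the first 100 nt of whichever exon lies closest to the payload, i.e. `exon[:100]` for that exon, given an exon annotation that this sketch does not model.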
