Plant J. 2025 Oct;124(1):e70509.
doi: 10.1111/tpj.70509.

LncRAnalyzer: a robust workflow for long non-coding RNA discovery using RNA-Seq

Shinde Nikhil et al. Plant J. 2025 Oct.

Abstract

Long non-coding RNAs (lncRNAs) are a major transcript category that lacks protein-coding capability and shows relatively low abundance and complex expression patterns. Distinguishing lncRNAs from protein-coding genes is a complex process involving multiple filtering steps. We developed an automated pipeline, LncRAnalyzer, featuring retrained models for 60 species. The workflow uses eight distinct approaches to reduce the likelihood of retaining protein-coding or partial protein-coding transcripts during lncRNA identification. We conducted a 10-fold cross-validation comparing the sorghum models and training sets against their standard counterparts and other approaches, using real-life RNA-Seq datasets and known lncRNA and CDS sequences of sorghum. The results showed that the sorghum models and training sets outperformed the standard ones. The pipeline output comprises UpSet plots illustrating the number of lncRNAs/NPCTs identified by each approach, the commonly identified lncRNAs and their classes, NPCTs, and expression count tables. A feature-level comparison and benchmarking analysis of LncRAnalyzer against four existing pipelines, namely LncPipe, LncEvo, lncRNA-Annotation, and Plant-LncPipe, demonstrated that LncRAnalyzer is more comprehensive, easier to implement, and more accurate in lncRNA prediction. The workflow also ascertains lncRNA origins from various transposable elements (TEs) in plants using TE annotations from APTEdb [http://apte.cp.utfpr.edu.br/]. LncRAnalyzer is publicly available on GitLab [https://gitlab.com/nikhilshinde0909/LncRAnalyzer.git] for academic users.
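
The idea of reporting only the lncRNAs that several independent approaches agree on can be sketched as a set intersection. The Python snippet below is a minimal illustration and not LncRAnalyzer's actual implementation; the function name, tool names, and transcript IDs are hypothetical.

```python
from functools import reduce


def consensus_noncoding(calls_by_tool):
    """Return the transcript IDs that every tool labelled as non-coding.

    calls_by_tool maps a (hypothetical) tool name to the collection of
    transcript IDs that tool predicted to be non-coding.
    """
    id_sets = [set(ids) for ids in calls_by_tool.values()]
    if not id_sets:
        return set()
    # Keep only the IDs present in every tool's prediction set.
    return reduce(set.intersection, id_sets)


# Hypothetical tool names and transcript IDs, for illustration only.
calls = {
    "tool_a": {"tx1", "tx2", "tx3"},
    "tool_b": {"tx2", "tx3", "tx4"},
    "tool_c": {"tx2", "tx3"},
}
common = consensus_noncoding(calls)
print(sorted(common))  # ['tx2', 'tx3']
```

Requiring agreement from all approaches is the strictest possible rule; a majority vote would keep more candidates at the cost of more protein-coding false positives.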

Keywords: LncRAnalyzer; RNA‐Seq; Sorghum bicolor; genomics; long non‐coding RNA.
