Genes (Basel). 2025 Mar 31;16(4):413.
doi: 10.3390/genes16040413.

circ2LO: Identification of CircRNA Based on the LucaOne Large Model

Haihao Yu et al. Genes (Basel).

Abstract

Circular RNA (circRNA) is a type of noncoding RNA with a covalently closed loop structure. It is an endogenous RNA in animals and plants and is formed through RNA splicing, in which the 5' and 3' ends of exons are joined at back-splicing sites. Circular RNA plays an important regulatory role in diseases by interacting with the associated miRNAs. Accurate identification of circular RNA can expand the available circular RNA data and suggest new directions for drug development. At present, mainstream circular RNA recognition algorithms fall into two categories: those based on RNA sequence position information and those based on RNA sequence biological feature information. Herein, we propose a method for the recognition of circular RNA, called circ2LO. It uses the LucaOne large model to embed the splicing sites of RNA sequences together with their upstream and downstream sequences, which avoids the loss of semantic information caused by traditional one-hot encoding. It then applies a convolutional layer to extract features and a self-attention mechanism to extract interaction features, so that the core features of the circular RNA at the splicing sites are captured accurately. Finally, a fully connected layer identifies circular RNA. The accuracy of circ2LO on the human dataset reached 95.47%, which is higher than that of existing methods. It also achieved accuracies of 97.04% and 72.04% on the Arabidopsis and mouse datasets, respectively, demonstrating good robustness. These validation results show that circ2LO identifies circular RNAs with high accuracy, making it a potentially transformative analytical platform for circRNA research.

Keywords: circRNA; deep learning; large language model; self-attention mechanism; transformer.
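The pipeline described in the abstract (per-nucleotide embeddings, then a convolutional layer, then self-attention, then a fully connected classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the embeddings below are random placeholders standing in for LucaOne output, all weights are untrained, and every name and size (`conv1d`, `self_attention`, `classify`, the sequence length, the embedding width, the number of filters) is a hypothetical choice made only for illustration. The sketch uses the Python standard library so that the data flow is visible without a deep learning framework.

```python
import math
import random


def conv1d(seq, kernels, bias):
    """Valid 1D convolution over a sequence of vectors, followed by ReLU.

    seq: list of L vectors of width d.
    kernels: list of C filters, each a list of k vectors of width d.
    """
    k = len(kernels[0])
    out = []
    for start in range(len(seq) - k + 1):
        window = seq[start:start + k]
        row = []
        for filt, b in zip(kernels, bias):
            s = b + sum(w * x for wv, xv in zip(filt, window)
                        for w, x in zip(wv, xv))
            row.append(max(0.0, s))
        out.append(row)
    return out


def softmax(xs):
    m = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(mat, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in mat]


def self_attention(seq, wq, wk, wv):
    """Single-head scaled dot-product self-attention."""
    q = [matvec(wq, x) for x in seq]
    k = [matvec(wk, x) for x in seq]
    v = [matvec(wv, x) for x in seq]
    scale = math.sqrt(len(q[0]))
    weights = [softmax([sum(a * b for a, b in zip(qi, kj)) / scale for kj in k])
               for qi in q]
    out = [[sum(w * vj[c] for w, vj in zip(wrow, v)) for c in range(len(v[0]))]
           for wrow in weights]
    return out, weights


def classify(seq, params):
    """Convolution -> self-attention -> mean pooling -> linear layer -> sigmoid."""
    h = conv1d(seq, params["kernels"], params["bias"])
    h, attn = self_attention(h, params["wq"], params["wk"], params["wv"])
    pooled = [sum(col) / len(h) for col in zip(*h)]
    logit = params["b_out"] + sum(w * x for w, x in zip(params["w_out"], pooled))
    return 1.0 / (1.0 + math.exp(-logit)), attn


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
seq_len, embed_dim, n_filters, kernel_size = 12, 8, 4, 3  # hypothetical sizes

# Placeholder for the embeddings of a splice-site window (assumption: one
# vector per nucleotide, as in the Figure 1 caption).
embeddings = random_matrix(seq_len, embed_dim, rng)

params = {
    "kernels": [random_matrix(kernel_size, embed_dim, rng) for _ in range(n_filters)],
    "bias": [0.0] * n_filters,
    "wq": random_matrix(n_filters, n_filters, rng),
    "wk": random_matrix(n_filters, n_filters, rng),
    "wv": random_matrix(n_filters, n_filters, rng),
    "w_out": [rng.uniform(-0.5, 0.5) for _ in range(n_filters)],
    "b_out": 0.0,
}

prob, attn = classify(embeddings, params)
print(f"untrained circRNA probability: {prob:.3f}")
```

With untrained weights the printed probability carries no biological meaning. The point of the sketch is the shape of the computation: the convolution summarizes local context around each position, and each row of `attn` is a probability distribution describing how strongly one position attends to every other position in the window.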

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

Figure 1. Overall model structure of the proposed method for circRNA identification based on the LucaOne large model. $x_1, x_2, \ldots, x_{n-1}, x_n$ denote the embedded representations of the nucleotides in an RNA sequence.
Figure 2. Implementation process, from metadata to obtaining the vector representation of RNA sequence embeddings.
Figure 3. Schematic of the attention mechanism.
Figure 4. Results of the quantitative analysis of the key modules.
Figure 5. Effect of the number of convolutional layers on the model’s performance.
Figure 6. Effect of learning rate on model accuracy.
Figure 7. Effect of the number of linear layers on the model’s performance.
