Adv Sci (Weinh). 2024 Apr;11(14):e2308115.
doi: 10.1002/advs.202308115. Epub 2024 Feb 2.

CIRI-Deep Enables Single-Cell and Spatial Transcriptomic Analysis of Circular RNAs with Deep Learning

Zihan Zhou et al. Adv Sci (Weinh). 2024 Apr.

Abstract

Circular RNAs (circRNAs) are a crucial yet relatively unexplored class of transcripts known for their tissue- and cell-type-specific expression patterns. Despite advances in single-cell and spatial transcriptomics, these technologies have difficulty profiling circRNAs effectively because of inherent limitations in circRNA sequencing efficiency. To address this gap, a deep learning model, CIRI-deep, is presented for comprehensive prediction of circRNA regulation on diverse types of RNA-seq data. CIRI-deep is trained on an extensive dataset of 25 million high-confidence circRNA regulation events and achieves high performance on both test and leave-out data, supporting its accuracy in inferring differential events from RNA-seq data. It is demonstrated that CIRI-deep and its adapted version enable various circRNA analyses, including cluster- or region-specific circRNA detection, BSJ ratio map visualization, and trans and cis feature importance evaluation. Collectively, CIRI-deep's adaptability extends to all major types of RNA-seq datasets, including single-cell and spatial transcriptomic data, which will undoubtedly broaden the horizons of circRNA research.

Keywords: circular RNA; deep learning; single cell RNA‐seq; spatial transcriptome; splicing.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Deep learning‐based differentially spliced circRNA (DSC) event prediction model and its applications. A,B) Overview of training data for CIRI‐deep. We collected 397 human RNA‐seq datasets (total RNA, sequencing depth > 100 M) from circAtlas and RNA Atlas and applied CIRIquant to quantify the junction ratio of circRNAs in each sample. Each sample pair is analyzed by DARTS BHT to generate high‐confidence differentially spliced or unchanged circRNA events. The number of samples and circRNAs for each tissue, and the number of sample pairs and events between each tissue pair, are shown in the heatmap. C) Schematic framework of CIRI‐deep. CIRI‐deep is trained on cis features of circRNAs and RBP expression of sample pairs (total RNA or poly(A)‐enriched RNA) through a deep neural network. The model trained on RBP expression from total RNA data outputs the probability that the circRNA is differentially spliced. The model trained on RBP expression from poly(A)‐selected data outputs the probabilities that the circRNA is unchanged, has a higher junction ratio in sample a, or has a higher junction ratio in sample b.
Figure 2
CIRI‐deep accurately predicts differentially spliced circRNAs. A) In total, 39896 sample pairs are used for training CIRI‐deep, and 100 sample pairs are split out as leave‐out sample pairs. For each training sample pair, 1% of the events are split out as test events. B) Performance on test data (left) and leave‐out data (right). For test data, sample pairs with more than 10 test events are plotted. The Y‐axis and X‐axis represent the number of sample pairs and the AUROC in each sample pair. AUROC values for all test events and all leave‐out events are labeled and plotted as dashed lines. C) Generalization on leave‐out sample pairs (left 2) and public data (right 2). Each dot represents the differentially spliced probability predicted by CIRI‐deep for a circRNA expressed in both samples. CF: conjunctival fibroblast; MEC: mammary endothelial cell; DCM: dilated cardiomyopathy; CESC: cervical squamous cell carcinoma and endocervical adenocarcinoma. D) Performance of statistical inference combined with (Info model) or without (Flat model) CIRI‐deep in cervical cancer datasets. Ground truth is derived from sample pairs with replicates. P‐value was calculated using t‐test, ***p < 0.001. E) Performance of absolute value of Δ|psi|, flat model, CIRI‐deep only and info model in predicting circRNA events between sample pairs of different depth (5 M, 15 M, 25 M). P‐values were calculated using t‐test, *p < 0.05, **p < 0.01, ***p < 0.001. F) Performance of flat model and info model in 600 15 M sample pairs.
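The per‐sample‐pair evaluation described in panel B can be illustrated with a minimal sketch. It assumes each test event is a tuple of a sample‐pair identifier, a binary label (1 for differentially spliced, 0 for unchanged) and a predicted probability; the function names and data layout are hypothetical and are not taken from the CIRI‐deep code base. AUROC is computed here with the rank‐based (Mann‐Whitney) formulation.

```python
from collections import defaultdict


def auroc(labels, scores):
    """Rank-based AUROC: fraction of (positive, negative) pairs ranked
    correctly, with ties counted as half. Returns None if a class is missing."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def per_pair_auroc(events, min_events=10):
    """Group events by sample pair and compute one AUROC per pair.

    events: iterable of (pair_id, label, score) tuples (hypothetical layout).
    Only pairs with MORE than `min_events` events are kept, mirroring the
    "more than 10 test events" filter mentioned in the caption.
    """
    grouped = defaultdict(list)
    for pair_id, label, score in events:
        grouped[pair_id].append((label, score))

    result = {}
    for pair_id, rows in grouped.items():
        if len(rows) <= min_events:
            continue
        labels, scores = zip(*rows)
        value = auroc(labels, scores)
        if value is not None:  # skip pairs that contain a single class
            result[pair_id] = value
    return result
```

A histogram of the returned values would correspond to the kind of distribution plotted in panel B, under the assumptions stated above.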
Figure 3
Tissue‐specific features contribute to tissue‐specific prediction. A) AUROC loss (%) of the model trained on total RNA datasets when cis and trans features are permuted. Splicing, translation, mRNA transport and RNA helicase related RBPs are collected from the GO database. B) Workflow to identify significant tissue‐specific cis and trans features with adapted integrated gradients (AIG), taking the central nervous system as an example. C) IG values of the top 15 cis features with commonly important contributions to prediction. D) IG values of the top 50 RBPs (scaled by row) significant in 9 tissues. A subset of tissue‐specific RBPs is labeled on the right. E) Top 12 splicing‐related RBPs and cis features significant in the central nervous system. The Y‐axis represents the IG value calculated from each group of baseline dots and target dots. F) Regulated circRNAs after Nova1 and Nova2 knockout in mouse brain (left). Enrichment pattern of Nova1 and Nova2 surrounding up‐regulated, down‐regulated and unregulated circRNAs (right).
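The permutation analysis in panel A can be sketched as follows. This is a generic permutation‐importance illustration, not the authors' implementation: `predict` stands in for any trained scoring function, the feature rows and column indices are hypothetical, and expressing the loss relative to the unpermuted AUROC is an assumption (the caption does not state how the percentage is defined).

```python
import random


def auroc(labels, scores):
    """Rank-based AUROC with ties counted as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUROC")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auroc_loss_percent(predict, rows, labels, columns, seed=0):
    """Permute a group of feature columns across events and report the
    relative AUROC drop in percent (assumed definition).

    predict: callable mapping one feature row (list) to a score.
    columns: indices of the feature group to permute together, e.g. all
             RBP-expression columns or all cis-feature columns.
    """
    rng = random.Random(seed)
    baseline = auroc(labels, [predict(r) for r in rows])

    # Shuffle the selected columns as a block so that within-group
    # correlations are preserved while the link to the label is broken.
    block = [[r[c] for c in columns] for r in rows]
    rng.shuffle(block)

    permuted_rows = []
    for row, values in zip(rows, block):
        new_row = list(row)
        for c, v in zip(columns, values):
            new_row[c] = v
        permuted_rows.append(new_row)

    permuted = auroc(labels, [predict(r) for r in permuted_rows])
    return 100.0 * (baseline - permuted) / baseline
```

A feature group that the model does not use yields a loss of zero, while a group the model relies on yields a positive loss on average.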
Figure 4
Prediction of differentially spliced circRNAs between scRNA‐seq clusters. A) The heatmap is a multiclass confusion matrix of 3 categories: no difference, higher junction ratio in sample A and higher junction ratio in sample B. Each row represents circRNAs identified as no difference, higher in A or higher in B with gold standard total RNA‐seq datasets. Each column represents the percentage of circRNAs in each row classified into the three categories with poly(A)‐derived reads or CIRI‐deepA, respectively. B) Number of circRNAs predicted to be up‐ or down‐regulated in tumor samples compared with paired controls from TCGA datasets. CircRNAs are identified as up‐ or down‐regulated if their junction ratios are predicted to be higher in tumor or in the paired control in more than 35% of sample pairs. C) Two high‐confidence differentially spliced events between glioma microglia and periphery microglia in Smart‐seq2 datasets (GSE84465). Cells with the circRNA detected are labeled, with size and color indicating the number of back‐splicing reads and the junction ratio of the circRNA in the cell. D) Prediction probability of 10 high‐confidence differentially spliced events between certain cell‐type clusters in tumor and periphery tissue (p > 0.9 or p < 0.1, number of cells in tumor or periphery tissue > 10). E) Consistency (%) between statistical inference (DARTS BHT) from Smart‐seq2 data and model prediction from 10X scRNA‐seq data for circRNA events detected in different numbers of cells. F) Boxplot of prediction accuracy of 8 high‐confidence marker circRNAs. P‐value was calculated using t‐test, n = 8, *p < 0.05. G) Prediction of marker circRNAs in 10X scRNA‐seq datasets. Smart‐seq2 glioma (GSE84465) and 10X glioma (GSE131928) datasets are merged and 4 common clusters (Myeloid, Neoplastic 1, Neoplastic 2 and OPC) are included. Marker circRNAs are tested through Wilcoxon rank test (p < 0.05, number of cells expressing the circRNA > 5) in Smart‐seq2 datasets, and clusters with higher marker circRNA junction ratio are highlighted by dashed lines. Prediction of marker circRNAs is made to compare junction ratios between cells in each cluster and the remaining cells in 10X datasets. Clusters predicted to have a higher junction ratio are highlighted.
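The 35% rule in panel B can be written out as a small sketch. It assumes that each circRNA has one predicted class per tumor/control sample pair, encoded with the hypothetical labels "tumor_higher", "control_higher" and "unchanged". The caption does not say how a circRNA exceeding the threshold in both directions is handled, so the "ambiguous" call below is an assumption.

```python
def call_regulation(predictions, min_fraction=0.35):
    """Summarise per-pair predictions into one call per circRNA.

    predictions: dict mapping a circRNA ID to a list of per-sample-pair
                 labels ("tumor_higher", "control_higher", "unchanged").
    A circRNA is called "up" (or "down") when it is predicted higher in
    tumor (or in the control) in MORE than `min_fraction` of sample pairs.
    """
    calls = {}
    for circ_id, labels in predictions.items():
        if not labels:
            continue  # no sample pairs with a prediction for this circRNA
        up_frac = labels.count("tumor_higher") / len(labels)
        down_frac = labels.count("control_higher") / len(labels)

        is_up = up_frac > min_fraction
        is_down = down_frac > min_fraction
        if is_up and is_down:
            calls[circ_id] = "ambiguous"  # assumption, not from the caption
        elif is_up:
            calls[circ_id] = "up"
        elif is_down:
            calls[circ_id] = "down"
        else:
            calls[circ_id] = "unchanged"
    return calls
```

For example, a circRNA predicted higher in tumor in 4 of 10 sample pairs (40%) would be called "up", whereas 3 of 10 (30%) would not pass the threshold.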
Figure 5
Application in spatial transcriptomics. A) The fetal heart ST panel is split into 4 anatomical regions, taking tissue section 16 as an example (EGAS00001003996). B) Region‐specific circRNAs predicted by CIRI‐deepA between different regions are enriched in results derived from bulk data (RNA Atlas) (Fisher's exact test). Dot size and color indicate ratio and ‐log(p‐value). Random_label: randomly chosen circRNAs (same number as region‐specific circRNAs in bulk data). Random_RBP: prediction using randomly permuted RBP expression values as input. Random_cis: prediction using randomly permuted cis features as input. C) Workflow for calculating the circRNA index for each region or spot. F denotes the CIRI‐deepA model. D) CircRNA relative region index plot of section 16 (left) validated by junction ratio in corresponding bulk data (right). Vena cava, atrium 1, atrium 2, ventricle 1 and ventricle 2 correspond to outflow tract/large vessels, atrium and ventricle regions, respectively. E) Normalized circRNA index of circQKI(2‐4) and normalized expression of QKI in sections 1, 6 and 16. F) Workflow for fitting the cell‐type proportion with LASSO using the prediction probabilities of 18 circRNAs as input (probabilities of higher in sample A and higher in sample B). The 18 circRNAs are manually selected for low correlation of prediction probability. G) Predicted cell‐type abundance and deconvolution (CARD)‐derived cell‐type abundance in 4 samples. The Pearson correlation coefficient is computed between the output of the LASSO model and the deconvolution result.
