NPJ Syst Biol Appl. 2022 Sep 12;8(1):33.
doi: 10.1038/s41540-022-00245-6.

sciCAN: single-cell chromatin accessibility and gene expression data integration via cycle-consistent adversarial network

Yang Xu et al. NPJ Syst Biol Appl.

Abstract

The boom in single-cell technologies has brought a surge of high-dimensional data that come from different sources and represent cellular systems from different views. As these technologies advance, integrating single-cell data across modalities has emerged as a new computational challenge. Here, we present an adversarial approach, sciCAN, to integrate single-cell chromatin accessibility and gene expression data in an unsupervised manner. We benchmarked sciCAN against 5 existing methods on 5 scATAC-seq/scRNA-seq datasets and demonstrated that our method performs consistently across datasets and achieves a better balance of mutual transfer between modalities than the other 5 methods. We further applied sciCAN to 10X Multiome data and confirmed that the integrated representation preserves biological relationships within the hematopoietic hierarchy. Finally, we investigated CRISPR-perturbed single-cell K562 ATAC-seq and RNA-seq data to identify cells with related responses to different perturbations across the two modalities.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Overview of sciCAN and potential applications.
a sciCAN model architecture. sciCAN contains two major components: representation learning and modality alignment. The representation learning part of the model is highlighted in the red box, and the modality alignment part in the purple box. The scATAC-seq and scRNA-seq inputs have been preprocessed to have the same feature dimensions, so they can share a single encoder E. The final total loss (L) is the sum of the representation learning loss (red) and the modality alignment loss (purple). Of note, the NCE loss is calculated independently for scATAC-seq and scRNA-seq data. b Downstream integrative analyses include, but are not limited to, co-embedding, co-trajectory analysis, and label transfer.
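Written as a formula, the total objective described in panel a takes the following form. The subscripts are labels chosen here for illustration and are not the paper's notation; the caption does not state any weighting between the two terms.

```latex
L = L_{\text{rep}} + L_{\text{align}}
```

Here $L_{\text{rep}}$ stands for the representation learning loss (red box) and $L_{\text{align}}$ for the modality alignment loss (purple box).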
Fig. 2. Benchmarking of sciCAN against 5 other existing integration methods.
a Integration evaluation by modality and cell-type silhouette scores across 5 datasets. The x axis corresponds to the modality silhouette score and the y axis to the cell-type silhouette score. An ideal integration would be located in the top right corner of each dot plot. To generate the dot plot, we randomly subsampled 20% of the cell population to calculate both modality and cell-type silhouette scores for each method and each dataset. b Integration evaluation by F1 scores across 5 datasets. The upper panel corresponds to label transfer from RNA-seq to ATAC-seq (RtoA), and the lower panel to label transfer from ATAC-seq to RNA-seq (AtoR). Boxplots are based on the F1 scores for all cell types in that dataset. The median value is marked with a horizontal line within the box, and the “X” mark represents the macro F1 score, which is the average of the F1 scores for all cell types. Whiskers show the minimum and maximum values, and the bottom and top of the box show the 25th and 75th percentiles, respectively. c Benchmark ranking across 5 datasets. In each category, methods are ranked by score from best (red, low rank number) to worst (blue, high rank number).
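The macro F1 score marked with an “X” in panel b can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names and the toy cell-type labels below are hypothetical, and the sketch only shows how per-cell-type F1 scores are averaged into a macro F1.

```python
from collections import Counter


def per_class_f1(true_labels, predicted_labels):
    """Return {cell_type: F1} for every cell type present in true_labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted type gains a false positive
            fn[truth] += 1  # true type gains a false negative
    scores = {}
    for cell_type in sorted(set(true_labels)):
        # F1 = 2*TP / (2*TP + FP + FN)
        denom = 2 * tp[cell_type] + fp[cell_type] + fn[cell_type]
        scores[cell_type] = 2 * tp[cell_type] / denom if denom else 0.0
    return scores


def macro_f1(true_labels, predicted_labels):
    """Unweighted mean of the per-cell-type F1 scores."""
    scores = per_class_f1(true_labels, predicted_labels)
    return sum(scores.values()) / len(scores)


# Toy example (hypothetical labels): one T cell is mislabelled as NK.
true_labels = ["B", "B", "T", "T", "NK", "NK"]
predicted_labels = ["B", "B", "T", "NK", "NK", "NK"]

f1_by_type = per_class_f1(true_labels, predicted_labels)
macro = macro_f1(true_labels, predicted_labels)
print(f1_by_type)
print(round(macro, 4))
```

Because the macro average weights every cell type equally, a rare cell type that transfers poorly lowers the score as much as an abundant one would.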
Fig. 3. Integration learned by sciCAN preserves hematopoietic hierarchy.
a Co-trajectory analysis via PAGA using joint representation learned by sciCAN. Each dot is the sum of all cells annotated as the same cell type. Trajectory is visualized using RNA-seq (upper panel) and ATAC-seq (lower panel), separately. b Enrichments of signature genes for 3 different lineages using both RNA-seq (top) and ATAC-seq (bottom) data. Colorbar indicates gene expression (top) or gene activity level (bottom), respectively.
Fig. 4. sciCAN identifies common response after CRISPR perturbation.
a Visualization of single-cell CRISPR-perturbed K562 RNA-seq and ATAC-seq data via UMAP. Cells are colored by identified cell cluster (left) and modality source (right). b Spearman correlation between the RNA-seq and ATAC-seq profiles of cells in different clusters. The gene expression or gene activity matrix was averaged by cell cluster. c Shared gene signatures of the 3 cell clusters in both modalities. Differential gene activity or expression was identified using the Wilcoxon test in the Scanpy package. d Ranking of sgRNA representation in each cluster (blue = C0, orange = C1, green = C2) in both RNA-seq (left) and ATAC-seq (right) data. Genes perturbed in both experiments are highlighted. e Gene signatures of cells targeted by sgELE1, sgYY1, and sgGABPA in cell cluster 1. f Genes whose activity patterns distinguish cells in cluster 0 from cells in cluster 2 among cells in these clusters perturbed by the same gRNAs.
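The comparison in panel b (average each matrix by cluster, then compute the Spearman correlation between cluster profiles) can be sketched as follows. This is an illustrative sketch, not the authors' code: the helper names, the toy matrices, and the cluster labels are all hypothetical, and it assumes both matrices share the same gene order.

```python
def cluster_means(matrix, cluster_labels):
    """Average rows (cells) by cluster label -> {cluster: mean gene vector}."""
    sums, counts = {}, {}
    for row, label in zip(matrix, cluster_labels):
        if label not in sums:
            sums[label] = [0.0] * len(row)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], row)]
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data (hypothetical): rows are cells, columns are 4 shared genes.
rna = [[1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0], [9.0, 7.0, 5.0, 1.0]]
rna_clusters = ["C0", "C0", "C1"]
atac = [[0.1, 0.2, 0.4, 0.9], [0.8, 0.6, 0.3, 0.1]]
atac_clusters = ["C0", "C1"]

rna_means = cluster_means(rna, rna_clusters)
atac_means = cluster_means(atac, atac_clusters)

# Cross-modality correlation for every (RNA cluster, ATAC cluster) pair.
correlation = {
    (r, a): spearman(rna_means[r], atac_means[a])
    for r in rna_means
    for a in atac_means
}
print(correlation)
```

Matching clusters across modalities should show higher correlation than mismatched ones; in this toy data the C0/C0 and C1/C1 pairs correlate positively and the mixed pairs negatively.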
