[Preprint]. 2024 Sep 13:2024.03.03.583179.
doi: 10.1101/2024.03.03.583179.

Multi-condition and multi-modal temporal profile inference during mouse embryonic development

Ran Zhang et al. bioRxiv.

Abstract

The emergence of single-cell time-series datasets enables modeling of changes in various types of cellular profiles over time. However, due to the disruptive nature of single-cell measurements, it is impossible to capture the full temporal trajectory of a particular cell. Furthermore, single-cell profiles can be collected at mismatched time points across different conditions (e.g., sex, batch, disease) and data modalities (e.g., scRNA-seq, scATAC-seq), which makes modeling challenging. Here we propose a joint modeling framework, Sunbear, for integrating multi-condition and multi-modal single-cell profiles across time. Sunbear can be used to impute single-cell temporal profile changes, align multi-dataset and multi-modal profiles across time, and extrapolate single-cell profiles in a missing modality. We applied Sunbear to reveal sex-biased transcription during mouse embryonic development and predict dynamic relationships between epigenetic priming and transcription for cells in which multi-modal profiles are unavailable. Sunbear thus enables the projection of single-cell time-series snapshots to multi-modal and multi-condition views of cellular trajectories.

Conflict of interest statement

Ethics declarations: The authors declare that they have no conflict of interest.

Figures

Figure 1: Sunbear Framework.
(A) Sunbear takes as input a collection of measurements of single cells at multiple time points in two or more biological conditions (top) or data modalities (bottom). (B) During the training phase, Sunbear learns to decompose the original time-series profiles into four components: cell identity, time point, batch and condition. Batch and condition factors are represented by one-hot encodings. The time factor is represented by a sinusoidal encoding. The cell identity factor is learned from the original profile and is conditionally independent of the other factors. In the multimodal setting, cell identities are aligned between data modalities. (C) In the prediction phase, Sunbear concatenates the query cell’s identity factor with the other factors, varying them to impute the cell’s profile across time and condition. By transferring cell identity factors across modalities, Sunbear allows joint temporal modeling of multimodal profiles.
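The factor construction described in panels (B) and (C) can be sketched as follows. This is a minimal illustration, not Sunbear's implementation: the Transformer-style form of the sinusoidal encoding, the encoding dimension, `max_period`, and the function names `sinusoidal_time_encoding`, `one_hot` and `decoder_input` are all assumptions made for the example.

```python
import numpy as np

def sinusoidal_time_encoding(t, dim=8, max_period=10000.0):
    """Encode a scalar time point as sines and cosines at several frequencies.

    Assumed Transformer-style form; the caption only says "sinusoidal".
    """
    half = dim // 2
    freqs = 1.0 / (max_period ** (np.arange(half) / half))
    angles = t * freqs
    return np.concatenate([np.sin(angles), np.cos(angles)])

def one_hot(index, n_classes):
    """One-hot vector for a categorical factor such as batch or condition."""
    vec = np.zeros(n_classes)
    vec[index] = 1.0
    return vec

def decoder_input(cell_identity, t, batch, condition,
                  n_batches, n_conditions, time_dim=8):
    """Concatenate the identity factor with time, batch and condition factors."""
    return np.concatenate([
        cell_identity,
        sinusoidal_time_encoding(t, time_dim),
        one_hot(batch, n_batches),
        one_hot(condition, n_conditions),
    ])

# Hold one (hypothetical) cell identity fixed and vary time and condition,
# which mirrors the prediction phase in panel (C).
identity = np.zeros(4)
queries = [decoder_input(identity, t, batch=0, condition=c,
                         n_batches=3, n_conditions=2)
           for t in (9.5, 10.5) for c in (0, 1)]
```

Each query vector would then be passed to a trained decoder; only the time and condition entries differ between queries, so any difference in the decoded profiles is attributable to those factors.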
Figure 2: Single-cell profile inference across time and conditions.
(A) Sunbear is trained on scRNA-seq profiles of whole mouse embryos whose sex alternates between successive developmental time points [33]. (B) Sunbear is validated in three scenarios. In each scenario, one data block is held out from training, and Sunbear is used to predict the profile of the missing block based on cells in the query block (outlined in turquoise). Sunbear’s prediction is compared against baselines that use the held-out block’s nearest existing measurements with the desired sex factors. Pseudobulk Pearson correlation per major cell trajectory is calculated between the held-out profile and the predictions or baselines. (C) Cross-time evaluation: query and baseline are selected either from the closest previous time point (left) or the closest subsequent time point (right). Pseudobulk Pearson correlations between the original held-out profile and the predictions (y-axis) or baselines (x-axis) are plotted for each major cell trajectory in each held-out time point. Each dot represents a cell trajectory per held-out time point, and numbers indicate the number of dots above and below the diagonal line. P-values are calculated by a one-sided Wilcoxon rank-sum test. (D) Cross-sex prediction: similar to (C), except query cells are selected from the opposite sex to the held-out data. (E) Cross-sex prediction: similar to (D), except we enforce a stricter baseline by taking the mean of the previous and subsequent time points per cell trajectory.
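The pseudobulk Pearson correlation used in panels (B) to (E) can be illustrated with a short sketch. It assumes that a pseudobulk profile is the per-gene mean over the cells of one trajectory and that inputs are already normalized; the caption does not specify either detail, and the function names are hypothetical.

```python
import numpy as np

def pseudobulk(profiles):
    """Collapse a cells x genes matrix into one per-gene mean profile."""
    return np.asarray(profiles, dtype=float).mean(axis=0)

def pseudobulk_pearson(held_out, candidate):
    """Pearson correlation between two pseudobulk profiles.

    `held_out` and `candidate` are cells x genes matrices for the same
    cell trajectory; they may contain different numbers of cells.
    """
    a = pseudobulk(held_out)
    b = pseudobulk(candidate)
    return float(np.corrcoef(a, b)[0, 1])

# Toy data: 2 held-out cells, 1 predicted cell and 2 baseline cells, 3 genes.
held_out = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
predicted = [[2.0, 4.0, 6.0]]
baseline = [[3.0, 1.0, 2.0], [3.0, 1.0, 2.0]]

r_pred = pseudobulk_pearson(held_out, predicted)
r_base = pseudobulk_pearson(held_out, baseline)
```

In the figure, each dot corresponds to one such pair of values, with the prediction's correlation on the y-axis and the baseline's on the x-axis.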
Figure 3: Sex differences in mouse embryonic development.
(A) Pairwise comparison of Sunbear prediction and the nearest neighbor baseline in recapitulating differential expression patterns in sex-matched time points. Each dot indicates the AUROC score of recapitulating female/male-biased patterns in each sex-matched time point and cell type. (B) Similar to A, pairwise comparison of Sunbear prediction and the nearest neighbor baseline in ranking escape genes to be more female-biased than all other genes on the X chromosome. (C,D) Predicted temporal sex-biased log fold change in glutamatergic neurons and border-associated macrophages. Each line represents a gene that is predicted to be consistently higher in females than males and is colored by whether the gene is a known constitutive escape gene or not. (E) Distribution of predicted sex-biased scores of genes (0 = extremely male-biased, 1 = extremely female-biased), grouped and colored by whether the gene is up- (pink) or down- (blue) regulated in Kdm6a KO vs. WT samples in CD4+ cells. P-values are calculated by one-sided Wilcoxon rank sum tests. (F) Gene Ontology biological processes enriched in consistently female and male-biased genes in border-associated macrophages. Non-redundant terms with the smallest FDR are selected for visualization. No enrichment of biological processes is found in glutamatergic neurons.
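The AUROC scores in panels (A) and (B) measure how well a predicted score ranks one group of genes above another, for example escape genes above all other X-linked genes. A minimal rank-based sketch, with made-up scores and a hypothetical function name, is:

```python
import numpy as np

def auroc(scores, labels):
    """Probability that a positive item outscores a negative one.

    Ties count as half; equivalent to the normalized Mann-Whitney U statistic.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]
    neg = scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (len(pos) * len(neg)))

# Illustrative values only: predicted female-bias scores for four X-linked
# genes, where the first two are labeled as escape genes.
female_bias = [0.9, 0.8, 0.3, 0.1]
is_escape = [1, 1, 0, 0]
score = auroc(female_bias, is_escape)
```

An AUROC of 0.5 corresponds to a random ranking, and 1.0 means every labeled gene is ranked above every unlabeled one.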
Figure 4: Multi-modal temporal inference.
(A) A UMAP embedding suggests that scRNA-seq and scATAC-seq profiles are well aligned across time and batch. Only time points with both scRNA-seq and scATAC-seq available are shown. (B) AUROC of the predicted differential accessibility pattern relative to those derived from the original datasets. AUROC is calculated per cell type, and differential accessibility is calculated between each held-out time point and each query time point (shown as “query time point → held-out time point”). (C) Peak-wise AUROC of scATAC-seq profiles predicted based on scRNA-seq relative to the original scATAC-seq profile in each held-out time point. AUROCs are calculated across all cells. (D) Workflow for calculating the dynamic association between peaks and genes. A query cell’s scRNA-seq profile is fed into Sunbear to predict temporal patterns of gene expression and chromatin accessibility. For each pair of a chromatin region and its proximal gene, we calculate the correlation coefficient between them at incremental time shifts, which results in a time-lagged cross-correlation (TLCC) vector (column). (E) Predicted peak region-gene relationships. Heatmap of TLCC matrices for 5000 randomly selected peak regions with accessibility changes ahead of (“before”) or subsequent to (“after”) nearby gene expression. Peak regions are sorted based on the time shift with the maximum TLCC.
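The TLCC calculation in panel (D) can be sketched as below. This is an illustration under assumptions rather than the authors' code: it uses Pearson correlation on evenly spaced predicted time points, and the sign convention (a negative shift means accessibility changes before expression) and the function name `tlcc` are choices made for the example.

```python
import numpy as np

def tlcc(accessibility, expression, max_shift):
    """Correlation between a peak and a gene at shifts -max_shift..max_shift.

    A negative shift pairs accessibility at time t with expression at a
    later time, so a maximum there means accessibility leads expression.
    """
    acc = np.asarray(accessibility, dtype=float)
    expr = np.asarray(expression, dtype=float)
    shifts = np.arange(-max_shift, max_shift + 1)
    corrs = []
    for s in shifts:
        if s < 0:
            a, e = acc[:s], expr[-s:]
        elif s > 0:
            a, e = acc[s:], expr[:-s]
        else:
            a, e = acc, expr
        corrs.append(np.corrcoef(a, e)[0, 1])
    return shifts, np.array(corrs)

# Toy series: expression repeats the accessibility pattern two steps later.
t = np.arange(40)
acc = np.sin(t / 3.0)
expr = np.roll(acc, 2)

shifts, corrs = tlcc(acc, expr, max_shift=5)
best_shift = int(shifts[np.argmax(corrs)])
```

Stacking one such vector per peak-gene pair as columns and sorting the columns by `best_shift` gives a matrix of the kind shown as a heatmap in panel (E).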

References

    1. Argelaguet R., Lohoff T., Li J. G., Nakhuda A., Drage D., Krueger F., Velten L., Clark S. J., and Reik W. Decoding gene regulation in the mouse embryo using single-cell multi-omics. bioRxiv, pages 2022–06, 2022.
    2. Ashuach T., Reidenbach D. A., Gayoso A., and Yosef N. PeakVI: A deep generative model for single cell chromatin accessibility analysis. bioRxiv, 2021.
    3. Berletch J. B., Ma W., Yang F., Shendure J., Noble W. S., and Disteche C. M. Escape from X inactivation varies in mouse tissues. PLOS Genetics, 11(3):e1005079, 2015.
    4. Bernstein B. E., Mikkelsen T. S., Xie X., Kamal M., Huebert D. J., Cuff J., Fry B., Meissner A., Wernig M., Plath K., Jaenisch R., Wagschal A., Feil R., Schreiber S. L., and Lander E. S. A bivalent chromatin structure marks key developmental genes in embryonic stem cells. Cell, 125(2):315–326, 2006.
    5. Borsari B., Frank M., Wattenberg E. S., Xu K., Liu S. X., Yu X., and Gerstein M. chronode: A framework to integrate time-series multi-omics data based on ordinary differential equations combined with machine learning. bioRxiv, pages 2023–12, 2023.