Proc Natl Acad Sci U S A. 2023 Jan 3;120(1):e2210283120.
doi: 10.1073/pnas.2210283120. Epub 2022 Dec 28.

Robotic data acquisition with deep learning enables cell image-based prediction of transcriptomic phenotypes

Jianshi Jin et al. Proc Natl Acad Sci U S A. 2023.

Abstract

Single-cell whole-transcriptome analysis is the gold standard approach to identifying molecularly defined cell phenotypes. However, this approach cannot be used for dynamics measurements such as live-cell imaging. Here, we developed a multifunctional robot, the automated live imaging and cell picking system (ALPS), and used it to perform single-cell RNA sequencing for microscopically observed cells with multiple imaging modes. Using robotically obtained data that linked cell images and the whole transcriptome, we successfully predicted transcriptome-defined cell phenotypes in a noninvasive manner using cell image-based deep learning. This noninvasive approach opens a window to determine the live-cell whole transcriptome in real time. Moreover, this work, which is based on a data-driven approach, is a proof of concept for determining the transcriptome-defined phenotypes (i.e., not relying on specific genes) of any cell from cell images using a model trained on linked datasets.

Keywords: cell picking; deep learning; microscopy; robotics; single-cell RNA sequencing.

Conflict of interest statement

The authors have patent filings to disclose: RIKEN has co-filed a patent application based on this work.

Figures

Fig. 1.
ALPS. (A) Schematic of the ALPS (details in Materials and Methods). (B) The running time and success rate for the automated isolation of 96 T cells in a dish as a function of the cell density. Running time, total time from the start of step-(auto-1) of the first isolation to the end of step-(auto-7) of the 96th isolation. Success rate, the number of isolations in which one cell was picked and deposited, out of the 96 automated isolations, divided by 96. Cell density, average number of cells per mm² of all scanned squares for each of the 96 automated isolations (see section “Automated Single-Cell Isolation Programs” in Materials and Methods). (C and D) Four images (as a column), bright-field image (Top) and fluorescent images of GFP (green), PE (cyan), and Alexa647 (magenta) of each cell from 3mix-ALPS-random-multimode (C) or 3mix-ALPS-target-multimode (D). The numbers indicate the order of isolation. Image size, 26 μm × 26 μm. The contrast of the images shown in (C) and (D) was adjusted linearly.
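The two summary quantities defined for panel (B) reduce to simple arithmetic. The following is a minimal sketch, not the authors' code: the function names, the per-isolation count input, and the assumption that all scanned squares have the same area are hypothetical and chosen only for illustration.

```python
def success_rate(cells_deposited, n_isolations=96):
    """Fraction of isolations in which one cell was picked and deposited.

    cells_deposited: one integer per isolation (hypothetical input format),
    giving how many cells ended up deposited for that isolation.
    """
    if len(cells_deposited) != n_isolations:
        raise ValueError("expected one entry per isolation")
    single = sum(1 for n in cells_deposited if n == 1)
    return single / n_isolations


def mean_cell_density(cell_counts, square_area_mm2):
    """Average number of cells per mm^2 over all scanned squares.

    Assumes every scanned square has the same area (an assumption of
    this sketch, not stated in the caption).
    """
    if not cell_counts:
        raise ValueError("no scanned squares")
    return sum(cell_counts) / (len(cell_counts) * square_area_mm2)
```

For example, a run in which 90 of 96 isolations deposit a single cell has a success rate of 90/96.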
Fig. 2.
ALPS-linked cell images and whole transcriptome for the same cell. (A) t-SNE visualization of single cells from 3mix-ALPS-random&target-multimode clustered based on whole transcriptomes. Filled colors represent the cell types determined by fluorescent images; the gray color indicates that the cell type was undetermined due to low fluorescent intensities. The edge colors represent the cell types determined by the transcriptome. (B) t-SNE visualization of all PBMCs clustered based on whole transcriptomes. The colors represent the cell types determined by the transcriptome. The black circles and numbers correspond to the cells shown in (C). (C) Examples of bright-field time-lapse cell images of PBMCs (SI Appendix, Text 10). Image size, 20 μm × 20 μm. The contrast of the images shown in (C) was adjusted linearly.
Fig. 3.
Deep learning–based classification for time-lapse imaged samples. (A) Accuracy of predicting the scRNA-seq–determined cell types of the cells from the 3mix-ALPS-timelapse using ResNet–LSTM. RNA-seq, cell types determined by scRNA-seq; Random, cell types labeled randomly; 16, 64, 256, and All cells, 16, 64, 256, and all cells were randomly selected from 15 plates for training (cells in one plate were used for testing). The P values were calculated using the Kruskal–Wallis rank sum test [in (B) and (C) as well]. (B) Accuracy of predicting scRNA-seq–determined cell states of HPCs (HPC-1 and HPC-2 in SI Appendix, Fig. S13A) using ResNet–LSTM. (C) Accuracy of predicting scRNA-seq–determined cell types of PBMCs using ResNet–LSTM. Two types, PBMCs were clustered into T cells and B cells. Three types, PBMCs were clustered into CD4+ T cells, CD8+ T cells, and B cells.
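The evaluation scheme described for panel (A), training on cells drawn from 15 plates and testing on the remaining plate, can be sketched as a leave-one-plate-out split. This is an illustrative sketch only: the function name, the `plate_of_cell` mapping, and the fixed random seed are assumptions, and the ResNet–LSTM model itself is not shown.

```python
import random


def leave_one_plate_out(plate_of_cell, n_train=None, seed=0):
    """Yield (held_out_plate, train_ids, test_ids) for every plate.

    plate_of_cell: dict mapping a cell id to its plate id (hypothetical format).
    n_train: number of training cells to sample at random from the other
             plates (e.g. 16, 64, 256); None means use all of them.
    """
    rng = random.Random(seed)
    plates = sorted(set(plate_of_cell.values()))
    for held_out in plates:
        # All cells on the held-out plate are used for testing.
        test_ids = sorted(c for c, p in plate_of_cell.items() if p == held_out)
        # Training cells come only from the remaining plates.
        pool = sorted(c for c, p in plate_of_cell.items() if p != held_out)
        if n_train is None:
            train_ids = pool
        else:
            train_ids = rng.sample(pool, min(n_train, len(pool)))
        yield held_out, train_ids, test_ids
```

Splitting by plate rather than by cell keeps every test cell on a plate the model has not seen during training.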
Fig. 4.
Deep learning–based regression for time-lapse imaged three-type–mixed cells (3mix-ALPS-timelapse). (A) Scatterplots of the measured and predicted RNA expression levels of the cells for the Gm1821 gene (log-transformed; SI Appendix, Text 6), including 16 predictions using different plates. Median R², median of the out-of-sample R² value for each prediction (SI Appendix, Fig. S18C); median Pearson’s r, median of the Pearson’s correlation coefficient r for each prediction (SI Appendix, Fig. S18D). Dashed line, diagonal line. (B) Distribution of the Euclidean distances between the predicted and measured expression levels of 300 genes for the same cell and between the measured expression levels of 300 genes of all possible pairs of all cells, same-type cells, and different-type cells. The orange lines indicate the means. The P values were calculated using the Kruskal–Wallis rank sum test. (C) PCA of the cells from 3mix-timelapse-plate5 with predicted (open circle) and measured (closed circle) expression levels of 300 genes (each cell has two dots) (data from the other 15 plates are shown in SI Appendix, Fig. S19). Colors correspond to the three different cell types determined by the measured whole transcriptome. Lines link the predicted and measured expression levels of the same cell.
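The three metrics named in this caption can be illustrated with a short sketch. The exact out-of-sample R² definition used by the authors is given in the SI Appendix and is not reproduced here; the version below assumes the common form, one minus the residual sum of squares over the total sum of squares around the mean of the measured test values. All function names are hypothetical.

```python
import math


def r_squared(measured, predicted):
    """R^2 = 1 - SS_res / SS_tot, with SS_tot taken around the mean of the
    measured values (an assumed convention, see the note above)."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def euclidean_distance(a, b):
    """Euclidean distance between two expression vectors (e.g. 300 genes)."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))
```

The two panel (A) metrics measure different things: a prediction that is offset from the measurement by a constant keeps a Pearson's r of 1 but has a lower R², which is why points are compared against the diagonal line as well as against their correlation.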
