[Preprint]. 2024 Oct 14:arXiv:2409.11654v2.

How to Build the Virtual Cell with Artificial Intelligence: Priorities and Opportunities

Charlotte Bunne et al. ArXiv. 2024.

Update in

  • How to build the virtual cell with artificial intelligence: Priorities and opportunities.
    Bunne C, Roohani Y, Rosen Y, Gupta A, Zhang X, Roed M, Alexandrov T, AlQuraishi M, Brennan P, Burkhardt DB, Califano A, Cool J, Dernburg AF, Ewing K, Fox EB, Haury M, Herr AE, Horvitz E, Hsu PD, Jain V, Johnson GR, Kalil T, Kelley DR, Kelley SO, Kreshuk A, Mitchison T, Otte S, Shendure J, Sofroniew NJ, Theis F, Theodoris CV, Upadhyayula S, Valer M, Wang B, Xing E, Yeung-Levy S, Zitnik M, Karaletsos T, Regev A, Lundberg E, Leskovec J, Quake SR. Cell. 2024 Dec 12;187(25):7045-7063. doi: 10.1016/j.cell.2024.11.015. PMID: 39672099.

Abstract

The cell is arguably the most fundamental unit of life and is central to understanding biology. Accurate modeling of cells is important for this understanding as well as for determining the root causes of disease. Recent advances in artificial intelligence (AI), combined with the ability to generate large-scale experimental data, present novel opportunities to model cells. Here we propose a vision of leveraging advances in AI to construct virtual cells: high-fidelity simulations of cells and cellular systems under different conditions that are learned directly from biological data across measurements and scales. We discuss desired capabilities of such AI Virtual Cells, including generating universal representations of biological entities across scales and facilitating interpretable in silico experiments to predict and understand their behavior using Virtual Instruments. We further address the challenges, opportunities and requirements to realize this vision, including data needs, evaluation strategies, and community standards and engagement to ensure biological accuracy and broad utility. We envision a future where AI Virtual Cells help identify new drug targets, predict cellular responses to perturbations, and scale hypothesis exploration. With open science collaborations across the biomedical ecosystem, which includes academia, philanthropy, and the biopharma and AI industries, a comprehensive predictive understanding of cell mechanisms and interactions has come within reach.

Conflict of interest statement

Competing interests: C.B. and A.R. are employees of Genentech, a member of the Roche Group. A.R. has equity in Roche. A.R. was a co-founder and equity holder of Celsius Therapeutics, and is an equity holder in Immunitas. Until July 31, 2020, A.R. was an S.A.B. member of ThermoFisher Scientific, Syros Pharmaceuticals, Neogene Therapeutics and Asimov. A.R. is a named inventor on multiple filed patents related to single cell and spatial genomics, including for scRNA-seq, spatial transcriptomics, Perturb-Seq, compressed experiments, and PerturbView. E.L. is an advisor for the Chan-Zuckerberg Initiative Foundation. N.J.S. is an employee of EvolutionaryScale, PBC.

Figures

Figure 1: Capabilities of the AI Virtual Cell.
a. The AI Virtual Cell provides a Universal Representation of a cell state that can be obtained across species and conditions, and generated from different data modalities across scales (molecular, cellular, multicellular). b. The AI Virtual Cell possesses capabilities to represent and predict cell biology. This universality allows the representation to act as a reference that can generalize to previously unobserved cell states, providing guidance for future data generation. Since the representation is shared across modalities, it also remains invariant to the specific data type used to generate it, serving as a virtual representation for unified analysis across modalities. The AI Virtual Cell also allows modeling the dynamics of cells as they transition between different states, whether naturally due to processes such as differentiation or due to genetic variation or artificially through engineered perturbations. Thus, the AI Virtual Cell enables in silico experimentation that would otherwise be cost-prohibitive or impossible in a lab. c. The utility of the AI Virtual Cell depends on its interactions with humans at different levels. At the individual scientist level, it must be accessible through open licenses and the democratization of computing resources. Interpretability can be established through intermediary layers such as language models that allow the virtual cell to communicate its results effectively. At the scientific community level, evaluating the AI Virtual Cell should focus on core capabilities that move beyond narrow benchmarks. Community development will be crucial for ongoing improvements to the virtual cell that remain accessible. At the societal level, the AI Virtual Cell must ensure the privacy of its contents to protect sensitive data.
Figure 2: Overview of the AI Virtual Cell.
a. Similar to biological cells, b. the AI Virtual Cell models cell biology across different physical scales, including molecular, cellular, and multicellular. Along the physical dimension, the first scale models the state and interactions of individual molecules, such as those of the central dogma, as well as additional molecules like metabolites. Molecules can be represented as sequences or atomic structures. The next scale represents cells as collections of these molecules. For example, such cells contain a genetic sequence, RNA transcripts and some quantities of proteins. Molecules within cells have specific locations that may be related to their function. The final scale models the interactions between cells: how they communicate and form complex tissues. Each scale relies on Universal Representations (URs) that are learned from multi-modal data and integrate the URs from the previous scale. c. To capture the behavior and dynamics of physical cells, their components, or collections, d. the AI Virtual Cell comprises Virtual Instruments (VIs). On the cellular scale, for example, Manipulator VIs simulate how cell states change as cells divide, migrate, develop from progenitor states, or respond to perturbations through learned transitions in the URs. Decoder VIs decode the cell UR, e.g., to understand phenotypic properties.
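The roles described in the Figure 2 caption can be illustrated with a toy sketch. This is not the authors' implementation: every name below (UniversalRepresentation, encode_cell, manipulator_vi, decoder_vi) is hypothetical, and mean pooling, vector addition, and a dot product stand in for what would be learned neural models. It only shows the assumed shape of the interfaces: a cellular representation built from molecular ones, a Manipulator that maps a representation to a representation, and a Decoder that maps a representation to a readout.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class UniversalRepresentation:
    """Hypothetical stand-in for a UR: an embedding tagged with its scale."""
    scale: str                      # "molecular", "cellular", or "multicellular"
    embedding: Tuple[float, ...]


def encode_cell(molecules: Sequence[UniversalRepresentation]) -> UniversalRepresentation:
    """Integrate molecular URs into one cellular UR (assumption: mean pooling)."""
    if not molecules:
        raise ValueError("need at least one molecular representation")
    columns = zip(*(m.embedding for m in molecules))
    pooled = tuple(sum(col) / len(molecules) for col in columns)
    return UniversalRepresentation("cellular", pooled)


def manipulator_vi(cell: UniversalRepresentation,
                   shift: Sequence[float]) -> UniversalRepresentation:
    """Manipulator VI sketch: UR -> UR (assumption: a perturbation is an additive shift)."""
    moved = tuple(x + s for x, s in zip(cell.embedding, shift))
    return UniversalRepresentation(cell.scale, moved)


def decoder_vi(cell: UniversalRepresentation, weights: Sequence[float]) -> float:
    """Decoder VI sketch: UR -> scalar readout (assumption: a linear readout)."""
    return sum(x * w for x, w in zip(cell.embedding, weights))


# Two made-up molecular embeddings, pooled into a cell, perturbed, then decoded.
molecules = [
    UniversalRepresentation("molecular", (1.0, 3.0)),
    UniversalRepresentation("molecular", (3.0, 5.0)),
]
cell = encode_cell(molecules)
perturbed = manipulator_vi(cell, shift=(0.5, -1.0))
readout = decoder_vi(perturbed, weights=(2.0, 1.0))
```

Keeping the Manipulator's input and output the same type is the point of the sketch: it lets perturbations be chained before any Decoder is applied, which matches the caption's description of transitions taking place in the URs.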
