Review. Trends Cell Biol. 2018 Dec;28(12):1030-1048.
doi: 10.1016/j.tcb.2018.09.002. Epub 2018 Oct 8.

Towards a Quantitative Understanding of Cell Identity

Zi Ye et al. Trends Cell Biol. 2018 Dec.

Abstract

Cells have traditionally been characterized using expression levels of a few proteins that are thought to specify phenotype. This requires a priori selection of proteins, which can introduce descriptor bias, and neglects the wealth of additional molecular information nested within each cell in a population, which often makes these sparse descriptors qualitative. Recently, more unbiased and quantitative cell characterization has been made possible by new high-throughput, information-dense experimental approaches and data-driven computational methods. This review discusses such quantitative descriptors in the context of three central concepts of cell identity: definition, creation, and stability. Collectively, these concepts are essential for constructing quantitative phenotypic landscapes, which will enhance our understanding of cell biology and facilitate cell engineering for research and clinical applications.

Keywords: cell phenotype; cellular decision making; computational modeling; high-throughput data analysis; network biology; phenotypic landscape.

Figures

Figure 1. Construction of temporal cell trajectories using mechanistic and data-driven models
(A) Flowchart for construction of mechanistic models of a cellular process. (B) Example of erythroid lineage-commitment model with two non-cooperative positive feedback loops from active transcription factor GATA1*, creating more of its inactive self (GATA1) and more erythropoietin receptor (EpoR). The solid arrows represent binding/activation steps and the dashed arrows represent upregulation via protein synthesis. This network creates bistability with respect to erythropoietin (Epo) concentration, enabling robust binary decision making. (C) A phase plane showing the trajectories of nine cells, each with different initial concentrations of GATA1 and EpoR. Cells with sufficiently high concentrations of GATA1 and/or EpoR can differentiate along the shown trajectories to a committed state in the top right corner; by contrast, cells with sub-threshold levels of these critical factors peter out and are unable to commit. (D) The specific application of part A to the erythroid lineage-commitment model in parts B and C. (E) Flowchart for construction of data-driven models of a cellular process. The pre-processing step is sometimes incorporated in the algorithm; if not, the data should be normalized and filtered as appropriate for the specific data and application. During graphing, clustering is an optional step. Some models perform dimension reduction before trajectory plotting, while others perform these two steps simultaneously. The sub-steps listed under each category are not necessarily in sequential order; the actual order is determined by the specific algorithm (see Table 2). (F) A data-driven model typically uses a large dataset to construct an interaction network that is implicated in cell conversion, but the linkages and network structure are generally based on correlation, not biological mechanism. (G) A trajectory of differentiating human skeletal muscle myoblasts generated using Slingshot. 
The dimension reduction method used is principal component analysis (PCA), and the axes are the first principal component (PC1) and the second principal component (PC2). (H) The specific application of part E to myoblast differentiation using Slingshot. The data are visualized in two dimensions using k-means clustering and PCA, and the trajectory is overlaid on these data (light blue line in part G) using a minimum spanning tree (MST).
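
The cluster-then-connect idea in parts G and H can be illustrated with a minimal sketch. It is not the Slingshot implementation: it assumes the cells have already been projected onto PC1 and PC2, uses a plain Lloyd's k-means with hand-picked starting centers, and connects the cluster centroids with Prim's algorithm. The coordinates, the function names (`kmeans`, `minimum_spanning_tree`, `lineage_order`), and the choice of cluster 0 as the starting state are all hypothetical.

```python
import math

def kmeans(points, centers, n_iter=20):
    """Plain Lloyd's k-means on 2D points, starting from the given centers."""
    centers = [tuple(c) for c in centers]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        labels = [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Move each center to the mean of its members (unchanged if empty).
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centers

def minimum_spanning_tree(nodes):
    """Prim's algorithm on the complete Euclidean graph; returns edges (i, j)."""
    in_tree = {0}
    edges = []
    while len(in_tree) < len(nodes):
        # Cheapest edge from the current tree to a node outside it.
        i, j = min(((a, b) for a in in_tree
                    for b in range(len(nodes)) if b not in in_tree),
                   key=lambda e: math.dist(nodes[e[0]], nodes[e[1]]))
        edges.append((i, j))
        in_tree.add(j)
    return edges

def lineage_order(edges, start):
    """Order clusters by a depth-first walk along the tree from `start`."""
    adjacency = {}
    for i, j in edges:
        adjacency.setdefault(i, []).append(j)
        adjacency.setdefault(j, []).append(i)
    order, stack, seen = [], [start], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        stack.extend(sorted(adjacency.get(node, []), reverse=True))
    return order

# Hypothetical (PC1, PC2) coordinates for nine cells in three loose groups.
cells = [(0.0, 0.0), (0.5, 0.2), (-0.3, 0.1),
         (5.0, 1.0), (5.2, 0.8), (4.9, 1.1),
         (10.0, 0.0), (10.3, -0.2), (9.8, 0.1)]

labels, centers = kmeans(cells, centers=[(0, 0), (5, 1), (10, 0)])
edges = minimum_spanning_tree(centers)
order = lineage_order(edges, start=0)  # cluster 0 assumed to be the progenitor state
```

In this toy case the tree is a simple chain, so `order` lists the clusters from the assumed starting state to the terminal one. Real trajectory tools typically add steps this sketch omits, such as smoothing the path through the cells and assigning each cell a pseudotime.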
Figure 2. Steps for generating a landscape model.
(A) The selection of key elements is an optional step that is commonly used to save computational power and reduce noise. The z-axis, which is the cell stability or potential, is the key calculation that distinguishes different models (see Table 3). For dimension reduction, principal component analysis (PCA), independent component analysis (ICA) and multidimensional scaling (MDS) are commonly used linear methods while t-distributed stochastic neighbor embedding (t-SNE) and diffusion maps are popular non-linear methods. (B) In this depiction of a cell phenotype landscape, the z-axis represents cell stability or potential (see Table 3). A lower z-value corresponds to greater stability, with local minima representing stable (or metastable) states. The x-y plane represents cell identity, as quantified by algorithms based on epigenomic, transcriptomic, and/or proteomic profiles (see Table 1), and changes in position along the landscape are quantified by trajectory models (see Table 2).
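
As a rough illustration of how a z-value might be attached to positions in the x-y plane, the sketch below uses one common choice, a density-based quasi-potential U = -ln(P), where P is the fraction of cells in each grid bin. This is an assumption made for illustration only; the caption notes that the z-axis calculation differs between models (see Table 3). The coordinates, the bin width, and the function name `quasi_potential` are hypothetical.

```python
import math
from collections import Counter

def quasi_potential(coords, bin_width=1.0):
    """Map each occupied grid bin to U = -ln(P), P = fraction of cells in the bin."""
    counts = Counter((math.floor(x / bin_width), math.floor(y / bin_width))
                     for x, y in coords)
    total = len(coords)
    return {grid_bin: -math.log(n / total) for grid_bin, n in counts.items()}

# Hypothetical 2D cell-identity coordinates (for example, after dimension reduction).
cells = [(0.1, 0.2), (0.4, 0.7), (0.8, 0.3), (0.5, 0.5),  # 4 cells in bin (0, 0)
         (3.2, 3.1), (3.6, 3.8), (3.3, 3.5),              # 3 cells in bin (3, 3)
         (1.5, 1.5)]                                      # 1 cell in bin (1, 1)

landscape = quasi_potential(cells)
# The lowest z-value marks the most densely populated, most stable bin.
most_stable = min(landscape, key=landscape.get)
```

Here the densely occupied bins come out as local minima and the sparsely occupied bin between them sits higher, matching the caption's convention that a lower z-value corresponds to greater stability.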
