Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2017 Oct 23;8(1):1096.
doi: 10.1038/s41467-017-01076-4.

Dynamics of lineage commitment revealed by single-cell transcriptomics of differentiating embryonic stem cells

Affiliations

Dynamics of lineage commitment revealed by single-cell transcriptomics of differentiating embryonic stem cells

Stefan Semrau et al. Nat Commun. .

Abstract

Gene expression heterogeneity in the pluripotent state of mouse embryonic stem cells (mESCs) has been increasingly well-characterized. In contrast, exit from pluripotency and lineage commitment have not been studied systematically at the single-cell level. Here we measure the gene expression dynamics of retinoic acid driven mESC differentiation from pluripotency to lineage commitment, using an unbiased single-cell transcriptomics approach. We find that the exit from pluripotency marks the start of a lineage transition as well as a transient phase of increased susceptibility to lineage specifying signals. Our study reveals several transcriptional signatures of this phase, including a sharp increase of gene expression variability and sequential expression of two classes of transcriptional regulators. In summary, we provide a comprehensive analysis of the exit from pluripotency and lineage commitment at the single cell level, a potential stepping stone to improved lineage manipulation through timing of differentiation cues.

PubMed Disclaimer

Conflict of interest statement

The authors declare no competing financial interests.

Figures

Fig. 1
Fig. 1
Single-cell RNA-seq revealed an RA driven lineage transition of mESCs towards ectoderm- and XEN-like cells. a Scheme of the differentiation protocol with phase contrast images of cells growing in 2i/L (0 h) and after 96 h of exposure to 0.25 µM RA in N2B27 medium. b Principal component analysis of single-cell expression profiles of mESCs and cells after 96 h of RA exposure. Principal components were calculated across all cells and time points. Cells were placed in the space of the first two principal components (PC 1 and PC 2). Each data point corresponds to a single cell. Two robust clusters identified by k-means clustering and stability analysis are shown in red (ectoderm) and blue (XEN), respectively. mESCs are shown in orange. c t-SNE mapping of single-cell expression profiles. The single-cell RNA-seq data (SCRB-seq) for all cells and time points were mapped on a one-dimensional t-SNE space, which preserved local similarity between expression profiles, while reducing dimensionality. Each data point corresponds to a single cell. Data points for individual time points are shown in violin plots to reflect relative frequency along the t-SNE axis. The color of each data point indicates Rex1 expression (relative to maximum expression across all cells). For the 96 h time point, two robust clusters (found by k-means clustering and stability analysis) are indicated with red or blue edges, respectively. d Single-cell gene expression variability quantified as the variance over the mean (Fano factor). The Fano factor was calculated either for the whole population or subpopulations of cells defined by k-means clustering using 2,3 or 4 clusters. Clustering was carried out repeatedly and the Fano factors obtained for separate clusterings were averaged
Fig. 2
Fig. 2
mESCs showed gradual adoption and divergence of lineage specific expression profiles. a Principal component analysis of single-cell expression profiles. Principal components were calculated across all cells and time points. Cells measured at the indicated periods of RA exposure were placed in the space of the first two principal components. Each data point corresponds to a single cell. Cells were classified as mESC-like (orange), ectoderm-like (red) and XEN-like (blue). Classification was based on Pearson correlation between expression profiles of individual cells and mean expression profiles of mESCs at 0 h or ectoderm-like and XEN-like cells after 96 h of RA exposure. An individual cell is identified with the cell type with which it is most strongly correlated. b Same data as in a for three select time points (24 h, 36 h and 48 h), zoomed in on the areas indicated by dashed rectangles in a. c Relative frequencies of cells classified as mESC-like, ectoderm- or XEN-like in the same way as in a. d Average movement of ectoderm- and XEN-like cells in the principal component space during RA differentiation. The positions of cells of the same type were averaged at the indicated time points
Fig. 3
Fig. 3
Pseudo-temporal ordering of SCRB-seq data revealed gene expression dynamics around the exit from pluripotency. a Classification of all cells measured during the differentiation time course. Cells were classified according to the correlation of their expression profiles with the average expression of mESCs at 0 h or ectoderm- or XEN-like cells at 96 h. Cells were placed in the space of the first two principal components (PC1 and PC2). b Pseudo-time of cells in the ectoderm-like branch (mESC-like cells and ectoderm-like cells, left) or the XEN-like branch (mESC-like cells and XEN-like cells, right). Pseudo-time τ, which is indicated by color, is defined as τ = Rpluri −0.5*(Rect + Rxen) where Rpluri, Rect and Rxen are the Pearson correlations of an individual expression profile with the average expression of mESCs, ectoderm-like cells at 96 h and XEN-like cells at 96 h, respectively. c Relative frequencies of the three classes of cells with respect to pseudo-time. d Expression of a panel of marker genes for pluripotency, post-implantation epiblast, neuroectoderm and primitive endoderm with respect to pseudo-time. Cells were ordered by increasing pseudo-time and expression was averaged over 50 consecutive cells. Expression is presented as gene-wise z-score to accentuate temporal differences. e Average expression of the 4 sets of marker genes shown in d: pluripotency factors, post-implantation epiblast markers, neuroectoderm markers and extraembryonic endoderm markers
Fig. 4
Fig. 4
Differentiation with RA differed from the pathway observed in in vivo development. a Principal component analysis of a panel of pre- / peri-implantation tissues. The SCRB-seq gene expression profiles obtained during RA differentiation were placed in the space of the first two principal components. Each data point represents an individual cell. Data points are colored according to duration of RA exposure and cell type (at 96 h). ICM: inner cell mass, EPI: epiblast, PrE: primitive endoderm. b Expression of pluripotency, post-implantation epiblast and primitive endoderm marker genes in subpopulations defined by CD24 and PDGFRA expression. Cells were sorted on PDGFRA and CD24 antibody staining by FACS before RNA extraction at the indicated periods of RA exposure (48 h, 72 h and 96 h) and bulk RNA-seq. At 48 h, PDGFRA- cells were sorted by quartiles of CD24 expression, at 72 h cells were sorted by PDGFRA expression and terciles of CD24 expression. Expression of post-implantation epiblast markers at 48 h is highlighted with a dashed white box. c Identity of bulk RNA-seq samples as determined by the KeyGenes algorithm. A panel of pre- / peri-implantation tissues was used as the training set. A high identity (id) score corresponds to a high confidence about tissue identity. (MOR: morula, ICM: inner cell mass, EPI: epiblast, PrE: primitive endoderm). The identity scores for the epiblast tissues at 48 h are highlighted by a white dashed box
Fig. 5
Fig. 5
Susceptibility to signaling inputs was highly dynamic around the exit from pluripotency. ac Fractions of cells classified as XEN-like, ectoderm-like, double positive and double negative after 96 h, based on CD24 and PDGFRA expression. Expression of the two markers was measured by antibody staining and flow cytometry. a Cells were pulsed with 0.25 µM RA for x h (pulse) and subsequently differentiated in basal medium (N2B27) complemented with an RA receptor antagonist. b Cells were incubated with 0.25 µM RA for x h (pulse) after which 0.5 µM PD0325901 (MEK inhibitor) was added for the remainder of the time course. c Cells were first incubated with basal medium (N2B27) for x h (delay) and then exposed to 0.25 µM RA for the remainder of the time course. d Schematic representation of a minimal gene regulatory network that can model a lineage decision. Pointy arrows indicate (auto-)activation; blunted arrows indicate repression. E and X represent expression of ectoderm-like and XEN-like transcriptional programs, respectively. P stands for the pluripotency network. RA increases the auto-activation of the XEN program. e Results of the stochastic simulations of the network shown in d. The relative frequency of XEN-like cells after 96 h is shown vs. the length of an RA pulse (solid lines) or the length of the delay before RA exposure is started (dashed lines). In all cases the pluripotency network was turned off after 12 h. Simulations were run with different amounts of gene expression noise (D: noise power / time, see Methods). See Supplementary Fig. 8b for exemplary trajectories
Fig. 6
Fig. 6
Distinct co-expression and correlation patterns identified two classes of lineage specific transcriptional regulators. a Expression of transcriptional regulators in ectoderm-like and XEN-like cells identified in the SMART-seq2 data set. Genes that were significantly differentially expressed after 48 h of RA exposure are shown in red or pink (overexpressed in ectoderm-like cells) and blue or cyan (overexpressed in XEN-like cells), respectively. The two panels contain genes, which are present in the pluripotent state (early, left panel) or absent in the pluripotent state (late, right panel). A list of all identified genes is given in Supplementary Fig. 9b. b Co-expression of transcriptional regulators in the pluripotent state. The gene set comprised the differentially expressed transcriptional regulators identified here (see a), as well as pluripotency related transcription factors (see Supplementary Fig. 9b). Co-expression was calculated using gene expression measured by SMART-seq2. Co-expression of two genes was quantified as the fraction of cells in which the expression of both genes exceeded a certain threshold value (see Methods). c Co-expression network in the pluripotent state. Two genes are connected by an edge if their co-expression exceeds 0.8. The gene set comprised XEN specific regulators (cyan nodes) and ectoderm specific regulators (pink nodes) that are expressed in the pluripotent state (early factors), as well as pluripotency factors (orange nodes). The radius of solid nodes is proportional to the number of connections to other nodes. Nodes without any connections are depicted as open nodes. d Pearson correlation between transcriptional regulators after 48 h of RA exposure. The gene set is the same as in b. Pearson correlation was calculated using gene expression measured by SMART-seq2
Fig. 7
Fig. 7
smFISH confirmed distinct expression patterns of exemplary transcription factors. a Fluorescence images of smFISH for Gbx2 and Tbx3 in mESCs (0 h) and after 72 h RA exposure. Each diffraction limited dot corresponds to a single mRNA molecule. Hoechst staining of nuclei is shown in blue. b Fluorescence images of smFISH for Pax6 and Gata6 in mESCs (0 h) and after 96 h RA exposure. Each diffraction limited dot corresponds to a single mRNA molecule. Hoechst staining of nuclei is shown in blue. c Scatter plots of the number of Gbx2 and Tbx3 mRNAs per cell measured by smFISH. Each data point is a single cell. Color indicates the local density of data points. The number of shown cells measured at a certain time point ranges between 224 and 983. d Scatter plots of the number of Pax6 and Gata6 mRNAs per cell measured by smFISH. Each data point is a single cell. Color indicates the local density of data points. The number of shown cells measured at a certain time point ranges between 293 and 570. e Distribution of the Tbx3 and Gbx2 transcripts in individual mESCs as measured by smFISH. Both data sets are fit by a Gamma distribution (Tbx3, R2 = 0.94, solid blue line; Gbx2, R2 = 0.99, solid red line). f Scatter plots of the number of mRNAs per cell for Gbx2 and Tbx3 vs Pax6 measured by smFISH. Each data point is a single cell. Cells were exposed to RA for 12 h, 24 h, 48 h and 72 h, respectively, as indicated above each column of panels. The number in each panel is the Pearson correlation between the genes plotted in the respective panel
Fig. 8
Fig. 8
Transcriptional signatures of the lineage decision phase

References

    1. Cohen DE, Melton D. Turning straw into gold: directing cell fate for regenerative medicine. Nat. Rev. Genet. 2011;12:243–252. doi: 10.1038/nrg2938. - DOI - PubMed
    1. Soldner F, Jaenisch R. iPSC Disease Modeling. Science. 2012;338:1155–1156. doi: 10.1126/science.1227682. - DOI - PubMed
    1. Tabar V, Studer L. Pluripotent stem cells in regenerative medicine: challenges and recent progress. Nat. Rev. Genet. 2014;15:82–92. doi: 10.1038/nrg3563. - DOI - PMC - PubMed
    1. Semrau S, van Oudenaarden A. Studying lineage decision-making in vitro: emerging concepts and novel tools. Annu. Rev. Cell. Dev. Biol. 2015;31:317–345. doi: 10.1146/annurev-cellbio-100814-125300. - DOI - PubMed
    1. Balázsi G, van Oudenaarden A, Collins JJ. Cellular decision making and biological noise: from microbes to mammals. Cell. 2011;144:910–925. doi: 10.1016/j.cell.2011.01.030. - DOI - PMC - PubMed

Publication types