Nature. 2017 Nov 2;551(7678):100-104.
doi: 10.1038/nature24454. Epub 2017 Oct 25.

Single-cell transcriptomics reconstructs fate conversion from fibroblast to cardiomyocyte

Ziqing Liu et al. Nature. 2017.

Abstract

Direct lineage conversion offers a new strategy for tissue regeneration and disease modelling. Despite recent success in directly reprogramming fibroblasts into various cell types, the precise changes that occur as fibroblasts progressively convert to the target cell fates remain unclear. The inherent heterogeneity and asynchronous nature of the reprogramming process renders it difficult to study this process using bulk genomic techniques. Here we used single-cell RNA sequencing to overcome this limitation and analysed global transcriptome changes at early stages during the reprogramming of mouse fibroblasts into induced cardiomyocytes (iCMs). Using unsupervised dimensionality reduction and clustering algorithms, we identified molecularly distinct subpopulations of cells during reprogramming. We also constructed routes of iCM formation, and delineated the relationship between cell proliferation and iCM induction. Further analysis of global gene expression changes during reprogramming revealed unexpected downregulation of factors involved in mRNA processing and splicing. Detailed functional analysis of the top candidate splicing factor, Ptbp1, revealed that it is a critical barrier for the acquisition of cardiomyocyte-specific splicing patterns in fibroblasts. Concomitantly, Ptbp1 depletion promoted cardiac transcriptome acquisition and increased iCM reprogramming efficiency. Additional quantitative analysis of our dataset revealed a strong correlation between the expression of each reprogramming factor and the progress of individual cells through the reprogramming process, and led to the discovery of new surface markers for the enrichment of iCMs. In summary, our single-cell transcriptomics approaches enabled us to reconstruct the reprogramming trajectory and to uncover intermediate cell populations, gene pathways and regulators involved in iCM induction.

Conflict of interest statement

Author Information

The authors declare no competing financial interest. Readers are welcome to comment on the online version of the paper.

Figures

Extended Data Figure 1. Experimental design, analysis pipeline, quality control and normalization for single-cell RNA-seq
(a) Experimental workflow. Hearts were isolated from P1.5 neonatal mice and cells were dissociated by enzymatic digestion. Thy1+ cells were then purified by MACS and plated overnight. The adherent cells (CF) were then transduced with retroviruses encoding the reprogramming factors M, G, and T or DsRed, or left untransduced (Mock). On day 3 post transduction, cells were trypsinized and live/dead stained. Additionally, for some experiments designed to examine the relative mouse RNA abundance in cells receiving different treatment, Mock/M+G+T cells were labeled with a green cell tracker CFSE, FACS-sorted for live cells, and then mixed at a designated ratio with FACS-sorted live DsRed+ cells from a parallel DsRed transduction. The single cell suspension was loaded onto a medium size chip (10-17 μm) and single cells were captured on a Fluidigm C1 machine. Brightfield and for some experiments, fluorescent images, were taken for all capture sites. Individual cDNA libraries for each cell were prepared in situ by RT with pre-amplification after adding RNA spike-in. Brightfield and/or fluorescent images for each capture site were examined and libraries from nests with 0 or multiple cells were excluded from downstream analysis. Illumina libraries were then prepared for each cell, pooled, quality-checked and sequenced on Hiseq 2500. (b) Design of the seven independent single-cell RNA-seq experiments including treatment, RNA spike-ins, and Fluidigm chips used. (c) Data analysis pipeline. Barcodes were trimmed off from RNA-seq raw reads and the quality of these reads was confirmed with fastqc. High quality reads were mapped to the mm10 genome with Tophat2 and counted with Htseq-count. Outliers were removed as described in (d). The raw counts were normalized first to technical and biological size factors within each experiment using DEseq and then to expt size factors calculated based on relative mouse mRNA abundance in cells receiving different treatments (h). 
Residual batch effects between experiments receiving the same treatment were removed using ComBat. Cell grouping and modeling were then performed using the normalized gene counts with PCA, HC, SLICER and more. The most important three quality control steps were labeled in red in (a) and (c). The above strict quality control criteria ensured that only high-quality and biologically meaningful data from healthy single cells were analyzed. (d) For each of the seven single cell experiments, percentage of reads mapped to spike-in in each cell was plotted against percentage of reads mapped to mouse genome (left panel) or mouse mRNA (right panel) in that cell. Cells outside of the red circles were outliers. (e) For each of the five single cell experiments that contained ERCC spike-in, the average count number of each ERCC spike-in was plotted against its concentration in the lysis mix A (see Fluidigm’s protocol for details). Linear regression coefficients (R value) and their corresponding p values (two-sided, α=0.05) are shown. The results showed a dynamic range (~10⁵) of ERCC concentration that covers the full spectrum of mouse gene expression levels. The high R values indicate strong correlation between hypothetical molecular concentrations and measured gene counts in our experiments. (f) Squared coefficients of variation (CV²) were plotted against average expression of ERCC spike-ins (left) or mouse genes (right) for experiments containing ERCC spike-in. (g) DsRed counts in expt E3 and E4-E7 plotted against Mef2c counts and/or total mouse mRNA counts after normalization to technical and biological size factors within each experiment (Methods). Cells in the four experiments were classified as DsRed-transduced (E3R, E5R, E6R, E7R), M+G+T-transduced (E5M, E6M), or untransduced cells (E3U, E7U) based on these plots. 
(h-i) Normalization to experiment size factors to account for technical contributions to expt-to-expt variations such as varied capture efficiency while retaining biological variations such as differences in total mRNA abundance in cells receiving different treatments. (h) Median total mouse mRNA counts were calculated for each treatment in each experiment and average mRNA counts were compared between different treatments in one experiment (E3, E4-E7) with two-sided student’s t test (α=0.05). Experiment size factors were calculated based on the ratio of median mRNA counts between different treatments. After normalization to the expt size factors, the median mRNA count equals 1,000,000 for uninfected and DsRed-transduced cells and 616,136 for M+G+T-transduced cells. (i) PCA of two biological replicates E5 and E6 that have different sequencing depth/cell due to different capture efficiencies before (left) and after (right) normalization to experiment size factors. Top 400 PCA genes were used.
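The idea behind the experiment size factors in (h) can be illustrated with a minimal sketch. It assumes that every cell in one experiment is scaled by a single shared factor chosen so that the control median reaches 1,000,000; the function names and the per-cell totals are hypothetical and are not taken from the paper's pipeline.

```python
from statistics import median

def experiment_scale_factor(control_totals, target_median=1_000_000):
    """Return the factor that moves the control median total count to target_median."""
    return target_median / median(control_totals)

def normalize_experiment(totals_by_treatment, control="control", target_median=1_000_000):
    """Scale every cell in one experiment by the same factor.

    A single shared factor removes experiment-level technical differences
    (for example capture efficiency) but keeps the ratio between treatments.
    """
    factor = experiment_scale_factor(totals_by_treatment[control], target_median)
    return {
        treatment: [total * factor for total in totals]
        for treatment, totals in totals_by_treatment.items()
    }

# Hypothetical per-cell total mRNA counts for one experiment (not real data).
experiment = {
    "control": [400_000, 500_000, 600_000],
    "MGT": [250_000, 300_000, 350_000],
}
normalized = normalize_experiment(experiment)
control_median = median(normalized["control"])
mgt_median = median(normalized["MGT"])
print(control_median, mgt_median, mgt_median / control_median)
```

Because the same factor is applied to both treatments, the ratio of treatment medians is unchanged by the scaling, which is the property the legend describes as retaining biological variation.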
Extended Data Figure 2. Normalization and outlier removal of single-cell RNA-seq data (continued)
(a-c) Removal of batch effects using ComBat on non-immune (described in Extended Data Fig. 3) cells. PCA of all batches of M+G+T-transduced (a), DsRed-transduced (b) or uninfected (c) cells before (left panel) and after (right panel) ComBat normalization. (d-g) After ComBat normalization, outliers in each treatment group were further removed by examination of the average gene expression level of each cell (box plot, d) and PCA (e). Uninfected cells were shown as an example. A total of 454 healthy non-immune cells were left and further analyzed. (f) Pairwise comparison of average mouse gene expression between different experiments and treatment conditions. Correlation coefficient calculated by linear regression was shown. (g) Heatmap colored by correlation coefficient in (f). Strongest correlation was seen within each treatment group. DsRed-infected and uninfected cells also showed strong inter-treatment correlation. M+G+T-transduced cells showed relatively low correlation with DsRed-transduced and uninfected cells.
Extended Data Figure 3. Single cell analysis identified a sub-population of immune-like cells
Data from 513 control or reprogramming CF single cells and bulk RNA-seq data of neonatal CFs and CMs were analyzed with PCA. To identify groups of related cells and genes, top 400 genes with highest loadings in the first three principal components were then analyzed by unsupervised HC (a) and PCA (b, c). Representative genes in each of the four gene clusters identified by HC were listed to the right of HC heatmap. Interestingly, in addition to CM, fibroblast, and cell cycle genes, immune response genes were identified as the other major gene cluster. (b) PCA loading plot showing four major gene clusters. (c) PCA score plot. Both HC and PCA results showed that bulk CF and CM data were very close in distance and both of them were clustered together with single cells expressing high levels of immune genes. (d) Table of markers for major immune cell lineages. (e) Violin plots for the expression of major immune cell lineage markers in bulk CFs, bulk CMs, immune-like single cells, and other single cells. (f) Expression of the macrophage marker Cd14 and the dendritic cell marker Cd11c in each immune-like cell showed that 42 cells express macrophage markers and 3 express dendritic cell markers, with 4 cells expressing both. These data suggest that the immune-like cells are likely cardiac resident immune cells which also express CF markers such as Thy1. Although follow-up work to delineate the potential of these immune cells to be reprogrammed into iCMs will be of great interest, it is not the focus of this study. Therefore, for all following analyses on single cell data, we focused on the non-immune CFs.
Extended Data Figure 4. Cell grouping, representative gene expression, and genes detected in single-cell RNA-seq data and population-based gene expression profiling along reprogramming
Related to Fig. 1a-h. (a) PCA scree plot showing variance of top 10 PCs. Related to Fig. 1a-c. (b) Violin plots showing the expression levels of representative cardiac, fibroblast and cell cycle genes in the 7 cell groups identified by HC and PCA (Fig. 1 a-c). (c) Fig. 1a GO analysis with p values of each presented GO term. (d-g) Determination of the proliferation status of each single cell using genes periodically expressed in cell cycle that were identified in a previous report. (d) 3D LLE plot. (e) Frequency of cells on LLE component 3. The dark red plane in (d) and the red dotted line in (e) indicate the threshold for proliferating (Pro) and nonproliferating (NP) cells. (f-g) PCA plots as in Fig. 1c but color- and shape-coded by Pro and NP (f) or CCA and CCI (g). (h-i) tSNE plots of all single cells color- and shape-coded by HC/PCA cell groups (h) or Pro/NP cell groups (i). The cells that were grouped as iFib, piCM, or iCM constituted 30.6% (77/252), 24.6% (62/252), and 44.8% (113/252) of all cells transduced with M+G+T, respectively. In contrast to previous population- and marker-based studies, our single-cell RNA-seq data suggests that the fate conversion from fibroblast to iCM occurs rapidly (~ 3 days) with nearly 45% of the CF cells exhibiting transcriptomic signatures indicative of a cardiac fate. (j) Live fluorescent images of day 5 MGT-transduced CFs showing co-expression of αMHC-GFP and Thy1 (surface labeling). Double positive cells were labeled with *. All under 40×. Scale bar = 100 μm. (k) αMHC-GFP+Thy1+ and αMHC-GFP-Thy1+ cells were FACS-sorted from day 7 MGT-transduced CFs and expression of representative cardiac (Myl4/Actc1) and fibroblast (Col3a1/Postn) markers were determined by qRT-PCR. Day 7 mock-transduced cells were included as control. Average ± SD were shown. n = 4 samples. One-way ANOVA followed by Bonferroni correction (two-sided): ** p<0.01, *** p<0.001, ns, lack of enough evidence for significance. 
Myl4 and Actc1 expression increased 80-100 fold and reached approximately the same level as Gapdh in αMHC-GFP+/Thy1+ cells compared to mock transduction. Expression level of the fibroblast marker Postn was maintained at a high level in GFP+Thy1+ cells. For another fibroblast marker, Col3a1, although its relative expression in GFP+Thy1+ cells was decreased compared to Mock-transduced and GFP-Thy1+ cells, its absolute expression was still high relative to Gapdh (~1.4 fold of Gapdh). The data strongly support the existence of CM- and fibroblast-marker double-positive piCM and suggest that piCM represents an intermediate cell population transitioning from iFib to iCM or locked between iFib and iCM during reprogramming. (l) To determine if iCMs may be differentiated from rare cardiac stem/progenitor cells, we plotted the expression of cardiac stem/progenitor markers in each of the HC/PCA single cell groups using violin plots. All of these markers were nearly undetectable in Fib, iFib, piCM and iCM, suggesting direct conversion from CF to iCM without going through a stem/progenitor stage. (m) Distribution of gene expression levels in single cells. Average ± SEM were shown. n = 454 cells. Limit of gene detection was set to 1 based on this plot. (n) Distribution of number of genes detected in all, CCI or CCA single cells. Comparison of the distributions in CCI and CCA cells with two-sample Kolmogorov-Smirnov test resulted in a one-sided p-value of 5.248e-11, suggesting that the number of genes in CCI is significantly smaller than that of CCA. Based on this result, only CCI cells were used in (o). (o) Distribution of number of genes detected in each CCI cell group shown by histogram. One-sided two-sample Kolmogorov-Smirnov test (p values: 0.00521 for iFib vs Fib, 0.00481 for piCM vs iFib, and 1.104e-6 for iCM vs piCM) suggests that the number of genes expressed decreased when the cells adopted the iCM fate. 
This observation demonstrates a dynamic re-patterning of transcription machinery during reprogramming and is consistent with HC analysis and experimental evidence that piCMs co-expressed both cardiac and fibroblast markers, further indicating that piCM constitutes an intermediate population during iCM reprogramming. (p-v) Population-based gene expression profiling of reprogramming CFs at day 0, 3, 5, 7, 10, and 14. (p-q) Results from PCA analysis using all genes were similar to those using top 400 genes (r-v). (p) Scree plot of top 10 PCs. (q) 3D PCA score plot. (r-v) Analyses with top 400 PCA genes. Related to Fig. 1 g-h. (r) Scree plot of top 10 PCs. (s) PCA score plot using PC1 and PC3. (t) HC identified four major gene clusters: gradually upregulated along reprogramming (Red, mainly cardiac genes), downregulated in MGT-transduced compared to lacZ-transduced (blue, mainly ECM genes), and gradually upregulated (light grey)/downregulated (dark grey) in both lacZ and MGT cells (culture or viral effects, mainly ECM and immune response genes). The results were consistent with the expression of representative genes selected from single cell data (Fig. 1h) showing gradually increased expression of CM markers along reprogramming, first increased and then decreased expression of cell cycle genes in both MGT and lacZ cells, and significantly lower fibroblast markers in MGT compared to lacZ cells at each time point. (u) Heatmap showing the loading of the genes in (t) on PC1, 2 and 3. Upregulated (cardiac) genes are highly weighted in PC1, and the other three gene clusters are highly weighted in PC2 and PC3. The results are consistent with Fig. 1g and (s). (v) GO analysis of the four gene clusters in (t) showing GO terms and their corresponding p-values (listed on the right).
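The one-sided two-sample Kolmogorov-Smirnov comparisons in (n) and (o) rest on the largest gap between two empirical distribution functions. The sketch below computes only the test statistics, not p values, and uses made-up gene counts; it is an illustration of the general technique rather than the authors' code.

```python
from bisect import bisect_right

def ecdf(sorted_sample, x):
    """Fraction of observations in sorted_sample that are <= x."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)

def ks_statistics(a, b):
    """Return (d_plus, d_minus, d) for samples a and b.

    d_plus  = max over x of ECDF_a(x) - ECDF_b(x); large when a tends to be smaller.
    d_minus = max over x of ECDF_b(x) - ECDF_a(x); large when b tends to be smaller.
    d       = two-sided statistic, the larger of the two.
    """
    a_sorted, b_sorted = sorted(a), sorted(b)
    points = sorted(set(a_sorted) | set(b_sorted))
    diffs = [ecdf(a_sorted, x) - ecdf(b_sorted, x) for x in points]
    d_plus = max(max(diffs), 0.0)
    d_minus = max(-min(diffs), 0.0)
    return d_plus, d_minus, max(d_plus, d_minus)

# Hypothetical numbers of detected genes per cell (not real data).
genes_group_a = [4200, 4500, 4800, 5000]
genes_group_b = [5200, 5600, 6000, 6400]
d_plus, d_minus, d = ks_statistics(genes_group_a, genes_group_b)
print(d_plus, d_minus, d)
```

A one-sided test of "group a has fewer genes than group b" would use `d_plus`, since a sample with smaller values has an empirical distribution function that rises earlier.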
Extended Data Figure 5. Inhibition of cell proliferation or cell cycle synchronization promotes iCM reprogramming
Related to Fig. 1i-p. (a) Comparison of the ratio of CCA: CCI cells in the three treatment groups: uninfected, DsRed-infected, and M+G+T-infected. Chi-square test suggests that proliferation states were not significantly different among the treatment groups at day 3. (b, c) Knockdown (KD) efficiency of shRNAs (b) or overexpression (OE) levels (c) of cell cycle-related genes were determined by qRT-PCR on day 4 lentiviral transduced cells. shNT, non-targeting control shRNA. Average ± SD was shown. n = 3 samples. (d-i) Cell cycle staging of CF cells simultaneously transduced with reprogramming factors and shRNA (e, g) or OE (f, h) constructs by PI staining. (e-f) Flow cytometry histogram of PI staining intensity. (g-i) Percentages of cells in G0/G1, S, or G2/M phases were calculated based on (e) and (f). (i) Summary of (g) and (h). (j-m) Measurement of DNA synthesis in CF cells simultaneously transduced with reprogramming factors and shRNA (k) or OE (l) constructs by EdU incorporation assay followed by flow cytometry. dMFI: delta median EdU fluorescence intensity between EdU+ cells and EdU− cells. (m) Summary of (k) and (l). Constructs that dramatically decreased or increased cell proliferation were labeled in red in (g-i) and (k-m) and were used for experiments in (n-s). (n-s) The impact of manipulation of cell proliferation through KD/OE of cell cycle-related genes on iCM reprogramming. Reprogramming factors were introduced by lentiviral vectors instead of retroviral vectors to avoid retroviral infection bias of proliferating cells. CF were simultaneously transduced with lentiviral M/G/T (n-p) or inducible MGT (iMGT, q-s) and lentiviral KD/OE constructs that dramatically decreased (o, r) or increased (p, s) cell proliferation. Percentages of αMHC-GFP+ and cTnT+ cells were quantified by flow cytometry. (t-z) The impact of large T transduction on iCM reprogramming. CFs were simultaneously transduced with reprogramming factors and lentiviral large T. 
After 10 days, αActinin+/cTnT+ cells were immunostained, imaged, and quantified by counting randomly selected 20× fields from multiple repeated experiments (u-x). Both percentages of positive cells per field (v, x) and numbers of + cells per field (w) were quantified. Percentages of cells showing sarcomere structure in αActinin+ cells were also quantified (y-z). The percentage of αActinin+ cells that show sarcomere structures decreased from 50% to 0% upon large T transduction and accelerated proliferation. (u, y) Representative images under 40× with hoechst nuclear staining. Scale bar = 100 μm. (o-z) Average ± SEM was shown. (o-s) n = 4 samples. (v, w, z) n = 20 images. (x) n = 10 images. (b, c, w-z) Two-sided student’s t test. (o-s) One-way ANOVA followed by Bonferroni correction (two-sided). Significance: * p<0.05, ** p<0.01, *** p<0.001.
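The chi-square comparison of CCA:CCI ratios across the three treatment groups in (a) can be sketched as a test of independence on a 3 × 2 table. The cell counts below are hypothetical placeholders, and the closed-form p value is valid only for 2 degrees of freedom, which is what a 3 × 2 table gives.

```python
import math

def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

def p_value_two_dof(stat):
    """Upper-tail p value for exactly 2 degrees of freedom: exp(-stat / 2)."""
    return math.exp(-stat / 2)

# Hypothetical [CCA, CCI] cell counts per treatment group (not real data).
counts = [
    [40, 60],  # uninfected
    [45, 55],  # DsRed-infected
    [50, 50],  # M+G+T-infected
]
stat, dof = chi_square_statistic(counts)
p_value = p_value_two_dof(stat)
print(round(stat, 4), dof, round(p_value, 3))
```

With these placeholder counts the p value is well above 0.05, which is the kind of outcome the legend describes as proliferation states not differing significantly among groups.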
Extended Data Figure 6. Heterogeneity of our isolated CFs (Thy1+ non-immune non-myocyte cardiac cells) and stepwise suppression of non-cardiomyocyte lineages during iCM reprogramming
Related to Fig. 2. (a-c) Limited transcriptome change by retrovirus transduction. To determine whether introduction of viruses could influence cellular identities of CF, molecular features of the uninfected and DsRed-transduced cells were compared and only 25 genes were differentially expressed (ANOVA p value < 0.05), many of which related to immune response (data not shown), suggesting that uninfected and viral-infected CFs shared very similar gene expression profiles. (a) PCA of the control cells from expt E3 as shown in Fig. 2b but color-coded by treatment. The results showed that uninfected and DsRed-transduced cells were indistinguishable by PCA, suggesting limited global transcriptome changes by retroviral transduction. (b) Violin plots showing the expression of representative CM, fibroblast, and cell cycle genes in uninfected- and DsRed-transduced CF. Retroviral transduction does not affect the expression of these genes. (c) Same as (a) but with all control CF from expt E3, E5R, E6R, and E7. Based on results from (a-c), we concluded that retroviral transduction does not influence cellular identities of CF and therefore we analyzed control CFs containing both uninfected and DsRed-transduced cells together in Fig. 2a-b. (d) Fig. 2a GO analysis showing p values of each presented GO terms. (e) Violin plots showing the expression of additional non-myocyte lineage markers in CF. Related to Fig. 2a. (f) PCA analysis of control CF from all four expts (E3, E5R, E6R, and E7) with cells color-coded by non-myocyte lineage groups. Related to Fig. 2b. (g) Immunostaining of Thy1 and αSMA, or Thy1 and CD31 in day 7 explant CF culture. Images taken at 20×. Scale bar = 200 μm. (h-i) Representative flow cytometry plots (h) and quantification (i) of αSMA+ and CD31+ cells in Thy1+ cells. There were 72.6% αSMA+ and 9% Cd31+ CFs, consistent with the single-cell RNA-seq data in Fig. 
2a showing a high percentage of cells expressing myofibroblast/smooth muscle markers and a low percentage of cells expressing endothelial markers. Average ± SD was shown. (j) Violin plots showing the expression levels of additional lineage markers. Related to Fig. 2c. (k) Same as (j) but using cells from expt E4-E7. These experiments were performed using the redesigned Fluidigm medium chip as a repeat of expt E1-E3 (Fig. 2c), which used the original Fluidigm medium chip. (l-m) Tracking of protein expression of a myofibroblast/smooth muscle cell marker αSMA by co-staining with αMHC-GFP in CF cells under reprogramming for 5, 7, 10, 14, and 21 days. (l) Representative images under 40× with hoechst nuclear staining. Scale bar = 100 μm. (m) Quantification of αMHC-GFP+ αSMA-high/low/neg cells. Average ± SEM was shown. The results showed that as reprogramming proceeded, protein expression of Thy1, SM22α (Fig. 2d-f), and αSMA in αMHC-GFP-positive cells decreased over time, with no Thy1-/SM22α-/αSMA-high cells and ~50-60% of Thy1-/SM22α-/αSMA-negative cells on day 21 of reprogramming.
Extended Data Figure 7. Identification of regulatory pathways involved in iCM reprogramming and screening of a shRNA library against major splicing factors during iCM induction
(a-d) Three clusters of genes that were significantly related to and showed similar trends over the reprogramming process were identified by nonlinear regression (see Methods). Number of genes included in each cluster is shown in parentheses. The solid line in each plot shows the overall trend of the cluster, and the grey color indicates the 2D density of gene trends passing through each region of the plot. (b-d) GO analysis of genes in the three clusters showing GO terms with FDR < 0.05. (e, f) Screening against a shRNA library of splicing factors for key regulators of iCM reprogramming. icMEF was induced by Dox to express MGT and at the same time transduced with lentiviruses encoding shRNA targeting various splicing factors. On day 3 post transduction, knockdown efficiency was determined by qRT-PCR (e, n=6 samples from 2 independent experiments) and αMHC-GFP+ cells were quantified by flow (f, n=3 samples, data representative of three independent experiments). Average ± SD was shown. Knocking down of Ptbp1 led to the highest fold increase in percentage of αMHC-GFP+ cells compared to shNT. (g, h) Ptbp1 expression in freshly isolated CF and CM was determined by qRT-PCR (g, average ± SEM, n=8 samples from two independent experiments) or Western blotting (h). (e, g) Two-sided student’s t test. Significance: ** p<0.01, *** p<0.001.
Extended Data Figure 8. Manipulation of Ptbp1 through loss- and gain-of-function during iCM reprogramming
(a, b) Ptbp1 knock-down efficiency of different shRNA clones in d3 transduced MEF determined by qRT-PCR (a, average ± SEM, data representative of three independent experiments) or western blotting (b). shPtbp1-271 showed the highest knock-down efficiency (>97%) and was used for following experiments. (c-u) Ptbp1 was knocked-down (shPtbp1, c-p) or overexpressed (lentiviral OE-Ptbp1, q-u) in neoCF (c-h, q-u), AdCF (i-l), or AdTTF (m-p) when iCM reprogramming was induced by MGT (except in e-h, M+G+T was used as a further confirmation). After 10 days (14 days for OE-Ptbp1), expression of cardiac markers was determined by immunostaining followed by imaging and blinded quantification (e, f, i, j, m, n, r, s) or flow (c, d, g, h, k, l, o, p, t). (e, i, m, r) Representative 20× images with hoechst nuclear staining. Scale bar = 200 μm. (f, j, n, s) n = 10-20 images. Average ± SEM was shown. (c, g, k, o) Representative flow plots. Percentages of cells were shown. (d, h, l, p, t) Quantification of triplicated flow data. Average ± SD was shown. (q) Ptbp1 overexpression was verified by qRT-PCR (average ± SD). (u) Expression levels of representative cardiac (left axis, Tnnt2, Actc1 and Ryr2) and fibroblast (right axis, Col3a1) markers were determined by qRT-PCR (average ± SD). Mock: untransduced CF. Where appropriate, two-sided student’s t test or one-way ANOVA followed by Bonferroni correction (two-sided) was performed. Significance: * p<0.05, ** p<0.01, *** p<0.001, ns, lack of enough evidence for significance.
Extended Data Figure 9. Splicing re-patterning and transcriptome shift underlying shPtbp1-mediated enhancement of iCM reprogramming, and correlation of M/G/T expression and reprogramming
(a-n) Splicing analyses of d3 reprogramming cells upon Ptbp1 silencing. Related to Fig. 3j-q. (a) Number of overlapping and non-overlapping AS events identified between MGT vs lacZ and MGT+shPtbp1 vs MGT+shNT. The minimal overlap suggests that Ptbp1 knockdown caused extensive re-patterning of the splicing landscape during iCM induction. (b) Number of ES events that skip (grey) or include (red) the exon more in MGT+shPtbp1 compared to MGT+shNT. (c) GO analysis of AS genes between MGT and lacZ. (d-e) A total of 138 AS events (83 genes) between MGT+shPtbp1 and MGT+shNT were reported by rMATS to be the most significant (p value = 0, which was actually < 1e-16). (d) Top 10 events ranked by junction counts include 4 events on tropomyosin genes (Tpm1 and Tpm2). Tropomyosins are critical genes for muscle contraction and they are known to undergo extensive alternative splicing. The most studied tropomyosin AS events have been mutually exclusive exons and it is interesting to find here two ES events of Tpm1 as the top 2 AS events upon Ptbp1 knockdown during reprogramming. (e) GO analysis of these most significant AS genes. (f-i) Sashimi plots for the remaining representative AS events in the blue-labeled GO terms in Fig. 3n. The event shown in Fig. 3o was on Mbnl, which is a critical splicing factor for cardiac function that switches isoforms during heart development. The event shown in (h) was an exon skipping event on Tpm1 exon 3 (exon 2b in older literature), which was also the top event in (d). This exon 3-skipped isoform of Tpm1 (Tpm1α) is the one enriched in cardiac and striated muscle cells, regulating the assembly and functionality of actin filament for contraction. (j) Overlap of AS genes and DEGs between MGT+shPtbp1 vs MGT+shNT cells. (k) Overlap of DEGs between MGT vs lacZ and MGT+shPtbp1 vs MGT+shNT. (l) Percentage of overlapping DEGs in (k) showing the same or opposite direction of changes. 
(m-o) Based on (k), top GO terms for DEGs only between MGT and lacZ (m), overlapping DEGs (n), and DEGs only between MGT+shPtbp1 and MGT+shNT (o) were shown. (p-s) Correlation of M/G/T expression and reprogramming. Related to Fig. 4a. (p) Correlation between the total expression of M+G+T in individual cells and the SLICER-calculated reprogramming progress of each cell. Trendline and the correlation coefficient by linear regression was shown (p = 3.9e-78, α=0.05, two-sided). (q-s) Expression levels of M, G, and T in Fib, iFib, piCM, and iCM plotted as average ± SEM (q) or violin plots to show distribution (r). (s) Ratios of expression levels of M, G, and T in the four cell groups. Average ± SEM was shown. (t-u) Spearman correlation between M, G, T expression and the expression of 178 known and predicted splicing factors or 1602 additional transcription factors. Genes with correlation coefficient > 0.3 or < −0.3 with one or more of M, G, T were selected and the inter-correlation matrix of the 17 selected splicing factors (t) and the 65 selected transcription factors (u) were calculated and plotted as heatmap. The splicing factors Mbnl1 and Rbms3 are strongly anti-correlated with M, G, T’s expression and Rbm20 is the only factor that is positively correlated with M, G, T expression (p values < 1e-7 by two-sided Spearman correlation, α=0.05). In (u), two sets of genes, A and B, were found to be strongly anti-correlated with M, G, T expression and meanwhile strongly co-expressed. These genes include Id1, Id2, Id3, Tcf21, and Foxp1 (p values < 1e-15 by two-sided Spearman correlation, α=0.05) that might serve as “secondary” key factors to further trigger the activation/inhibition of downstream cascades for successful conversion from fibroblasts to iCMs.
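The selection step in (t-u), which keeps factors whose Spearman correlation with M, G, or T expression exceeds 0.3 in absolute value, can be sketched as follows. The expression vectors and factor names are invented for illustration, only one reference gene is used for brevity, and returning NaN for a constant vector is an assumption of this sketch.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; NaN when either vector has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    norm_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    if norm_x == 0 or norm_y == 0:
        return float("nan")
    return cov / (norm_x * norm_y)

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))

def select_factors(reference, candidates, threshold=0.3):
    """Keep candidates with |rho| > threshold against the reference gene."""
    selected = {}
    for name, expression in candidates.items():
        rho = spearman(reference, expression)
        if abs(rho) > threshold:  # NaN compares False and is dropped
            selected[name] = rho
    return selected

# Hypothetical expression across five cells (not real data).
reference_expression = [1, 2, 3, 4, 5]
candidate_factors = {
    "factor_up": [2, 4, 6, 8, 10],
    "factor_down": [5, 4, 3, 2, 1],
    "factor_weak": [3, 4, 1, 5, 2],
    "factor_flat": [1, 1, 1, 1, 1],
}
selected = select_factors(reference_expression, candidate_factors)
print(sorted(selected))
```

In the legend a factor is kept if it passes the threshold for one or more of M, G, and T, so a fuller version would repeat `select_factors` for each reference gene and take the union of the results.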
Extended Data Figure 10. Putative markers for iCM and piCM
ANOVA identified 7624 DEGs among Fib, iFib, piCM, and iCM. There were 954/285 candidates for negative/positive selection markers of iCM and 55 candidates for positive markers of piCM. These candidates were expressed lowest/highest in iCM and highest in piCM, respectively. No gene passed the selection criteria for negative markers of piCM. Top candidates were selected by largest fold change of expression in the cell population of interest compared to that in Fib. (a) Violin plots showing the expression of non-surface genes in top 30 candidates for negative markers of iCM. Related to Fig. 4d. (b-e) Top 30 candidates for positive selection markers of iCM (b, c) or piCM (d, e). (b, d) Fold change of gene expression in iCM/Fib (b) or piCM/Fib (d). (c, e) Violin plots of the same genes in 4 cell populations. (f) Top 30 genes showing largest expression fold change in piCM and iCM. (g-n) Effect of Cd200 knockdown (g-j) or overexpression (k-n) on iCM reprogramming. CF was untransduced (Mock), or simultaneously transduced with MGT and lentiviral shNT/shCd200 or lacZ/OE-Cd200 for 14 days. Knockdown or overexpression efficiency was verified by qRT-PCR (g, k). Average ± SD was shown. n=3 samples. Percentages of αMHC-GFP+, cTnT+, and Cx43+ cells were determined by immunostaining followed by imaging and blinded quantification (h-i, l-m) with representative 20× images in (h, l). Scale bar = 200 μm. n= 20 (i) or 10 (m) images. Average ± SEM was shown. Percentages of αMHC-GFP+, cTnT+, and double-positive cells were also quantified by flow (j, n). Average ± SD was shown. n=3 samples. Two-sided student’s t test was used. Significance: *** p<0.001, ns, lack of enough evidence for significance.
Figure 1. Single-cell RNA-seq reconstructs iCM reprogramming and identifies intermediate cell populations
(a) HC results of 454 single CFs that were mock-infected or infected with M+G+T or DsRed for 3 days, with representative gene ontology (GO) terms of the three identified gene clusters below. (b-c) PCA showing representative genes (b) or cell groups (c). (d-e) 3D trajectory constructed by SLICER showing HC/PCA cell groups (d) or pseudotime (e). (f) Free energy of the reprogramming process. (g-h) Microarray of MGT- or LacZ-transduced CFs from day 0-14 plotted in PCA (g) or heatmap (h) showing average expression of representative genes in (a-b). (i) Comparison of CCA:CCI ratio in iFib, piCM, and iCM. (j-p) Cell cycle synchronization (j-l) or immortalization (m-p) of CFs for iCM induction (see Methods). Quantification of flow analysis shown in (k,l), n=4 samples. Representative 40× images of αActinin/cTnT with hoechst shown in (n) with quantification in (o,p). n=30 images, scale bar=100 μm, error bars indicate SEM, two-sided student’s t test: * p<0.05, ** p<0.01, *** p<0.001.
Figure 2. Heterogeneity of CF and stepwise suppression of non-cardiomyocyte lineages during iCM induction
(a-b) HC (a) and PCA (b) of control CFs with representative gene expression and GO analysis of the five identified gene clusters. (c) HC calculated with control CFs (a) applied to M+G+T-transduced cells with representative gene expression. (d-f) 40× ICC images (d,e) with quantifications (f) of Thy1 and SM22α co-stained with αMHC-GFP during reprogramming. n=20 images, scale bar=100 μm, error bars indicate SEM.
Figure 3. Identification of Ptbp1 as a barrier to iCM splicing repatterning
(a-g) Six gene clusters identified along reprogramming (a) with GO analysis (b-g, false discovery rate, FDR<0.05). Number of genes shown in parentheses. (h-i) 20× ICC images of cTnT and αMHC-GFP (h) with quantification (i) of MGT-infected CFs treated with shRNA against Ptbp1 (shPtbp1) or shRNA control (shNT). n=20 images, scale bar=200 μm, error bars indicate SEM, two-sided Student’s t test: *** p<0.001. (j-q) Splicing analyses of d3 MGT-infected CFs treated with shPtbp1 or shNT. (j-k) Correlation between dPSI of CM vs CF, and dPSI of MGT vs lacZ (j) or MGT+shPtbp1 vs MGT+shNT (k). Trendline by linear regression and p from one-sided binomial test are shown. (l) Number of detected AS events among the five AS types. MEX: mutually exclusive exon. A3SS/A5SS: alternative 3′/5′ splice site. IR: intron retention. (m) Positional distribution of a Ptbp1 binding motif (sequence shown in the black square). Dashed black line indicates p=0.05. (n-o) GO analysis of AS genes between MGT+shPtbp1 and MGT+shNT (n) with representative sashimi plot (o). (p-q) Expression of overlapping genes between DEG (MGT+shPtbp1 vs MGT+shNT) and DEG (MGT vs lacZ) (p) and shPtbp1-only DEGs (q).
Figure 4. M/G/T-determined iCM reprogramming and identification of novel surface markers
(a) Correlation between M/G/T expression and SLICER pseudotime. (b) Left: correlation between Tbx5 expression and its targets with GO analysis. Right: inter-correlation of genes on left. Three sets of co-expressed genes A, B, C are shown (p<2.6e-6). (c) Top 20 potential negative selection markers for iCM. (d) Correlation between expression of the four surface markers (labeled in red in c) and reprogramming (left) and their expression in different cell groups (right violin plots). (e-f) 40× ICC images (e) with quantification (f) of Cd200 co-stained with αMHC-GFP along reprogramming. n=20 images, scale bar=100 μm, error bars indicate SEM, linear regression reports p<1e-41 (a), and p<1e-39 (d), α=0.05, two-sided.
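As a rough illustration of the kind of linear regression reported in panels (a) and (d), the sketch below fits expression against pseudotime by ordinary least squares and returns the Pearson correlation. It is an assumption-laden toy, not the authors' analysis: the function name `linear_fit` and the data values are hypothetical, and no p-value is computed here (that would require a t-distribution, for example via `scipy.stats.linregress`).

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares fit of y on x.

    Returns (slope, intercept, pearson_r). Assumes x and y have the same
    length and that neither is constant.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    pearson_r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, pearson_r


# Made-up values: one cell per entry, expression rising with pseudotime.
pseudotime = [0.0, 1.0, 2.0, 3.0, 4.0]
expression = [1.0, 3.0, 5.0, 7.0, 9.0]
slope, intercept, r = linear_fit(pseudotime, expression)
print(slope, intercept, r)
```

A positive slope with r close to 1 would correspond to a gene whose expression increases as cells progress along the trajectory; a negative slope would correspond to a candidate negative selection marker.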


Publication types

MeSH terms