Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2021 Feb 17;12(2):159-175.e9.
doi: 10.1016/j.cels.2020.10.010. Epub 2020 Dec 30.

Cross-Comparison of Human iPSC Motor Neuron Models of Familial and Sporadic ALS Reveals Early and Convergent Transcriptomic Disease Signatures

Affiliations

Cross-Comparison of Human iPSC Motor Neuron Models of Familial and Sporadic ALS Reveals Early and Convergent Transcriptomic Disease Signatures

Ritchie Ho et al. Cell Syst. .

Abstract

Induced pluripotent stem cell (iPSC)-derived neural cultures from amyotrophic lateral sclerosis (ALS) patients can model disease phenotypes. However, heterogeneity arising from genetic and experimental variability limits their utility, impacting reproducibility and the ability to track cellular origins of pathogenesis. Here, we present methodologies using single-cell RNA sequencing (scRNA-seq) analysis to address these limitations. By repeatedly differentiating and applying scRNA-seq to motor neurons (MNs) from healthy, familial ALS, sporadic ALS, and genome-edited iPSC lines across multiple patients, batches, and platforms, we account for genetic and experimental variability toward identifying unified and reproducible ALS signatures. Combining HOX and developmental gene expression with global clustering, we anatomically classified cells into rostrocaudal, progenitor, and postmitotic identities. By relaxing statistical thresholds, we discovered genes in iPSC-MNs that were concordantly dysregulated in postmortem MNs and yielded predictive ALS markers in other human and mouse models. Our approach thus revealed early, convergent, and MN-resolved signatures of ALS.

Keywords: ALS; ELAVL3; iPSC; single-cell RNA-seq.

PubMed Disclaimer

Conflict of interest statement

Declaration of Interests The authors declare no competing interests.

Figures

Figure 1.
Figure 1.. iPSC-MN cultures recapitulate developmental gene expression patterns.
A. Immunostaining day 18 cultures from three individual subject lines for expression of MN markers ISL1 and SMI-32. B. Experiment schematic for analyzing the 18 day time course of MN differentiation. C - D. Violin plots indicate gene expression during MN differentiation. C depicts number of unique molecular identifiers (nUMI), and D depicts number of detectable genes per time point. E. tSNE of MN differentiation time course samples. F. Monocle analysis projects samples into a pseudo-time axis consistent with the order of time points. G. Histogram displays meta data for all samples profiled in this study. This data is also presented in Figure S4. H. UMI counts were summed for each gene across the expression matrix for each sample, simulating bulk RNA-seq data. Simulated bulk gene expression profiles for each sample were correlated to bulk RNA-seq gene expression profiles from fetal hindbrain and spinal cord at various Carnegie stages analyzed in de Kovel et al., 2017. For each row of pairwise correlations, the top three ranked correlations are outlined, indicating which sample from de Kovel et al., 2017 is most globally similar to each sample analyzed by single-cell RNA-seq. For each column of pairwise correlations, the median Spearman correlation was calculated and displayed along the bottom row. These data summarize for each sample from de Kovel et al., 2017 what is the frequency of global gene expression correlations from iPSC-MN cultures. The top three ranked correlations along the bottom row are outlined, indicating which sample from de Kovel et al., 2017 are most globally similar to iPSC-MNs. See also Figures S1, S2, S3, and S4, and Table S1.
Figure 2.
Figure 2.. iPSC-MN cultures globally resemble rhombomere eight and cervical spinal cord.
A. Models of cell type organization along representative hindbrain and spinal cord segments. Left, HOX gene expression patterns determine rostrocaudal axis identities, adapted and simplified from Di Bonito et al., 2013; Lippmann et al., 2015; and Philippidou and Dasen, 2013. Center, a simplified, two-dimensional schematic of the dorsoventral and mediolateral axes of a representative hemisection of a spinal cord segment adapted from Alaynick et al., 2011 and Lu et al., 2015. Each subclass of progenitors in the ventricular zone (VZ) are labeled along the dorsoventral axis, represented by the y-axis, are colored distinctly. As each of these progenitors give rise to subclasses of postmitotic neurons, they migrate laterally into distinct dorsoventral and mediolateral locations in the mantle zone (MZ), represented by the x-axis. These progenies are labeled and colored similarly to their respective progenitors. Descriptions of each cell type are indicated in Table S2B. Locations of each cell type are approximately relative to each other and are not to scale. Right, three-dimensional model of the center panel, including both hemispheres of the spinal cord. All cell types present within the hindbrain and spinal cord are registered based on HOX and lineage-specific transcription factor expression. This model is not to scale. B. Left: Unit expression matrix of HOX genes that developmentally determine hindbrain and spinal cord segment identity. R2-8: rhombomeres 2-8, Ce: cervical, Br: brachial, Th: thoracic, Lu: lumbar, Sa: sacral, Ca: caudal. Right: Expression heatmap of HOX genes in fetal hindbrain and spinal cord at various Carnegie stages analyzed in de Kovel et al., 2017. Labels below each sample column indicate the segment that each sample most resembles, based on the highest Spearman correlation to the expression patterns for each segment shown in the unit expression matrix and with Benjamini-Hochberg adjusted P-values < 0.05. C. Histogram of segment identities assigned to fetal hindbrain samples, fetal spinal cord samples, and individual cells analyzed from day 18 cultures based on the highest Spearman correlation to the expression patterns for each segment shown in the unit expression matrix. NA: not assigned. For assignment of fetal hindbrain and fetal spinal cord samples from de Kovel et al., 2017, samples with Benjamini-Hochberg adjusted P-values less than 0.05 are assigned to each class along the y-axis; all other other samples are assigned as NA. For assignment of cells in day 18 cultures, cells with Benjamini-Hochberg adjusted P-values less than 0.1 are assigned to each class along the y-axis; all other cells are assigned as NA. See also Figure S5, and Table S2.
Figure 3.
Figure 3.. Developmental gene expression profiles and global clustering classify VZ progenitor and MZ postmitotic neuronal identities.
A. Based on the expression profile of 105 developmental genes (Table S2B), individual cells from the time course of differentiation over 18 days were classified as identities belonging to the VZ, MZ, astrocyte, or not assigned (NA). B. tSNE showing global clustering of 17,531 cells from day 18 cultures across experimental batches, which by Seurat determined four clusters indicated by four colors. Significantly differentially expressed genes in each cluster relative to the other clusters were analyzed for enriched GO terms, and these are displayed below the tSNE plot. C. The same tSNE as in B with each cell colored by classification based on 105 developmental genes. D. The same tSNE as in B with each cell colored by relative expression level of each indicated gene, which were determined to be among the most significantly differentially expressed genes in clusters 1 (SOX2, TOP2A), 2 (COL3A1, TAGLN), 3 (STMN2, ONECUT2), and 4 (PHOX2A, PHOX2B). These data are also presented in Figure S3G. E. tSNE showing global clustering of 10,866 cells classified into clusters 3 and 4 in B. Seurat analysis determined 18 clusters. F. The same tSNE as in E with each cell colored red if they were classified as the indicated cell type based on 105 developmental genes. G. The same tSNE as in E with each cell colored by their reclassification into clusters formed by merging the clusters identified in E based on overlap for identities shown in F. This data is also presented in Figure 4. H. The same tSNE as in E with each cell colored by relative expression level of each indicated gene, which are regarded as distinct markers of each identity listed above each plot in F. I. Boxplots of Spearman correlation values between gene expression profiles of LC MNs from Ho et al., 2016; summed gene expression profiles across all cells for each day 18 sample (Bulk); and summed gene expression profiles across cells classified as MN in G. P-value calculated from a paired, two-tailed t-test. J. Bar graph of segment identities assigned to individual cells within each of the seven clusters based on the highest Spearman correlation to the HOX expression patterns for each segment shown in the unit expression matrix (Table S2A) and with Benjamini-Hochberg adjusted P-value < 0.1. R2-8: rhombomeres 2-8, Ce: cervical, Br: brachial, Th: thoracic, Lu: lumbar, Sa: sacral, Ca: caudal, NA: not assigned. See also Figures S4, S5, and S6, and Table S2.
Figure 4.
Figure 4.. ALS enacts distinct gene expression changes in neuronal subtypes.
A. Left to right: same tSNE as in Figure 3G, with each cell colored by ALS or control (CTR) condition, colored by experimental platform, and bar graph quantifying the number of cells assigned to each class. B. Jaccard indices of overlapping gene sets across ALS to CTR comparisons within the neuronal subtypes MN, V1 Renshaw, and V2a. Each pairwise ALS to CTR comparison is shown along left and top axes, and heatmap indicates Jaccard index for each intersection of gene sets. The number of upregulated genes in each ALS to CTR comparison is printed in red along the right axis, and the number of downregulated genes in each ALS to CTR comparison is printed in blue along the bottom axis. Upregulated gene sets were only compared to other upregulated gene sets, and downregulated gene sets were only compared to other downregulated gene sets. The red and blue heat maps represent the Jaccard indices for intersections of upregulated and downregulated gene sets, respectively. When an ALS cell line was compared to a CTR or isogenic HRE-corrected (ISO) cell line in more than one experimental batch, the Jaccard indices of these gene sets (replicate comparisons) can be compared to Jaccard indices of gene sets calculated for comparisons using the same ALS, CTR, or ISO cells lines within the same experimental batch that were not repeated in other batches (non-replicate comparisons) in C. Panels with solid squares indicate replicate comparisons (n = 7) and open squares indicate non-replicate comparisons (n = 41). These are only indicated for upregulated genes in MNs, but the same conditions were analyzed for downregulated genes and V1 Renshaw and V2a in C. C. Comparison of upregulated and downregulated genes sets among replicate comparisons and non-replicate comparisons described in B. P-values were calculated using the Wilcox test. D. Left: Differentially expressed genes between ALS and control conditions for neuronal subtypes MN, V1 Renshaw, V2a, and simulated bulk RNA-seq. Genes were categorized as upregulated or downregulated by ALS, and Chow-Ruskey diagrams present the intersection among each neuronal subtype as well as with bulk comparisons. Right: Euler diagrams present the intersection of GO terms enriched among genes that are reproducibly dysregulated by ALS uniquely within each neuronal subtype in the upregulated or downregulated categories. Representative GO terms are displayed from each overlapping or unique set. See Table S7H-M for full set of uniquely enriched GO terms. No GO terms were significantly enriched among genes upregulated or downregulated by ALS in bulk comparisons, and no GO terms were enriched among genes uniquely downregulated by ALS in V2a interneurons. See Tables S3-7 and Methods for details on how differentially expressed gene and GO lists were generated. See also Figure S6 and Tables S3, S4, S5, S6, and S7.
Figure 5.
Figure 5.. ALS iPSC-MN cultures exhibit transcriptional changes detectable in postmortem ALS spinal MNs.
A. Hypergeometric test for each of 52 modules defined from WGCNA on LC MNs (data set A) from postmortem sporadic ALS (sALS) and non-ALS subjects (Ho et al., 2016) to detect enrichment for ALS dysregulated genes from each of the neuronal subtype categories identified in Figure 4D. Upper panel: WGCNA modules and the sample traits (sALS disease status and PC1 sALS component) with which they are significantly associated outlined in black rectangles (Pearson correlation between module eigengene and sample trait, Bonferroni-adjusted P-value < 0.01). Z-summary value for each module measures the extent of module preservation between data set A and B (Krach et al., 2018) (Figure 5B). Bar graphs above and to the right indicate the number of genes represented in data set A and scRNA-seq data set, respectively. A matrix of P-values from hypergeometric tests performed for each module to neuronal subtype category comparison were adjusted by the Benjamini–Hochberg method, and subsequent P-values < 0.05 are marked as black squares in the matrix. B. As in A, except for 32 modules define from data set B (Krach et al., 2018). C. Euler diagram of intersecting gene sets among the Magenta module from data set A, the Steelblue module from data set B, and the MN upregulated in ALS genes. P-values indicate the likelihood of multi-set intersections using SuperExactTest (Wang et al., 2015). Genes are listed for overlapping sets, bolded genes are associated with the representative GO terms listed below. D. As in C, except among the Blue module from data set A (left), the Midnightblue module from data set A (right), the Darkgreen module from data set B, and the MN downregulated in ALS genes. E. Split dot plots indicate percent of MNs within each sample that express a non-zero value of each gene and also indicate the average gene expression among MNs within each sample that express non-zero values of that gene. F. Scatterplots depict gene expression in each LC MN sample from data set A (y-axis) against the coordinate for that sample along the first principal component (PC1 sALS component, x-axis). Prior to performing PCA on the expression data set, the ten genes shown in this panel were removed from the expression matrix to eliminate autocorrelation. sALS samples are colored red, and control samples are colored blue. The Spearman correlation between expression and PC1 coordinate is indicated next to the gene symbol, and the nominal P-value of the correlation is indicated below the gene symbol. G. As in F, except applied to data set B. See also Figure S6.
Figure 6.
Figure 6.. ALS iPSC-MNs exhibit early transcriptional changes counteracting homeostatic maturation and aging.
A. Euler diagram of intersecting gene sets among the Magenta, Blue, Midnightblue, or Yellow modules from data set A, the MN dysregulated genes in ALS identified in Figure 4D, and genes assigned to modules that significantly correlate with MN maturation and aging (Ho et al., 2016). The number of genes within each set is indicated. P-values indicate the likelihood of multi-set intersections using SuperExactTest (Wang et al., 2015). Genes with pathogenic variants in the ClinVar database are indicated on the right. B. Heatmap of genes listed in A demonstrating expression kinetics as tissues progress from embryonic, fetal, and adult spinal cord tissues (Ho et al., 2016). Pluripotent stem cells (PSCs) include embryonic stem cells and iPSCs.
Figure 7.
Figure 7.. Single-cell analysis of iPSC-MNs reveals predictive ALS marker genes.
A. Split dot plots indicate percent of MNs within each sample that express a non-zero value of each of six predictive marker genes. Plots also indicate the average gene expression among MNs within each sample that express non-zero values of that gene. B. Heatmap of five genes listed in A, along with NNAT demonstrating expression kinetics as tissues progress from embryonic, fetal, and adult spinal cord tissues (Ho et al., 2016). CARS2 was not annotated in this data set. Pluripotent stem cells (PSCs) include embryonic stem cells and iPSCs. C - G. Receiver-operator characteristic (ROC) analysis performed on several data sets classifying samples in each data set as ALS or non-ALS, based on sample coordinates along the first, second, or both principal components using six predictive marker genes. See methods for calculations. P-values for each area under the curve (AUC) are calculated using the Mann-Whitney U test to determine whether the AUC differs significantly from 0.5 (diagonal grey line), which indicates an uninformative test. The number of samples for each ALS and non-ALS case used in the analysis, along with their genotypes are indicated. The number of genetically distinct subjects included in the analysis are indicated in parentheses. Familial ALS subjects with pathogenic variants in C9orf72 (C9O), CHMP2B (CHM), FUS, SOD1 (SOD), TARDBP (TDP). Sporadic ALS subjects with no known pathogenic variants (SPO). Non-ALS control subjects (CTR). C9orf72 or SOD1 subject lines in which the pathogenic variant has been genome edited (ISO). Spinal muscular atrophy subjects classified as non-ALS (SMA). C. ROC analysis performed on subpopulation data generated from scRNA-seq data set based on a combined metric of average and percent expression for the six predictive marker genes. D. ROC analysis performed on LC MN expression data sets (Cox et al., 2010; Highley et al., 2014; Kirby et al., 2011; Krach et al., 2018; Rabin et al., 2010). E. ROC analysis performed on mouse LCM MNs (Nardo et al., 2013). F. ROC analysis performed on bulk RNA-seq data sets generated by the NeuroLINCS Consortium for undifferentiated iPSCs (day 0), two MN differentiation protocols at days 18 and 90, and iPSC-MNs from Fujimori et al., 2018; Kiskinis et al., and 2014; Shi et al., 2018. G. ROC analysis performed on bulk proteomics data sets generated by the NeuroLINCS Consortium for two MN differentiation protocols at days 18 and 90. ELAVL3 protein expression is used as the prediction. H - J. Representative images of lumbar spinal cord sections from control, sporadic, and C9orf72 ALS subjects immunostained for ELAVL3 and counterstained with hematoxylin. Scale bar, 2.5 mm for H, I, and J; 250 μm for inset images H’, I’, and J’. K. Comparative distributions of ELAVL3 optical densities. See also Figure S7.

Similar articles

Cited by

References

    1. Alaynick WA, Jessell TM, and Pfaff SL (2011). Snapshot: spinal cord development. Cell 146, 178–178.e1. - PMC - PubMed
    1. Anders S, and Huber W (2010). Differential expression analysis for sequence count data. Genome Biol. 11, R106. - PMC - PubMed
    1. Blondel VD, Guillaume J-L, Lambiotte R, and Lefebvre E (2008). Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp 2008, P10008.
    1. Butler A, Hoffman P, Smibert P, Papalexi E, and Satija R (2018). Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat. Biotechnol 36, 411–420. - PMC - PubMed
    1. Büttner M, Miao Z, Wolf FA, Teichmann SA, and Theis FJ (2019). A test metric for assessing single-cell RNA-seq batch correction. Nat. Methods 16, 43–49. - PubMed

Publication types