Elife. 2020 Nov 10:9:e58178.
doi: 10.7554/eLife.58178.

16p11.2 microdeletion imparts transcriptional alterations in human iPSC-derived models of early neural development

Julien G Roth et al. Elife.

Abstract

Microdeletions and microduplications of the 16p11.2 chromosomal locus are associated with syndromic neurodevelopmental disorders and reciprocal physiological conditions such as macro/microcephaly and high/low body mass index. To facilitate cellular and molecular investigations into these phenotypes, 65 clones of human induced pluripotent stem cells (hiPSCs) were generated from 13 individuals with 16p11.2 copy number variations (CNVs). To ensure these cell lines were suitable for downstream mechanistic investigations, a customizable bioinformatic strategy for the detection of random integration and expression of reprogramming vectors was developed and leveraged towards identifying a subset of 'footprint'-free hiPSC clones. Transcriptomic profiling of cortical neural progenitor cells derived from these hiPSCs identified alterations in gene expression patterns which precede morphological abnormalities reported at later neurodevelopmental stages. Interpreting clinical information (available with the cell lines by request from the Simons Foundation Autism Research Initiative) together with these transcriptional data revealed disruptions in gene programs related to both nervous system function and cellular metabolism. As demonstrated by these analyses, this publicly available resource has the potential to serve as a powerful medium for probing the etiology of developmental disorders associated with 16p11.2 CNVs.

Keywords: 16p11.2; copy number variation; corticogenesis; human; iPSC; neurodevelopment; neuroscience; regenerative medicine; stem cells.

Conflict of interest statement

JR, KM, AA, VM, HG, YV, SW, CC, JF, KL, RD, TP: No competing interests declared.

Figures

Figure 1. Summary of 16p11.2 CNV clinical features and subject demographics.
See also Supplementary file 1. (A) Microdeletions and microduplications of the 16p11.2 chromosomal region are implicated in a collection of aberrant behavioral, physiological, and morphological conditions. Common conditions associated with each copy number variant are listed here. Red text indicates reciprocal phenotypes. Abbreviations: ADHD, attention-deficit/hyperactivity disorder; ASD, autism spectrum disorder; OCD, obsessive-compulsive disorder; PPD, phonological processing disorder; RELI, receptive-expressive language impairment; DCD, developmental coordination disorder; CHD, congenital heart disease; GERD, gastroesophageal reflux disease; GAD, generalized anxiety disorder; MDD, major depressive disorder. (B) A summary of age, sex, and mutation inheritance information for individuals with the 16p11.2 CNV whose fibroblasts were reprogrammed into hiPSCs. (C) Neuropsychiatric attributes in fibroblast donors. Additional neuropsychiatric information exists for each individual (see SFARI VIP database, Supplementary file 1). Dark gray boxes indicate positive diagnoses, while white boxes represent negative diagnoses.
Figure 2. Derivation and validation of 16p11.2 CNV hiPSCs.
See also Supplementary files 1, 2 and 3; Figure 2—figure supplement 1 and Figure 3—figure supplement 1. (A) Schematic of episomal reprogramming of human fibroblasts into hiPSCs. Abbreviations: hFIB, human fibroblasts; MEF, mouse embryonic fibroblasts; KoSR, KnockOut Serum containing media; IF, immunofluorescence; qRT-PCR, quantitative real-time polymerase chain reaction; SNP, single nucleotide polymorphism. (B) qPCR analysis of lateral mesoderm (HAND1), definitive endoderm (SOX17), and neuroectoderm (PAX6) marker expression following directed differentiation into each respective lineage. (C) SNP-based similarity matrix illustrating the degree of familial relatedness across a subset of hiPSC clones. Increased similarity between clones is indicated in red. Family members share a larger number of SNPs (orange) than unrelated individuals (yellow).
Figure 2—figure supplement 1. Fibroblast reprogramming and pluripotency validation.
(A) Representative bright-field image of a Day 18 reprogrammed hiPSC clone surrounded by MEFs (scale bar, 200 µm). (B) Representative bright-field image of hiPSC colonies on Day 28, after manual selection and transfer to feeder-free culture conditions (scale bar, 200 µm). Clone DEL_4_1. (C, D, E, F) All hiPSC clones expressed the pluripotency markers NANOG, OCT3/4, TRA-1-60, and TRA-2-49 (scale bars, 100 µm). Clone DEL_5_7. (G) Extinction of NANOG expression following directed differentiation into endoderm, mesoderm and ectoderm lineages. Log2 of fold change in RNA abundance (differentiated/undifferentiated).
Figure 3. hiPSCs differentiate into cortical neural lineages.
See also Supplementary file 2; Figure 4—figure supplement 1. (A) Schematic of neural differentiation of hiPSCs into cortical progenitor cells and neurons utilizing dual SMAD inhibition. Abbreviations: N3, basal neural differentiation medium; PDL-L, Poly-D-Lysine and Laminin coating; R-NPCs, radial NPCs. (B) Day 26 neural rosettes show the typical radially arrayed clusters of neural progenitor cells in brightfield micrographs. Rosettes are composed of PAX6-positive radial glia encircling an NCAD-positive, ZO-1-positive, and aPKCζ-positive apical adherens complex. Cells undergoing M-phase of mitosis, indicated by pHH3, are predominantly localized around Pericentrin-positive centrosomes at the apical end foot of radial glia (scale bars, 50 µm). Representative rosettes are shown from left to right for clones WT_8343.2, WT_8343.2, WT_8343.4, and WT_2242.5. (C) Normalized transcript expression levels of neural regionalization candidate genes generated from RNA-Seq data, ordered from rostral to caudal cell fates, followed by general neuronal and non-neural cell fates, and housekeeping genes. Sex and Genotype status are indicated on the left.
Figure 3—figure supplement 1. hiPSCs differentiate into NPCs and neurons.
(A) Flow cytometric analysis of the transition between OCT4+ pluripotent hiPSCs and early PAX6+ telencephalic neural progenitor cells. Error bars represent the SEM. Replicates were as follows: WT (3 clones from two donors), DEL (6 clones from three donors). (B) Wild-type (WT), 16p11.2 microdeletion (DEL), and 16p11.2 microduplication (DUP) Day 26 neural rosettes show the typical radially arrayed clusters of neural progenitor cells in brightfield micrographs. Clone IDs for each image: DEL (DEL_5_9, DEL_1_2, DEL_5_8, DEL_9_984), DUP (DUP_3_1, DUP_1_9, DUP_3_3, DUP_1_8), WT (WT_8343.2, WT_8343.2, WT_8343.4, WT_2242.5). (C) Day 45 immature neurons are characterized by long neurites and expression of the neuronal markers TUJ1 and NEUN (scale bar, 50 µm). Clone WT_8343.5.
Figure 4. Integration and expression of reprogramming vectors generates pronounced artifacts in the transcriptome.
See also Supplementary files 2 and 4; Figure 4—figure supplements 1 and 2. (A) PCA of variance-stabilized count data before batch correction reveals that samples cluster by integration status within the first two PCs. Axes represent the first two principal components (PC1, PC2). (B) Reprogramming factor expression from reads pseudo-aligned to the human genome or to plasmid sequences in Int- and Int+ clones. Y-axis represents estimated counts normalized by size factor. The absence of plasmid-aligned transcripts for most genes is indicated by the absence of dark gray segments for each bar (with the exception of OCT3/4). (C) Percentage of total KLF4 (K), LIN28A (L), MYCL (M), OCT3/4 (O), or SOX2 (S) counts pseudo-aligned to plasmid in Int+ clones. Y-axis represents the percentage of counts reported in (B). (D) Heatmap of gene expression represented as Z-scores for the top 100 differentially expressed genes in Int- and Int+ clones as identified with DESeq2. Counts were normalized and scaled using a variance-stabilizing transformation (VST) implemented by DESeq, with batch effect correction using limma. Integration status is visualized on the left (integration-free clones on top, light gray indicator). (E) GSEA of DESeq output identified biological functions potentially impacted by cryptic reprogramming vector integration. Individual nodes represent gene lists united by a functional annotation; node size corresponds to the number of genes in the pathway, and color reflects whether the pathway is upregulated (purple) or downregulated (blue). Only nodes with significant enrichment in our DESeq output are displayed. The number of genes shared between nodes is indicated by the thickness of their connecting lines. For ease of visualization, individual node labels have been replaced with summary labels for each cluster.
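As an illustration of the quantity plotted in panel (C), the following minimal Python sketch computes, for each reprogramming factor, the percentage of its total counts that were assigned to plasmid rather than to the human genome. The function name `plasmid_fraction` and all count values are hypothetical and are not taken from the study; the published analysis may differ in its details.

```python
def plasmid_fraction(genome_counts, plasmid_counts):
    """Return, per gene, the percent of total counts assigned to plasmid.

    Both arguments map gene symbol -> normalized count. A gene with no
    plasmid-assigned reads may be absent from plasmid_counts.
    """
    result = {}
    for gene, genome_value in genome_counts.items():
        plasmid_value = plasmid_counts.get(gene, 0.0)
        total = genome_value + plasmid_value
        # Guard against genes with no reads at all.
        result[gene] = 100.0 * plasmid_value / total if total > 0 else 0.0
    return result


# Hypothetical counts, for illustration only.
genome = {"OCT3/4": 300.0, "SOX2": 500.0, "KLF4": 0.0}
plasmid = {"OCT3/4": 100.0, "SOX2": 0.0}
percent = plasmid_fraction(genome, plasmid)
```

With these made-up inputs, only OCT3/4 shows a nonzero plasmid share, which mirrors the shape of the result described in panel (B) without reproducing its values.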
Figure 4—figure supplement 1. Presence of 16p11.2 reprogramming vector integration and transcripts.
PCR was used to detect reprogramming vector constructs in genomic DNA, and RT-PCR was used to detect mRNA transcripts from each plasmid. Many of the clones were positive for both OCT4 plasmid DNA and transcript. Some clones were also positive for KLF4 or LIN28 plasmid, but no transcripts were detected. A subset of available clones was reprogrammed using additional plasmids carrying EBNA and p53 sequences. No reprogramming vectors were detected in these lines. Plasmid 1 - pCXLE-hOCT4-shp53-F (Addgene plasmid: 27077); plasmid 2 - pCXLE-hUL (L-MYC and LIN28; Addgene plasmid: 27080); plasmid 3 - pCXLE-hSK (SOX2 and KLF4; Addgene plasmid: 27078); plasmid 4 - pCE-mp53DD (Addgene plasmid: 41856); plasmid 5 - pCXB-EBNA1 (Addgene plasmid: 41857). Abbreviations: Int, Vector integrant PCR product; Tr, Vector transcription RT-PCR product; ND, not determined; NA, not applicable.
Figure 4—figure supplement 2. Influence of OCT3/4 integration on cell phenotype and transcriptome.
(A) Total pHH3 counted in Day 26 neural rosette cultures in Int- and Int+ clones normalized by total cell count. Error bars represent the SEM. Replicates were as follows: Int- (WT: 7 clones over 21 images, DEL: 13 clones over 48 images), Int+ (WT: 1 clone over three images, DEL: 21 clones over 87 images). The means for each condition were Int-, WT: 2.43; Int-, DEL: 2.15; Int+, WT: 0.81; Int+, DEL: 3.18. Significance was determined by paired, two-tailed t-tests (p=0.0156 for DEL, 0.0485 for Int+). (B–E) The first two principal components are plotted and shaded by detection of transcripts against the OCT3/4-bearing plasmid (B), donor genotype (C), donor sex (D), and date of library preparation (E). (F) Normalized transcript expression levels of neural regionalization candidate genes generated from RNA-Seq data, as previously described in Figure 3C. hiPSCs are sub-divided into Int+ (red) and Int- (black).
Figure 5. Deletion intervals and differential expression of genes at the 16p11.2 locus.
See also Supplementary files 5 and 6; Figure 5—figure supplements 1, 2, 3 and 4. (A) PCA of variance-stabilized count data after normalization and batch correction reveals that samples cluster by 16p11.2 deletion status within the first two PCs. Axes represent the first two principal components (PC1, PC2). (B) RLG normalized counts of each transcript within the 16p11.2 interval. WT = black symbols, DEL = red symbols, transcripts not detected = gray symbols. (C) Known deletion intervals in integration-free hiPSC clones that were included in the RNA-seq analysis of differentially expressed genes within the 16p11.2 locus. NA, breakpoint information was not available from the Simons Foundation. (D) Canonical gene symbols located between chromosome 16 location 28,800,000 and 30,400,000. Transcripts that reach significance as differentially expressed between WT and DEL clones (FDR < 0.05) are indicated in red. Labels for transcripts that were below detection limits are marked in light gray. (E) Differentially expressed genes that are up- or downregulated at least 1.5-fold. Red lines represent threshold of 1.5-fold change. Genes falling within the 16p11.2 deletion region are highlighted. (F) VST-normalized and batch corrected expression for all genes across all WT clones (X-axis) and DEL clones (Y-axis). Highlighted points represent 16p11.2 region genes that were either called differentially expressed (Red) or not differentially expressed in our pipeline (Orange). (G) Heatmap of gene expression for all the differentially expressed genes identified with DESeq2. Fill values represent counts that have been normalized and scaled using a variance-stabilizing transformation implemented by DESeq, and batch effect corrected using limma and SVA. Sex of the subject is indicated on the left.
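The thresholds named in panels (D) and (E) can be sketched in Python as a simple filter: a gene is called differentially expressed when its FDR is below 0.05, and it is reported as up- or downregulated when its change is at least 1.5-fold in either direction. Combining the two criteria into one function is an assumption made for illustration; the names `call_de`, `GENE_A`, and so on, and all numeric values, are hypothetical.

```python
import math

FOLD_THRESHOLD = 1.5   # fold-change cutoff from panel (E)
FDR_THRESHOLD = 0.05   # significance cutoff from panel (D)


def call_de(results):
    """Classify genes as 'up' or 'down' from (log2 fold change, FDR) pairs.

    results maps gene symbol -> (log2_fold_change, fdr). Genes failing
    either threshold are omitted from the returned dict.
    """
    # A 1.5-fold change corresponds to |log2FC| >= log2(1.5), about 0.585.
    cutoff = math.log2(FOLD_THRESHOLD)
    calls = {}
    for gene, (log2fc, fdr) in results.items():
        if fdr < FDR_THRESHOLD and abs(log2fc) >= cutoff:
            calls[gene] = "up" if log2fc > 0 else "down"
    return calls


# Hypothetical DEL-versus-WT results, for illustration only.
example = {
    "GENE_A": (-0.9, 0.001),  # strong decrease, significant
    "GENE_B": (0.3, 0.01),    # significant but below 1.5-fold
    "GENE_C": (1.2, 0.2),     # large change but not significant
    "GENE_D": (0.7, 0.04),    # passes both thresholds
}
calls = call_de(example)
```

Working on the log2 scale keeps the cutoff symmetric, so a 1.5-fold decrease and a 1.5-fold increase are treated the same way.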
Figure 5—figure supplement 1. Validation of differentially expressed 16p11.2 interval genes.
(A) qPCR validation of the 16p11.2 interval genes KCTD13, TAOK2, MAPK3 and SEZ6L2 showing a reduction in gene expression in all the DEL lines. Data are expressed as a log2 fold change compared to the average expression of these genes across all WT hiPSC lines. (B) Concordance in DE gene fold change among four clones that have a microdeletion in 14q11.1, a microduplication in 14q11.2, and a microdeletion in 7p11.2, relative to remaining clones and relative to the average of all DEL clones combined. Each dot represents the value for a given DE gene. DE genes were ranked in order of highest fold increase to largest decrease.
Figure 5—figure supplement 2. Concordance of differentially expressed gene changes across individual clones.
Log2(DEL/WT) for normalized counts is plotted for the average across all DEL clones (first panel, red symbols) or individually for each DE gene in each clone. The second panel overlays the average change for all DEL clones (red) on top of all individual values for each DE gene in each DEL clone. Remaining panels plot each DE gene for individual DEL clones. Genes are rank ordered in all plots from largest increase to largest decrease from left to right (the same order is used in all plots). All plots are calculated as DEL values relative to average expression in the combined WT clones.
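The quantity plotted in these panels can be sketched as follows: each DEL clone's normalized count for a gene is divided by the average of that gene across the combined WT clones, the ratio is log2-transformed, and genes are then ordered from largest increase to largest decrease. This is a minimal Python sketch under the assumption that all counts are positive (no pseudocount is applied); the function names, gene labels `G1` to `G3`, and values are hypothetical.

```python
import math


def log2_ratio_vs_wt(del_counts, wt_counts_by_clone):
    """Return log2(DEL / mean WT) per gene for a single DEL clone.

    del_counts maps gene -> normalized count in one DEL clone.
    wt_counts_by_clone is a list of such dicts, one per WT clone.
    Assumes every count is strictly positive.
    """
    ratios = {}
    for gene, del_value in del_counts.items():
        wt_values = [clone[gene] for clone in wt_counts_by_clone]
        wt_mean = sum(wt_values) / len(wt_values)
        ratios[gene] = math.log2(del_value / wt_mean)
    return ratios


def rank_genes(ratios):
    """Order genes from largest increase to largest decrease."""
    return sorted(ratios, key=ratios.get, reverse=True)


# Hypothetical normalized counts, for illustration only.
wt_clones = [
    {"G1": 8.0, "G2": 4.0, "G3": 10.0},
    {"G1": 8.0, "G2": 4.0, "G3": 6.0},
]
del_clone = {"G1": 4.0, "G2": 8.0, "G3": 8.0}

ratios = log2_ratio_vs_wt(del_clone, wt_clones)
order = rank_genes(ratios)
```

To match the caption's statement that the same gene order is used in all plots, one would compute `order` once (for example, from the DEL average) and reuse it for every clone rather than re-ranking per clone.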
Figure 5—figure supplement 3. DAVID gene enrichment analysis for differentially expressed genes.
(A) Disease categories observed in the differentially expressed gene list, ordered by unadjusted p-value, and associated number of genes in each category. Categories that remain enriched following Bonferroni correction for multiple hypothesis testing are colored dark gray. (B) Gene Ontology (GO) term categories in the differentially expressed gene list, ordered by unadjusted p-value, and associated number of genes in each category.
Figure 5—figure supplement 4. Differential expression analysis with a linear mixed model to account for shared patient identity across clones.
(A) Differentially expressed genes identified using a limma/voom differential expression pipeline that are up- or downregulated at least 1.5-fold. Red lines represent threshold of 1.5-fold change. Genes falling within the 16p11.2 deletion region are highlighted. Genes which were also identified in Figure 5E are denoted by a cross. (B) Heatmap of gene expression for all the differentially expressed genes identified with the limma/voom pipeline. Fill values represent the Z-score of counts per million (CPM)-normalized and batch-corrected expression values. Sequencing Batch, Sex, Subject ID, and Genotype are indicated on the left.
Figure 6. WGCNA reveals modules of co-expressed genes in integration-free clones that correlate with patient clinical features.
See also Supplementary file 7; Figure 6—figure supplements 1, 2 and 3. (A) Heatmap of p-values assessing the significance of module-trait correlations. Values represent a scaled p-value equal to (−1 * log10(p-value)). P-values that fall outside of the significance threshold of p<0.05 are colored gray. WGCNA-produced module color labels are annotated on the X-axis, with red text indicating 20 modules with p<0.05. (B) Depiction of annotations identified as statistically significant (FDR < 0.25) in GSEA for the set of genes identified by WGCNA as the gene networks within the clinical trait-associated modules with highest significance: pink4, salmon, bisque4, and blue (modules represented in the last four columns of panel A). (C) Categories of pathways identified as upregulated among significantly trait-associated module genes by GSEA according to frequency. Enriched pathways identified by GSEA were assigned to categories based on their Gene Ontology relations. (D) Categories of pathways identified as upregulated among significantly trait-associated module genes by GSEA according to normalized enrichment score (NES). Enriched pathways identified by GSEA were assigned to categories based on their Gene Ontology relations. (E) Heatmap of scaled VST-normalized, batch-corrected expression values for genes identified as members of the pink4 module by WGCNA. Phenotype annotations are indicated on the Y-axis.
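The scaling described for panel (A) can be sketched in Python: each module-trait p-value is converted to −1 * log10(p-value), and entries that do not meet p<0.05 are masked so they can be drawn in gray. Representing masked cells as `None` is an assumption made for this sketch; the module names are reused from the caption only as labels, and the p-values are hypothetical.

```python
import math


def scale_p_values(p_values, alpha=0.05):
    """Map each p-value to -log10(p), masking non-significant entries.

    p_values maps (module, trait) -> p-value. Entries with p >= alpha
    are returned as None so a plotting layer can color them gray.
    """
    scaled = {}
    for key, p in p_values.items():
        scaled[key] = -math.log10(p) if p < alpha else None
    return scaled


# Hypothetical module-trait p-values, for illustration only.
example_p = {
    ("pink4", "Weight_Z"): 0.01,
    ("salmon", "Language"): 0.2,
}
scaled = scale_p_values(example_p)
```

On this scale larger values indicate smaller p-values, so the most significant module-trait pairs appear as the most intense cells in the heatmap.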
Figure 6—figure supplement 1. Visualization of module membership (MM) and gene-trait significance (GS) for modules with statistical significance.
Scatterplots depicting the relationship between MM and GS for member genes of modules of interest. Gene significance relative to the traits Weight_Z (left column) and Language (right column) is displayed. The MM-GS correlation and significance reported by WGCNA are depicted on each plot.
Figure 6—figure supplement 2. Individual sample outliers drive phenotype correlation in three significant modules.
(A) Heatmap of scaled VST-normalized, batch-corrected expression values for genes identified as members of the bisque4 module by WGCNA. Phenotype annotations are indicated on the left. (B) Heatmap of scaled VST-normalized, batch-corrected expression values for genes identified as members of the blue module by WGCNA. Phenotype annotations are indicated on the left. (C) Heatmap of scaled VST-normalized, batch-corrected expression values for genes identified as members of the salmon module by WGCNA. Phenotype annotations are indicated on the left.
Figure 6—figure supplement 3. Module correlation with donor clinical information.
Heatmap depicting the Pearson correlation between module eigengene and patient trait calculated by WGCNA. Individual modules are indicated on the X-axis through boxes representing their corresponding color.
