Am J Hum Genet. 2024 Aug 8;111(8):1673-1699.
doi: 10.1016/j.ajhg.2024.06.018. Epub 2024 Jul 30.

RNA variant assessment using transactivation and transdifferentiation

Emmylou C Nicolas-Martinez et al. Am J Hum Genet.

Abstract

Understanding the impact of splicing and nonsense variants on RNA is crucial for the resolution of variant classification as well as their suitability for precision medicine interventions. This is primarily enabled through RNA studies involving transcriptomics followed by targeted assays using RNA isolated from clinically accessible tissues (CATs) such as blood or skin of affected individuals. Insufficient disease gene expression in CATs does, however, pose a major barrier to RNA-based investigations, which we show is relevant to 1,436 Mendelian disease genes. We term these "silent" Mendelian genes (SMGs), the largest portion (36%) of which are associated with neurological disorders. We developed two approaches to induce SMG expression in human dermal fibroblasts (HDFs) to overcome this limitation, including CRISPR-activation-based gene transactivation and fibroblast-to-neuron transdifferentiation. Initial transactivation screens involving 40 SMGs stimulated our development of a highly multiplexed transactivation system culminating in the 6- to 90,000-fold induction of expression of 20/20 (100%) SMGs tested in HDFs. Transdifferentiation of HDFs directly to neurons led to expression of 193/516 (37.4%) of SMGs implicated in neurological disease. The magnitude and isoform diversity of SMG expression following either transactivation or transdifferentiation were comparable to clinically relevant tissues. We apply transdifferentiation and/or gene transactivation combined with short- and long-read RNA sequencing to investigate the impact that variants in USH2A, SCN1A, DMD, and PAK3 have on RNA using HDFs derived from affected individuals. Transactivation and transdifferentiation represent rapid, scalable functional genomic solutions to investigate variants impacting SMGs in the patient cell and genomic context.

Conflict of interest statement

Declaration of interests S.T.C. has no paid advisory roles to declare. S.T.C. is a volunteer member of ClinGen Expert Panels: Muscular Dystrophies and Myopathies GCEP and Limb Girdle Muscular Dystrophy VCEP. S.T.C. is named inventor of intellectual property (IP) relating to novel methods and biomarkers to identify DNA variants that alter pre-messenger RNA splicing: (1) PCT no. 2018904348 and (2) Australian Patent no. 2019379868. PCT no. 2019900836. This IP is unrelated to the data and outcomes described within this manuscript.

Figures

Graphical abstract
Figure 1
Silent Mendelian genes have restricted tissue expression and are prominently involved in neurological disorders (A) 1,436 Mendelian genes are silent. Analysis of 4,878 Mendelian disease genes (Nijmegen DG Panel 3.2.0) using minimum required sequencing depth (MRSD) identified 1,436 genes that are not sufficiently expressed in whole blood, LCLs, or HDFs for the purpose of conducting robust analysis of mRNA splicing using srRNA-seq. These genes are termed silent Mendelian genes (SMGs). (B) Large numbers of VUSs are found in SMGs. From the catalog of VUSs in ClinVar, 22.2% are found in SMGs, of which ∼30% are predicted to impact RNA processing (RP; 6.66% of all ClinVar VUSs). (C) SMGs display highly restricted tissue-specific expression. Heatmap showing the level of mRNA expression (TPM) of each of the 1,436 SMGs across 54 different tissues taken from 948 donors (data obtained from GTEx Version 8). (D–G) Phenotypes, disease categories, and biological processes associated with SMGs. (D) Most frequently associated human phenotype ontology (HPO) terms. (E) Top-ranked gene ontology (GO) biological processes (analyzed via ShinyGO 0.77 using whole-genome background, ranked by fold enrichment and false discovery rate [FDR]). (F–G) Disease types ranked based on (F) their contribution to the number of SMGs or (G) on the proportion of known associated genes that are silent. Disease gene lists referenced from PanelApp Australia (accessed December 12, 2023).
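As a quick check of the proportions reported in (B), the share of all ClinVar VUSs that fall in SMGs and are also predicted to impact RNA processing is the product of the two reported fractions. The sketch below reproduces that arithmetic only; the function and variable names are illustrative (not from the paper), and the ∼30% value is treated as exactly 0.30.

```python
def share_of_total(fraction_in_subset: float, fraction_within_subset: float) -> float:
    """Fraction of the whole that lies in a subset AND has a property within that subset."""
    return fraction_in_subset * fraction_within_subset


# Values taken from the Figure 1B caption; 0.30 is an approximation (~30%).
vus_in_smgs = 0.222          # 22.2% of ClinVar VUSs are in SMGs
rp_impact_within_smgs = 0.30  # ~30% of those are predicted to impact RNA processing

share_of_all_vus = share_of_total(vus_in_smgs, rp_impact_within_smgs)
print(f"{share_of_all_vus:.2%}")  # prints 6.66%
```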
Figure 2
SMGs are conducive to gene transactivation (A) The dCas9-ST-PH-gRNA gene transactivation system. Co-expression of three transgenes results in the assembled transactivation complex on a gene promoter consisting of (1) the enzymatically inactive dCas9 fused to a SunTag array (10 copies of GCN4 epitopes; dCas9-ST), (2) the hybrid p65 and HSF1 (heat shock factor 1) transcriptional transactivation domains (TADs) fused to a single-chain variable fragment (scFv) that recognizes the GCN4 epitope, and (3) the short gRNA, which directs the dCas9-ST-PH complex to the target gene promoter. The dCas9-ST can recruit up to ten copies of the hybrid transactivator P65-HSF1. (B) HEK293T and HDF clonal cell lines stably expressing dCas9-ST-PH. Representative images of HEK293TdCas9−ST-PH and HDFdCas9−ST-PH cell lines showing stable co-expression of transgenes encoding dCas9-ST (as reported by mCherry encoded in cis) and P65-HSF1 (as reported by EGFP encoded in cis). (C–E) Transactivation screen using single-cell transcriptomics. A pooled gRNA expression plasmid library (160 gRNAs; 4 gRNAs per gene, targeting 40 SMGs) was delivered by lentivirus to the stable HEK293TdCas9−ST-PH and HDFdCas9−ST-PH cell lines using a low multiplicity of infection to deliver ∼1 gRNA vector per cell. >20,000 cells per cell line were subjected to single-cell Perturb-seq using the 10× Genomics platform. (C) Cells expressing >6 molecules of a given gRNA species were analyzed for expression of their target gene. p < 0.05 (adjusted for multiple comparisons). (D) In general, the number of gRNAs per cell is positively associated with target gene expression and negatively associated with the number of cells analyzed. Data are pooled from all four gRNAs per gene. Dark blue lines are the number of cells, light blue lines are transcripts per million (TPM), and the x axis is gRNA expression. (E) Expression levels of the 40 targeted SMGs in single cells. Each dot represents the expression of the target gene in a single cell analyzed.
(F) Transactivation screen using bulk-cell transcriptomics. The pooled gRNA expression plasmid library was transfected into the stable HEK293TdCas9−ST-PH cells or transduced into HDFdCas9−ST-PH cells at high efficiency. Isolated RNA was subjected to srRNA-seq. Bar graph showing the transactivation of 40 SMGs (n = 4 biological replicates; p < 0.05, Genewise statistical test).
Figure 3
Robust transactivation of SMGs in HDFs (A) Comparison of transactivation levels of IL1RN and PCDH19 using single gRNAs versus a multiplex of four gRNAs. Relative gene expression analyzed via real-time qPCR with values normalized to ACTB and expressed relative to the negative control (dCas9-ST-PH-no gRNA). Error bars represent standard deviation. (B) The dCas9-ST-PH-gRNA complex was engineered across two lentiviral transgenes with fluorescent reporters. (C) Optimized transient delivery of dCas9-ST-PH-gRNA complex to HDFs. Highly efficient lentiviral co-delivery of dCas9-ST and P65-HSF1-gRNA transgenes in three control HDFs. Representative images showing co-expression of p.dCas9-ST transgene (mCherry) and p.p65-HSF-gRNA (eGFP) 72 h after transduction. (D) Co-expression of dCas9-ST-PH complex and 4 gRNAs successfully transactivates expression of many SMGs in HDFs. Bar graph showing the individual transactivation levels of 20 SMGs and IL1RN mediated by co-expression of dCas9-ST-PH complex and four gRNAs. Expression levels (TPM) generated from srRNA-seq (red) and relative gene expression generated from real-time qPCR (blue) with values normalized to ACTB and expressed relative to negative control (dCas9-ST-PH with no gRNA). Error bars represent standard deviation. Red dotted line corresponds to TPM = 5. (E) SMGs can be robustly activated across multiple experiments and HDFs. Bar graph showing the transactivation of DMD, PAK3, SCN1A, and USH2A mediated by dCas9-ST-PH-gRNA in multiple different HDF lines. The bar graph presents the mean and standard deviation from the biological replicates, with each dot representing a different cell line. Data presented are expression levels (TPM) generated from srRNA-seq (red bars) and relative gene expression generated from real-time qPCR with values normalized to ACTB and expressed relative to the negative control (dCas9-ST-PH with no gRNA; blue bars).
(F) Transactivated SMG expression levels are comparable to endogenous expression levels in clinically relevant tissues (CRTs). Violin plots show endogenous expression of a subset of SNGs in the adult cerebral cortex. The blue dots show the expression of the same genes transactivated in HDFs. Data presented are expression levels (TPM) calculated independently for cortex data accessed from GTEx (Version 8) and transactivated HDF data generated from srRNA-seq, respectively. (G) Transactivated genes in HDFs express diverse isoforms. Comparison of isoforms expressed in CRTs (extracted from GTEx Version 8) with transactivated HDFs (srRNA-seq). (H) Complex and rare splicing events are observed using transactivation. Sashimi plot displaying complex splicing patterns of PAK3 in iPSC-derived neurons recapitulated following transactivation in HDFs. Only events with read depth greater than 150 are shown. Inset highlights a rare isoform containing exons 6 and 7. All reads that map to exon 6 are shown.
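The qPCR values in this figure are described as normalized to ACTB and expressed relative to the no-gRNA negative control. The caption does not state the exact calculation, so the following is only a minimal sketch of one common approach, the 2^-ΔΔCt method, which assumes roughly 100% amplification efficiency for both primer pairs. The function name and the Ct values in the usage lines are hypothetical.

```python
def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of a target gene versus a control sample (2^-ddCt).

    Assumption: both amplicons double every cycle (about 100% efficiency).
    """
    # Normalize the target gene to the reference gene (e.g. ACTB) in each sample.
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the normalized values between sample and negative control.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold 10 cycles earlier after
# transactivation, while the reference gene is unchanged.
fold = relative_expression(ct_target_sample=24.0, ct_ref_sample=18.0,
                           ct_target_control=34.0, ct_ref_control=18.0)
print(fold)  # prints 1024.0
```

A lower Ct means more template, so a 10-cycle shift in the normalized target corresponds to a 2^10-fold increase under the stated efficiency assumption.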
Figure 4
Investigation of RNA variants in SMGs using transactivation of HDFs (A) Illustration of USH2A mRNA expression (red) in human adult tissues as reported by the Human Protein Atlas (HPA). (B) Diagram depicts the USH2A variant under investigation. Sashimi plots report USH2A mRNA splicing. Data derived from Oxford Nanopore long-read sequencing of RT-PCR amplicons (exons 13–16) produced using RNA isolated following transactivation of USH2A in HDFs derived from healthy control and affected individuals in the presence and absence of cycloheximide (CHX). Arrows in the sashimi plots specify the reads coming from the allele with the pathogenic variant and the allele with the VUS, as segregated by allelic phasing. (C) Illustration of SCN1A mRNA expression (red) in human adult tissues (HPA). (D) Diagram depicts the SCN1A variant under investigation. Sashimi plots report SCN1A mRNA splicing. Data derived from Oxford Nanopore long-read sequencing of RT-PCR amplicons (exons 13–17) produced from RNA isolated following transactivation of SCN1A in HDFs derived from healthy control and affected individuals in the presence and absence of CHX. Arrows on the sashimi plots indicate the position of the pathogenic variant. (E) Illustration of DMD mRNA expression (red) in human adult tissues (HPA). (F) Diagram depicts the DMD variant under investigation. Graphs represent relative read depth reported across DMD exons 2–5 as determined using long-read sequencing of RT-PCR amplicons produced from RNA isolated following transactivation of DMD in HDFs derived from healthy control and affected individuals in the presence and absence of CHX. Note that read depth is 1.8 times greater (∼double) in exons 3 and 4 only in samples from the affected individual and is not influenced by CHX.
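The relative read depth comparison in (F) can be illustrated with a small calculation. The caption does not describe how depth was normalized, so this sketch assumes each sample's per-exon depth is scaled to the mean depth of flanking exons (here exons 2 and 5) before the affected sample is compared with the control. All function names and depth values are hypothetical and chosen only to mirror the reported ∼1.8-fold excess in exons 3 and 4.

```python
from statistics import mean


def normalized_depth(depth_by_exon: dict, reference_exons: tuple) -> dict:
    """Scale per-exon read depth to the mean depth of the chosen reference exons."""
    baseline = mean(depth_by_exon[exon] for exon in reference_exons)
    return {exon: depth / baseline for exon, depth in depth_by_exon.items()}


def depth_ratio(affected: dict, control: dict, reference_exons: tuple) -> dict:
    """Per-exon ratio of normalized depth in the affected sample versus the control."""
    norm_affected = normalized_depth(affected, reference_exons)
    norm_control = normalized_depth(control, reference_exons)
    return {exon: norm_affected[exon] / norm_control[exon] for exon in affected}


# Hypothetical read depths keyed by exon number.
control_depth = {2: 1000, 3: 1000, 4: 1000, 5: 1000}
affected_depth = {2: 500, 3: 900, 4: 900, 5: 500}

ratios = depth_ratio(affected_depth, control_depth, reference_exons=(2, 5))
for exon, ratio in sorted(ratios.items()):
    print(f"exon {exon}: {ratio:.1f}x")
```

Normalizing within each sample first makes the comparison insensitive to differences in total sequencing yield between the affected and control libraries.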
Figure 5
Transdifferentiation of HDFs directly into iNeurons (A) 516 neurological disorder genes are silent. A comparison between silent Mendelian genes (SMGs) and a list of 3,000 neurological disorders reveals an overlap of 516 genes. These genes, herein referred to as silent neurological genes (SNGs), are not expressed at sufficient levels in CATs of blood, LCLs, or HDFs to enable analysis of mRNA splicing using srRNA-seq. (B) Schematic of the vector transgene featuring a Tet-On inducible promoter driving overexpression of NEUROG2 and ASCL1 (abbreviated as TNA). In the TNA transgene, the human ubiquitin C (hUbC) promoter drives the expression of Tet-On encoding the reverse tetracycline-controlled transactivator (rtTA). rtTA binds the TRE-tight promoter when in the presence of doxycycline, thus inducing NGN2 and ASCL1 expression. The phosphoglycerate kinase promoter (PGK) drives the constitutive expression of a puromycin-resistance cassette. The TNA transgene can be packaged into lentivirus. (C and D) Fibroblasts transduced with TNA overexpress NEUROG2 and ASCL1 in response to doxycycline treatment. RT-qPCR performed on RNA isolated from 3 control HDF lines transduced with TNA and treated with or without 2 μg/mL doxycycline (dox) for 1, 2, and 5 days: (C) ASCL1 expression and (D) NEUROG2 expression. Error bars represent standard deviation. (E) iNeurons display overt neuronal morphology and express a set of neuronal marker proteins. Immunofluorescent imaging of day 22 control iNeurons: PSA-NCAM (green), TUBB3 (red), NeuN (green), SYN1 (red), DAPI (blue). Scale bars, 50 μm. (F) Principal component analysis (PCA) of srRNA-seq. RNA was collected at day 0, 10, 20, and 26 of transdifferentiation. Experiment done in quadruplicate. Note that ∼70% of the transcriptional variance occurs by day 10 of transdifferentiation. (G) The srRNA-seq analysis reveals that iNeurons express cohorts of neuronal cell and synapse marker genes. Expression is reported as TPM (log10).
(H) The transcriptional profile of iNeurons correlates with iPSC-derived neurons. The expression of genes (>1 TPM, n = 11,119 genes) was correlated between iNeurons (day 26, n = 4) and iPSC-derived neurons (day 90 of iPSC neuronal differentiation, n = 1) using Pearson’s correlation (r = 0.805, p < 0.0001). (I and J) Differential gene expression analysis of iNeuron transdifferentiation. The srRNA-seq transdifferentiation data were used to identify differentially expressed genes between HDFs (day 0) and other time points (day 10, 20, and 26) during transdifferentiation and the overlapping genes of each comparison identified. (I) Comparison of upregulated genes. (J) Comparison of downregulated genes. (K) Gene ontology analysis of genes upregulated in iNeurons at day 26 of transdifferentiation performed using ShinyGO 0.77. The highest-ranking GO terms are reported as fold enrichment and the FDR (-log10FDR).
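The comparison in (H) filters genes by expression (>1 TPM) and then computes Pearson's correlation between two expression profiles. The sketch below shows that general pattern with a hand-written Pearson coefficient. It assumes the threshold must be met in both samples and that raw (not log-transformed) TPM values are correlated; neither detail is stated in the caption. Gene names and TPM values are made up.

```python
import math


def pearson_r(x: list, y: list) -> float:
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def correlate_expressed(tpm_a: dict, tpm_b: dict, min_tpm: float = 1.0):
    """Correlate genes exceeding min_tpm in BOTH profiles (an assumption).

    Returns (r, number_of_genes_used).
    """
    shared = [gene for gene in tpm_a
              if gene in tpm_b and tpm_a[gene] > min_tpm and tpm_b[gene] > min_tpm]
    r = pearson_r([tpm_a[g] for g in shared], [tpm_b[g] for g in shared])
    return r, len(shared)


# Hypothetical TPM profiles; GENE4 is dropped because it is below 1 TPM in one sample.
profile_a = {"GENE1": 2.0, "GENE2": 4.0, "GENE3": 6.0, "GENE4": 0.5}
profile_b = {"GENE1": 4.0, "GENE2": 8.0, "GENE3": 12.0, "GENE4": 20.0}

r, n_genes = correlate_expressed(profile_a, profile_b)
print(n_genes, round(r, 3))
```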
Figure 6
Expression of SNGs in iNeurons (A) 193 SNGs are expressed during the transdifferentiation of HDFs to iNeurons. Querying the list of differentially expressed genes identified in cells undergoing transdifferentiation at days 10, 20, and 26 reveals that 193 of the 516 SNGs are upregulated in iNeurons with expression >1 TPM in at least one time point analyzed, with 133 of these displaying TPM >5. (B–D) Expression of the 193 SNGs during transdifferentiation of HDFs to iNeurons. (B) Heatmap shows expression of individual genes reported as TPM (log10). (C) Violin plots show significant upregulation of the cohort of 193 SNGs. Expression is reported as the mean TPM from across all four replicates for each time point. Statistical analysis was determined by ordinary one-way ANOVA with Tukey’s multiple comparison test. ∗∗∗∗p < 0.0001. (D) Categorization of the 193 genes as having expression within 0–1 TPM, 1–5 TPM, 5–10 TPM, and 10+ TPM, at each time point. (E) Correlation of expression (TPM) between the 193 iNeuron-expressed SNGs at day 26 of transdifferentiation and mean expression (TPM) in the human adult frontal cortex. (F) Comparison of expression of 100 of the 193 iNeuron-expressed SNGs with range of expression observed in the adult frontal cortex samples. (G) Comparison of the isoform diversity between SNGs expressed in iNeurons to that of the adult frontal cortex. Adult frontal cortex expression data were extracted from the GTEx database Version 8.
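The categorization in (D) amounts to assigning each gene's TPM at a given time point to one of four expression bins and counting genes per bin. A minimal sketch follows; the handling of values that fall exactly on a boundary (half-open intervals, lower bound inclusive) is an assumption, since the caption does not specify it, and the gene names and TPM values are invented.

```python
from collections import Counter


def tpm_bin(tpm: float) -> str:
    """Assign a TPM value to one of the four expression bins named in the caption.

    Assumption: bins are half-open, e.g. 1-5 means 1 <= TPM < 5.
    """
    if tpm < 1:
        return "0-1"
    if tpm < 5:
        return "1-5"
    if tpm < 10:
        return "5-10"
    return "10+"


def bin_counts(tpm_by_gene: dict) -> Counter:
    """Count how many genes fall into each expression bin at one time point."""
    return Counter(tpm_bin(value) for value in tpm_by_gene.values())


# Hypothetical mean TPM values for five genes at a single time point.
day26 = {"GENE1": 0.4, "GENE2": 3.2, "GENE3": 7.5, "GENE4": 42.0, "GENE5": 12.1}
counts = bin_counts(day26)
for label in ("0-1", "1-5", "5-10", "10+"):
    print(label, counts[label])
```

Running the same function on each time point's values would give the per-bin counts needed for a stacked summary of the kind described in (D).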
Figure 7
Pathological mechanisms of PAK3 variants resolved using HDF transdifferentiation (A) Diagram of PAK3 cDNA (GenBank: NM_002578.5) and encoded protein structure. PAK3 nonsense variants are in exons 14 and 16, which encode the kinase domain and are downstream of the auto-inhibitory domain. (B) PAK3 expression is predominantly limited to the brain and pancreas (red); data from the HPA. (C) PAK3 expression is upregulated during transdifferentiation. Expression of the RAC1 family of activated kinases, PAK1, PAK2, and PAK3, during transdifferentiation of HDFs to iNeurons. Expression data extracted from srRNA-seq (see Figure 5) and expressed as the mean TPM derived from four replicates per time point. (D and E) HDFs derived from individuals with the PAK3 variants and transduced with TNA transgene overexpress NEUROG2 and ASCL1 in response to doxycycline treatment for 1, 2, and 5 days. Real-time qPCR assessment of (D) ASCL1 mRNA expression and (E) NEUROG2 mRNA expression. Expression is normalized to ACTB expression. (F) iNeurons transdifferentiated from HDFs derived from individuals with PAK3 variants display overt neuronal morphology and express neuronal marker genes. Immunofluorescent imaging of day 22 iNeurons: MAP2 (green), TUBB3 (red), PSA-NCAM (blue), SYN1 (red), DAPI (white). Scale bars, 50 μm. (G) PAK3 variant mRNA expression is reduced in iNeurons. Real-time qPCR of PAK3 expression in day 22 control (n = 3) and nonsense variant iNeurons. Expression normalized to ACTB. (H) PAK3 variant mRNA is subject to NMD. Real-time qPCR of PAK3 mRNA expression in day 22 control (n = 3) and variant iNeurons treated with or without cycloheximide (CHX) for 24 h prior to RNA collection. mRNA expression is reported as the fold change in expression of CHX-treated versus non-CHX-treated cells. Expression normalized to ACTB. (I) PAK3 variant mRNA is lowly expressed in transactivated HDFs. PAK3 was transactivated in HDFs derived from n = 3 control individuals and individuals with PAK3 variants.
Isolated RNA was subjected to real-time qPCR. Expression is normalized to ACTB. (J) PAK3 variant mRNA acquired through transactivation is subject to NMD. Real-time qPCR of PAK3 expression in transactivated HDFs treated with or without cycloheximide (CHX) for 24 h prior to collection. Expression is reported as the fold change in expression of CHX-treated versus non-CHX-treated cells. Expression normalized to ACTB. Statistical analysis was determined by ordinary one-way ANOVA with Tukey’s multiple comparison test. Significance set as ∗p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001, and ∗∗∗∗p < 0.0001. All error bars represent standard deviation.
