Genome Biol. 2025 Sep 23;26(1):298.
doi: 10.1186/s13059-025-03724-1.

Long-read sequencing reveals the RNA isoform repertoire of neuropsychiatric risk genes in human brain

Ricardo De Paoli-Iseppi et al. Genome Biol. 2025.

Abstract

Background: Neuropsychiatric disorders are highly complex conditions and the risk of developing a disorder has been tied to hundreds of genomic variants that alter the expression and/or RNA isoforms made by risk genes. However, how these genes contribute to disease risk and onset through altered expression and RNA splicing is not well understood.

Results: Combining our new bioinformatic pipeline IsoLamp with nanopore long-read amplicon sequencing, we deeply profile the RNA isoform repertoire of 31 high-confidence neuropsychiatric disorder risk genes in human brain. We show that most risk genes are more complex than previously reported, identifying 363 novel isoforms and 28 novel exons, including isoforms that alter protein domains, and genes such as ATG13 and GATAD2A where most expression was from previously undiscovered isoforms. The greatest isoform diversity is detected in the schizophrenia risk gene ITIH4. Mass spectrometry of brain protein isolates confirms translation of a novel exon skipping event in ITIH4, suggesting a new regulatory mechanism for this gene in the brain.

Conclusions: Our results emphasize the widespread presence of previously undetected RNA and protein isoforms in the human brain and provide an effective approach to address this knowledge gap. Uncovering the isoform repertoire of candidate neuropsychiatric risk genes will underpin future analyses of the functional impact these isoforms have on neuropsychiatric disorders, enabling the translation of genomic findings into a pathophysiological understanding of disease.

Keywords: Brain; Isoform; Long-read; Nanopore; Neuropsychiatric; RNA; Splicing.

Conflict of interest statement

Declarations.

Ethics approval and consent to participate: Healthy control post-mortem human brain samples were obtained from six consented individuals collected by the Victorian Brain Bank (VBB) and the Human Research Ethics Committee of the University of Melbourne gave ethical approval for this work: #12457 and #28304.

Consent for publication: The VBB obtained signed consent for whole-brain donation from either the donor or their next-of-kin in which the signed person states: “I agree that research data gathered from studies may be published providing the donor cannot be identified.” All samples mentioned in this study have been de-identified and, except VBB pathologist CM, authors were blinded to any other individual details beyond those mentioned in the methods.

Competing interests: RDP, YP, YY, JG, and MBC have received financial support from Oxford Nanopore Technologies (ONT) to present their findings at scientific conferences. ONT played no role in study design, execution, analysis or publication.

Figures

Fig. 1
A RNA isoform sequencing of human post-mortem brain. RNA was isolated from frontal cortical regions, caudate (CAUD), and cerebellum (CBM) and converted to cDNA. The coding sequence (black boxes) was amplified using specific forward (Fwd, yellow arrows) and reverse (Rvr, pink arrow) primers generally designed in the 5’ and 3’ UTRs (grey boxes) to capture as many isoforms as possible. An Oxford Nanopore Technologies (ONT) adaptor sequence (blue box) was incorporated into each primer for sample multiplexing. Samples were then barcoded and pooled to create a single library for long-read sequencing on a GridION. Key: Brodmann Area (BA), barcode (BC). B Isoform discovery with long-read amplicon sequencing (IsoLamp) workflow. A gene-specific parameters file (containing chromosome and primer coordinates) was used to align long reads from each sample (B1-3) against the reference genome (black box) using Minimap2. Known and novel RNA isoforms were identified using Bambu. Identified isoforms were then filtered (grey isoforms) to remove: (1) those not overlapping forward (yellow line) and reverse (pink line) primer positions, ensuring full-length isoform discovery; (2) lowly expressed isoforms (blue on heatmap, indicated by dashed red line), which do not meet an expression threshold in a specified proportion of samples (both settings user-defined). Filtered known and novel isoforms were then annotated and quantified, and IsoLamp results files were generated.
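The two filtering steps described in panel B can be sketched in Python as follows. This is a minimal illustration and not the IsoLamp implementation: the `Isoform` class, the `overlaps` and `filter_isoforms` functions, the use of TPM as the expression measure, and all parameter names are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Isoform:
    """Hypothetical isoform record; not an IsoLamp or Bambu data structure."""
    name: str
    start: int        # leftmost genomic coordinate of the isoform
    end: int          # rightmost genomic coordinate of the isoform
    tpm: List[float]  # one expression value per sample (TPM is an assumption)


def overlaps(start: int, end: int, region: Tuple[int, int]) -> bool:
    """Return True if the closed interval [start, end] overlaps the region."""
    return start <= region[1] and region[0] <= end


def filter_isoforms(isoforms, fwd_primer, rev_primer, min_tpm, min_sample_fraction):
    """Keep isoforms that span both primers and pass the expression filter."""
    kept = []
    for iso in isoforms:
        # Filter 1: the isoform must overlap both the forward and reverse
        # primer positions, so that only full-length amplicons remain.
        if not (overlaps(iso.start, iso.end, fwd_primer)
                and overlaps(iso.start, iso.end, rev_primer)):
            continue
        # Filter 2: the isoform must reach the expression threshold in at
        # least the requested proportion of samples (both user-defined).
        passing = sum(1 for value in iso.tpm if value >= min_tpm)
        if iso.tpm and passing / len(iso.tpm) >= min_sample_fraction:
            kept.append(iso)
    return kept
```

Calling `filter_isoforms(isoforms, (100, 120), (880, 900), min_tpm=5.0, min_sample_fraction=0.5)` would keep only isoforms that reach both primer sites and have at least 5 TPM in half of the samples; the coordinates and thresholds here are arbitrary.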
Fig. 2
Benchmarking IsoLamp using spike-in SIRVs. A Precision recall of each tested pipeline with the Complete (N = 69), Insufficient (N = 43) or Over-annotated (N = 100) SIRV references. IsoLamp (black) returned high-quality isoforms from amplicon data of both SIRV5 and 6. B SIRV5 and 6 Spearman correlations between known and observed expression values for each bioinformatic tool using the SIRV complete (green), insufficient (pink) and over (blue) reference annotations
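For readers unfamiliar with the metrics in panel A, precision and recall against a known isoform set can be computed as in the sketch below. The function name and the idea of matching isoforms by a plain identifier are assumptions for illustration; the caption does not state how predicted isoforms were matched to SIRV references.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) for predicted isoforms versus a known set.

    Isoforms are compared by identifier equality, which is a simplification;
    a real benchmark would typically compare exon or intron structures.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    # Precision: fraction of reported isoforms that are truly present.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of truly present isoforms that were reported.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall
```

A pipeline that reports many spurious isoforms lowers precision, while one that misses true isoforms lowers recall, which is why the two are reported together.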
Fig. 3
Selection of high-confidence candidate MHD risk genes for amplicon sequencing. A Risk genes included in this study classified by the disorder for which they have the highest evidence of association. B Sequenced genes and their evidence levels for each MHD. The evidence count was calculated as the sum of independent analysis types, for example: GWAS; MAGMA; TWAS; SMR; DNA methylation; fine mapping; protein–protein interaction; and targeted validation studies, that supported gene involvement in risk for a particular disorder. #Re-sequencing of a gene from a previous study [36]
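The evidence count in panel B, a sum of independent analysis types per gene and disorder, can be illustrated with a short Python sketch. The record layout and function name are hypothetical, and the gene and disorder labels in any usage are placeholders rather than data from the study.

```python
from collections import defaultdict


def evidence_counts(records):
    """Count distinct analysis types supporting each (gene, disorder) pair.

    `records` is an iterable of (gene, disorder, analysis_type) tuples.
    Repeated reports of the same analysis type are counted once, since the
    caption describes a sum over independent analysis types.
    """
    types_seen = defaultdict(set)
    for gene, disorder, analysis_type in records:
        types_seen[(gene, disorder)].add(analysis_type)
    return {pair: len(types) for pair, types in types_seen.items()}
```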
Fig. 4
A The total number of known and novel isoforms identified across 31 risk genes. SQANTI structural categories are known/full splice match (FSM), novel in catalogue (NIC) and novel not in catalogue (NNC). B Proportion of reads (transcript per million, TPM) for each gene as classified by the SQANTI sub-category. C Count of predicted outcomes for novel isoform subcategories. Expasy [48] was used to examine the open reading frame (ORF) of novel isoforms (SQANTI structural category: novel not in catalogue (NNC) or novel in catalogue (NIC)) using the canonical start and stop as a reference. Predictions were categorized as coding if the ORF was retained, nonsense mediated decay (NMD) if a premature termination codon was present and not within 50 nt of the final exon junction, or unknown (unkn) if there was not enough information. Novel isoform SQANTI subcategories (subcat) are at least one novel splice site (ALO), intron retention (IR) and combination of known junctions (COJ) or splice sites (COS)
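The classification rule in panel C can be written as a small decision function. This is a sketch under stated assumptions, not the authors' procedure: the function name and arguments are hypothetical, the distance is assumed to be measured in nucleotides upstream of the final exon junction, and cases the caption does not spell out (for example, a premature termination codon within 50 nt of that junction) are assigned to "unkn" here.

```python
from typing import Optional

NMD_DISTANCE_NT = 50  # distance threshold quoted in the caption


def classify_novel_isoform(orf_retained: Optional[bool],
                           ptc_upstream_distance_nt: Optional[int]) -> str:
    """Classify a novel isoform as 'coding', 'NMD' or 'unkn'.

    orf_retained: True if the ORF is retained relative to the canonical
        start and stop, False if not, None if it could not be assessed.
    ptc_upstream_distance_nt: distance (nt) from a premature termination
        codon to the final exon junction, or None if no PTC was found.
    """
    if orf_retained is True:
        return "coding"
    # A PTC that is not within 50 nt of the final exon junction is
    # predicted to trigger nonsense-mediated decay.
    if (ptc_upstream_distance_nt is not None
            and ptc_upstream_distance_nt > NMD_DISTANCE_NT):
        return "NMD"
    # Assumption: anything else lacks enough information to call.
    return "unkn"
```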
Fig. 5
Highly abundant novel isoforms and the predicted mRNA outcome. A, C, E mRNA splice graphs. Dark and light grey boxes indicate 5’ and 3’ UTR and coding exons respectively. Numbers indicate the coding exon of interest. Orange arcs (pre-mRNA) and boxes (mRNA) indicate novel splicing events. mRNA isoforms depict known isoforms (ENST) against novel (Tx) isoforms; the symbol “..” indicates abbreviated zeroes. B, D IsoVis visualization of isoform structures (center stack) and expression levels (heatmap). Canonical isoform shown at top of stack including exonic mapping of protein domains (purple) and disordered regions (green). A Splice graph of ATG13 highlights the open reading frame (ORF) preserving skipping event of canonical exon 12. B High expression of ATG13 novel transcript 26 (Tx26). C Splice graphs highlighting novel changes in CSMD1 novel transcript 26 (Tx26) and 33 (Tx33) within highlighted pink regions. The ORF retaining skipping event of canonical exon 65 may disrupt a known glycosylation site (black bar), a sushi domain extending from exon 64 (blue) and part of an extracellular domain (green). Tx33 contains a novel splice donor (−8 nt) within exon 21 leading to a premature termination codon (PTC) in exon 22. Dashed lines indicate continuation of the transcript to 5’ or 3’ coding exons. D Relatively high expression of CSMD1 novel transcripts 26 and 33. E GATAD2A novel transcript Tx17 contained a novel, ORF retaining, skipping event of canonical exon 10 which contains a phosphorylation site (black bar), part of a polar biased region (blue), and overlaps a CpG island (< 300 bp, green). Dashed lines indicate alternative splicing of 5’ UTR exons.
Fig. 6
ITIH4 canonical and novel isoform protein structure predictions. A IsoVis stack of the top seven ITIH4 isoforms sorted by expression. Several novel isoforms contained the novel exon 22 skipping event (red box) including Tx71 and Tx94. B Canonical isoform (ENST00000266041, UniProt:Q14624) structure prediction indicating 70 kDa (orange) and 35 kDa (blue and pink) chains. C Novel isoform (Tx71) structure prediction indicating a 70 kDa chain (orange), truncated 35 kDa chain (blue), O-glycosylation sites (green), novel splice junction peptide detected using mass spectrometry (red), and novel peptides (pink). Black arrow indicates termination < 50 nt from the final exon junction complex. D AlphaFold per-residue confidence scores (pLDDT) (0–100) for ITIH4 novel transcript 71: very high (> 90, blue), confident (90–70, light-blue), low (70–50, yellow), and very low (< 50, orange).
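The pLDDT colour bands in panel D map a per-residue score to a confidence label. A minimal Python sketch of that mapping follows; the function name is hypothetical, and the handling of scores that fall exactly on a boundary (50, 70, 90) is an assumption, since the caption does not specify it.

```python
def plddt_band(score: float) -> str:
    """Map an AlphaFold pLDDT score (0-100) to the caption's confidence band."""
    if not 0 <= score <= 100:
        raise ValueError("pLDDT scores lie between 0 and 100")
    if score > 90:
        return "very high"   # blue in the figure
    if score >= 70:
        return "confident"   # light blue; boundary handling is assumed
    if score >= 50:
        return "low"         # yellow; boundary handling is assumed
    return "very low"        # orange
```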
Fig. 7
NEGR1 splice isoforms in human brain. A NEGR1 mRNA splice graph highlighting validated novel exons 6a and 6b. Dark and light grey boxes indicate 5’ and 3’ UTR and coding exons respectively. Numbers indicate the coding exon of interest. Orange arcs (pre-mRNA) and boxes (mRNA) indicate novel splicing events/exons. mRNA isoforms depict known isoforms (ENST) against novel (Tx) isoforms; the symbol “..” indicates abbreviated zeroes. In the open reading frame (ORF) retaining Tx1, a GPI anchor (red line) is shown to shift 3’ in the final exon when compared to ENST00000357731. Tx2 encodes a premature termination codon (PTC) within the novel exon. B IsoVis visualization of NEGR1 isoform structures (center stack) and expression levels grouped by brain region (heatmap). Canonical isoform shown at top of stack including exonic mapping of a 5’ signal peptide (brown) and three immunoglobulin (Ig)-like domains (purple). Canonical 3’ UTR has been trimmed.
Fig. 8
Brain region enriched expression of novel isoforms. A DCC novel transcript 9 had significantly higher TPM in CBM. ANOVA: F = 9.825, DF = 34. B, C DOC2A novel transcripts. Tx8 (B) had significantly higher TPM in CBM. One caudate sample recorded no expression and has been truncated from the plot. ANOVA: F = 1.893, DF = 34. Tx53 (C) had significantly higher TPM in caudate. ANOVA: F = 21.03, DF = 34. Brodmann’s Area (BA), caudate (CAUD), cerebellum (CBM), and temporal cortex (TCX). Ordinary one-way ANOVA Tukey’s multiple comparison adjusted P value: ** = P ≤ 0.01, *** = P ≤ 0.001, **** = P ≤ 0.0001

References

    1. Pan Q, Shai O, Lee LJ, Frey BJ, Blencowe BJ. Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nat Genet. 2008;40:1413–5.
    2. Kelemen O, Convertini P, Zhang Z, Wen Y, Shen M, Falaleeva M, Stamm S. Function of alternative splicing. Gene. 2013;514:1–30.
    3. Nilsen TW, Graveley BR. Expansion of the eukaryotic proteome by alternative splicing. Nature. 2010;463:457–63.
    4. Leung SK, Jeffries AR, Castanho I, Jordan BT, Moore K, Davies JP, Dempster EL, Bray NJ, O’Neill P, Tseng E, et al. Full-length transcript sequencing of human and mouse cerebral cortex identifies widespread isoform diversity and alternative splicing. Cell Reports. 2021;37:110022.
    5. Mazin P, Xiong J, Liu X, Yan Z, Zhang X, Li M, He L, Somel M, Yuan Y, Phoebe Chen YP, et al. Widespread splicing changes in human brain development and aging. Mol Syst Biol. 2013;9:633.