BMC Genomics. 2024 Sep 30;25(1):909.
doi: 10.1186/s12864-024-10741-0.

Fine mapping of RNA isoform diversity using an innovative targeted long-read RNA sequencing protocol with novel dedicated bioinformatics pipeline

Camille Aucouturier et al. BMC Genomics.

Abstract

Background: Solving the structure of mRNA transcripts is a major challenge for both research and molecular diagnostic purposes. Current approaches based on short-read RNA sequencing and RT-PCR techniques cannot fully explore the complexity of transcript structure. The emergence of third-generation long-read sequencing addresses this problem by resolving the transcript sequence directly. However, genes with low expression levels are difficult to study with the whole-transcriptome sequencing approach. To overcome this technical limitation, we propose a novel method to capture transcripts of a gene panel using a targeted enrichment approach suitable for Pacific Biosciences and Oxford Nanopore Technologies platforms.

Results: We designed a set of probes to capture transcripts of a panel of genes involved in hereditary breast and ovarian cancer syndrome. We present SOSTAR (iSofOrmS annoTAtoR), a versatile pipeline to assemble, quantify and annotate isoforms from long read sequencing using a new tool specially designed for this application. The significant enrichment of transcripts by our capture protocol, together with the SOSTAR annotation, allowed the identification of 1,231 unique transcripts within the gene panel from the eight patients sequenced. The structure of these transcripts was annotated with a resolution of one base relative to a reference transcript. All major alternative splicing events of the BRCA1 and BRCA2 genes described in the literature were found. Complex splicing events such as pseudoexons were correctly annotated. SOSTAR enabled the identification of abnormal transcripts in the positive controls. In addition, a case of unexplained inheritance in a family with a history of breast and ovarian cancer was solved by identifying an SVA retrotransposon in intron 13 of the BRCA1 gene.

Conclusions: We have validated a new protocol for the enrichment of transcripts of interest using probes adapted to the ONT and PacBio platforms. This protocol allows a complete description of the alternative structures of transcripts, the estimation of their expression and the identification of aberrant transcripts in a single experiment. This proof-of-concept opens new possibilities for RNA structure exploration in both research and molecular diagnostics.

Keywords: Automatic annotation; HBOC; Isoform assembly; Long read sequencing; RNA splicing.

Conflict of interest statement

All authors except N.S. and D.B. declare that they have no competing interests. N.S. is employed by SeqOne Genomics for the time period October 2020 to present in the context of a public-private PhD project (CIFRE fellowship #2020/0103) partnership between INSERM and SeqOne Genomics. D.B. is employed by SeqOne Genomics as Head Bioinformatics.

Figures

Fig. 1
Targeted long read RNA sequencing workflow. (A) Overview of the sequencing protocol from cell lines to isoform assembly (B) Description of the SOSTAR pipeline
Fig. 2
Overview of the results generated by targeted long read sequencing. (A) Average coverage calculated for the 28 genes in the patient cohort. Values are plotted on a logarithmic scale (B) Percentage of on-target and off-target reads for the 28 genes in the patient cohort (C) Distribution of isoform lengths assembled by the long read pipelines
Fig. 3
Comparison of splicing junctions between SRS and LRS for the whole gene panel. (A) Venn diagram of splicing junctions detected by SRS and LRS (B) Violin plot of SR mean read counts of common junctions between SRS and LRS junctions. Values are plotted on a logarithmic scale
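A comparison of this kind reduces to set operations on junction coordinates. The following is a minimal sketch, not the authors' code: it assumes each junction is identified by chromosome, donor position, acceptor position and strand, and the `Junction` type, the `compare_junctions` helper and the coordinates are hypothetical toy values.

```python
from typing import Iterable, NamedTuple


class Junction(NamedTuple):
    """Hypothetical junction key: two junctions match only if all fields agree."""
    chrom: str
    donor: int      # last exonic base before the intron
    acceptor: int   # first exonic base after the intron
    strand: str


def compare_junctions(srs: Iterable[Junction], lrs: Iterable[Junction]) -> dict:
    """Split junctions into the three regions of a two-set Venn diagram."""
    srs_set, lrs_set = set(srs), set(lrs)
    return {
        "common": srs_set & lrs_set,     # detected by both technologies
        "srs_only": srs_set - lrs_set,   # short reads only
        "lrs_only": lrs_set - srs_set,   # long reads only
    }


# Toy coordinates, invented for illustration only.
srs_junctions = [Junction("chr17", 100, 200, "-"), Junction("chr17", 300, 400, "-")]
lrs_junctions = [Junction("chr17", 100, 200, "-"), Junction("chr17", 500, 600, "-")]
venn = compare_junctions(srs_junctions, lrs_junctions)
print({region: len(members) for region, members in venn.items()})
```

Exact coordinate matching is the simplest possible criterion; a real comparison may need to tolerate small alignment offsets in long reads, which this sketch does not attempt.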
Fig. 4
Expression values of common splicing junctions between short and long read sequencing (SOSTAR isoforms). Values are plotted on a logarithmic scale. Junctions are represented by type: 5AS = alternative 5’ splice site / 3AS = alternative 3’ splice site / Physio = physiological junction / SkipEx = exon skipping
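The four junction types in this legend can be illustrated by comparing a junction with the exon boundaries of a reference transcript. The sketch below is an assumption-laden illustration, not SOSTAR's actual logic: it assumes a plus-strand transcript, exons given as 1-based inclusive (start, end) pairs in transcript order, and a hypothetical `classify_junction` helper; anything that fits none of the four types is labelled "Other".

```python
from typing import List, Tuple


def classify_junction(donor: int, acceptor: int,
                      exons: List[Tuple[int, int]]) -> str:
    """Label a junction relative to reference exons (plus strand assumed).

    donor    = last base of the upstream exon
    acceptor = first base of the downstream exon
    """
    exon_ends = {end: i for i, (_, end) in enumerate(exons)}
    exon_starts = {start: i for i, (start, _) in enumerate(exons)}
    d = exon_ends.get(donor)       # index of the exon whose end matches, if any
    a = exon_starts.get(acceptor)  # index of the exon whose start matches, if any

    if d is not None and a is not None:
        if a == d + 1:
            return "Physio"   # consecutive reference exons
        if a > d + 1:
            return "SkipEx"   # one or more reference exons skipped
        return "Other"
    if d is None and a is not None:
        return "5AS"          # known acceptor, shifted donor (5' splice site)
    if d is not None and a is None:
        return "3AS"          # known donor, shifted acceptor (3' splice site)
    return "Other"            # neither boundary matches the reference


# Toy reference transcript with three exons; coordinates are invented.
reference_exons = [(100, 200), (300, 400), (500, 600)]
labels = [classify_junction(d, a, reference_exons)
          for d, a in [(200, 300), (200, 500), (190, 300), (200, 310)]]
print(labels)
```

Because the comparison is made on exact positions, a shift of a single base at either splice site changes the label, which is consistent in spirit with annotating structures at one-base resolution against a reference transcript.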
Fig. 5
Investigation of alternative splicing events in BRCA1 and BRCA2 genes. (A) Matrix of pairwise combinations of BRCA1 splicing events within the long reads (B) Matrix of pairwise combinations of BRCA2 splicing events within the long reads (C) Circos plot of common junctions between SR and LR junctions for BRCA1 gene (D) Circos plot of common junctions between SR and LR junctions for BRCA2 gene. Values are plotted on a logarithmic scale. The * represents the splicing events previously described by Colombo et al. (2014) for the BRCA1 gene and Fackenthal et al. (2016) for the BRCA2 gene. The thickness of the line reflects the number of long reads supporting the two junctions
Fig. 6
Validation on positive controls. (A) BAM file from the LRS of the proband carrying the intronic retention in intron 15 of BRCA1 (B) RT-PCR gel electrophoresis. SM: molecular weight size marker (C) Capillary electrophoresis of the RT-PCR products (D) Isoform structures of the fragments obtained in C. Black boxes: exons; green boxes: novel exons; black thin lines: introns; red thick line: deletion; red lines: splicing junctions; purple arrows: RT-PCR primers (E) BAM file from the LRS of the proband carrying the exon 8 duplication of BRCA1 (F) RT-PCR gel electrophoresis (G) Capillary electrophoresis of the RT-PCR products (H) Isoform structures of the fragments obtained in G
Fig. 7
An unexplained hereditary case. (A) Pedigree of a family with breast and ovarian cancers. Black symbols indicate patients with breast (B) or ovarian (Ov) cancers. Ages are given as age at diagnosis for cancer patients and current age for living probands. The red ‘+’ indicates the SVA retrotransposon insertion, the red ‘-’ indicates the normal sequence at intron 13 (B) BAM alignment file, displayed in IGV, from ONT long read sequencing showing the pseudoexon and the SVA retrotransposon insertion in intron 13 of the BRCA1 gene, with the detailed schematic structure of the aberrant isoform (C) RT-PCR gel electrophoresis showing an insertion of approximately 1000 bp in cancer probands (III.1; III.2) compared to two controls (T-)
