Cancers (Basel). 2021 Jul 3;13(13):3341.
doi: 10.3390/cancers13133341.

Role of Splicing Regulatory Elements and In Silico Tools Usage in the Identification of Deep Intronic Splicing Variants in Hereditary Breast/Ovarian Cancer Genes

Alejandro Moles-Fernández et al. Cancers (Basel).

Abstract

The contribution of deep intronic splice-altering variants to hereditary breast and ovarian cancer (HBOC) is unknown. Current in silico tools to predict spliceogenic variants leading to pseudoexons have limited efficiency. We assessed the performance of the SpliceAI tool combined with ESRseq scores to identify spliceogenic deep intronic variants that affect cryptic splice sites or splicing regulatory elements (SREs), using literature and experimental datasets. Our results with 233 published deep intronic variants showed that SpliceAI, with a 0.05 threshold, predicts spliceogenic deep intronic variants affecting cryptic splice sites, but is less effective in detecting those affecting SREs. Next, we characterized the SRE profiles using ESRseq, showing that pseudoexons are significantly enriched in SRE-enhancers compared to adjacent intronic regions. Although the combination of SpliceAI with ESRseq scores (considering ∆ESRseq and SRE landscape) showed higher sensitivity, the global performance did not improve because of the higher number of false positives. The combination of both tools was tested in a tumor RNA dataset with 207 intronic variants disrupting splicing, showing a sensitivity of 86%. Following the pipeline, five spliceogenic deep intronic variants were experimentally identified from 33 variants in HBOC genes. Overall, our results provide a framework to detect deep intronic variants disrupting splicing.
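The trade-off described in the abstract (higher sensitivity but more false positives when ESRseq information is added) can be illustrated with a minimal sketch. Only the 0.05 SpliceAI threshold comes from the abstract. The `Variant` fields, the boolean `esr_flag`, the OR-style combination rule, and the toy data are all assumptions made for illustration; the abstract does not state how the authors combined the two tools.

```python
from dataclasses import dataclass

SPLICEAI_THRESHOLD = 0.05  # threshold reported in the abstract


@dataclass
class Variant:
    name: str
    spliceai_max_delta: float  # assumed input: highest SpliceAI delta score
    esr_flag: bool             # hypothetical: ESRseq criteria suggest an SRE change
    spliceogenic: bool         # experimentally observed effect (ground truth)


def predict_spliceai(v: Variant) -> bool:
    # Whether the comparison is ">=" or ">" is an assumption.
    return v.spliceai_max_delta >= SPLICEAI_THRESHOLD


def predict_combined(v: Variant) -> bool:
    # Assumed rule: call a variant if either tool flags it.
    return predict_spliceai(v) or v.esr_flag


def sensitivity_specificity(variants, predict):
    """Return (sensitivity, specificity) of `predict` against ground truth."""
    tp = sum(1 for v in variants if v.spliceogenic and predict(v))
    fn = sum(1 for v in variants if v.spliceogenic and not predict(v))
    tn = sum(1 for v in variants if not v.spliceogenic and not predict(v))
    fp = sum(1 for v in variants if not v.spliceogenic and predict(v))
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Toy data, invented for illustration only (not from the study).
toy = [
    Variant("v1", 0.40, False, True),
    Variant("v2", 0.01, True, True),
    Variant("v3", 0.02, True, False),
    Variant("v4", 0.00, False, False),
]

print(sensitivity_specificity(toy, predict_spliceai))
print(sensitivity_specificity(toy, predict_combined))
```

In this toy set the combined rule recovers the variant missed by SpliceAI alone, at the cost of one extra false positive, which mirrors the qualitative pattern the abstract reports.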

Keywords: cryptic splice sites; hereditary breast ovarian cancer; in silico prediction tools; pseudoexons; spliceogenic deep intronic variants; splicing regulatory elements.

Conflict of interest statement

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Figures

Figure 1
Splicing effects caused by deep intronic variants. (A) Normal splicing using natural splicing sites. (B) Deep intronic variant creating/enhancing a cryptic splice site, resulting in the inclusion of a pseudoexon by using a complementary cryptic site. (C) Intronic retention caused by a deep intronic variant that creates/enhances a cryptic site, which is used instead of the canonical splice site. (D) Deep intronic variant creating/enhancing an ISE, resulting in the inclusion of a cryptic exon using two cryptic splice sites. (E) Deep intronic variant disrupting an ISS, resulting in the inclusion of a cryptic exon using two cryptic splice sites.
Figure 2
Definition of SRE abundance in different genomic regions, using the normalized SRE area calculated with ESRseq scores. (A) Normalized SRE area using ESRseq scores of exonic and adjacent 100 intronic nucleotides located upstream and downstream of canonical exons of HBOC and Lynch genes. Significant differences were identified between exons and intronic regions (pair-wise significance levels calculated by Tukey test, **** p-value < 0.0001). (B) Normalized SRE area using ESRseq scores of pseudoexons and adjacent 100 intronic nucleotides located upstream and downstream of pseudoexons listed in the literature dataset. Significant differences were identified between pseudoexons and intronic regions (pair-wise significance levels calculated by Tukey test, **** p-value < 0.0001). (C) Comparison of the exon-intron difference of normalized SRE areas between canonical exons and pseudoexons. First, the mean of the normalized SRE area of the adjacent donor and acceptor site intronic regions was calculated. Then, this mean was subtracted from the exon and pseudoexon normalized SRE area value. The difference between exonic and intronic regions in canonical exons was higher than in the case of pseudoexons, suggesting that they are more defined by an SRE balance (t-test, **** p-value < 0.0001). Mean ± standard deviation is represented in each graph. Intron. ACC: intronic sequence adjacent to acceptor site; Intron. DON: intronic sequence adjacent to donor site.
Figure 3
Comparison of the absolute difference of normalized SRE area between the 100 nucleotides before and after each deep intronic variant compiled from the literature. Absolute values of the normalized SRE area difference between the 100 nucleotides upstream and downstream of each variant were used to compare spliceogenic variants with those without any effect. Splicing variants (SPL altering) showed higher differences between upstream and downstream sequences (t-test, ** p-value ≤ 0.01). Mean ± standard deviation is represented.
Figure 4
Characterization of spliceogenic variants in patients’ RNA. For each variant, the panel shows a graphical representation of the RT-PCR assay, the results of capillary electrophoresis of 6-FAM labelled amplicons, and Sanger sequencing confirming the expression of additional transcripts. (A) The ATM c.1899-123A > G variant activates a cryptic donor site, which is used to yield three different pseudoexons: ▼12A.1, ▼12A.2, and ▼12A.3, each generated as a result of the usage of a different cryptic acceptor site (c.1899-174, c.1899-177, and c.1899-213) together with the cryptic donor site created by the variant. The ▼12A.1 and ▼12A.2 transcripts were equally expressed, and their abundance was greater than that of ▼12A.3. (B) The ATM c.2466 + 1552G > C variant generates the additional transcript ▼16A. This pseudoexon comprises the nucleotides between the acceptor site created by the variant and the cryptic donor at c.2466 + 1650. (C) The ATM c.8850 + 2029A > G variant presents an additional transcript (▼61A), spanning from the cryptic acceptor site created by the variant to the cryptic donor at c.8850 + 2131. It was not possible to clearly read the sequence of the aberrant transcript because of its low expression levels, but in the Sanger sequence, we could detect the additional transcript with the insertion since it is marked by the FAM signal at the end of the fragment. (D) The FAM175A c.476 + 156G > T variant leads to the inclusion of a pseudoexon (▼5A), which results from the usage of the cryptic acceptor site activated by the variant and the cryptic donor at c.476 + 252. Its abundance was very low, but the transcript with the insertion was also detected in the Sanger sequence since it is marked by the FAM signal at the end of the fragment. (E) The MUTYH c.998-27G > A variant creates/enhances a cryptic acceptor site, which is used instead of the natural acceptor site of exon 12, generating an intronic retention (▼12A transcript).
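The two area comparisons described in the captions of Figures 2C and 3 reduce to simple arithmetic once the normalized SRE areas are available. The sketch below assumes those areas have already been computed from ESRseq scores (the normalization itself is not specified in this record); the function and parameter names are hypothetical.

```python
def exon_intron_difference(exon_area: float,
                           intron_acc_area: float,
                           intron_don_area: float) -> float:
    """Figure 2C: (pseudo)exon area minus the mean of the two flanking
    intronic areas (acceptor side and donor side)."""
    intron_mean = (intron_acc_area + intron_don_area) / 2
    return exon_area - intron_mean


def flanking_area_difference(upstream_area: float,
                             downstream_area: float) -> float:
    """Figure 3: absolute difference between the normalized SRE areas of
    the 100 nucleotides upstream and downstream of a variant."""
    return abs(upstream_area - downstream_area)


# Illustrative values only; they are not taken from the study.
print(exon_intron_difference(0.5, -0.25, -0.75))
print(flanking_area_difference(-0.25, 0.5))
```

A larger value from the first function indicates a (pseudo)exon that stands out more strongly from its intronic surroundings; a larger value from the second indicates a sharper change in SRE landscape at the variant position.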
