Proc Natl Acad Sci U S A. 2022 Nov;119(44):e2211194119.
doi: 10.1073/pnas.2211194119. Epub 2022 Oct 28.

Genome-wide detection of human variants that disrupt intronic branchpoints

Peng Zhang et al. Proc Natl Acad Sci U S A. 2022 Nov.

Abstract

Pre-messenger RNA splicing is initiated with the recognition of a single-nucleotide intronic branchpoint (BP) within a BP motif by spliceosome elements. Forty-eight rare variants in 43 human genes have been reported to alter splicing and cause disease by disrupting BP. However, until now, no computational approach was available to efficiently detect such variants in massively parallel sequencing data. We established a comprehensive human genome-wide BP database by integrating existing BP data and generating new BP data from RNA sequencing of lariat debranching enzyme DBR1-mutated patients and from machine-learning predictions. We characterized multiple features of BP in major and minor introns and found that BP and BP-2 (two nucleotides upstream of BP) positions exhibit a lower rate of variation in human populations and higher evolutionary conservation than the intronic background, while being comparable to the exonic background. We developed BPHunter as a genome-wide computational approach to systematically and efficiently detect intronic variants that may disrupt BP recognition. BPHunter retrospectively identified 40 of the 48 known pathogenic BP variants, from which we summarized a strategy for prioritizing BP variant candidates. The remaining eight variants all create AG-dinucleotides between the BP and acceptor site, which is the likely reason for missplicing. We demonstrated the practical utility of BPHunter prospectively by using it to identify a novel germline heterozygous BP variant of STAT2 in a patient with critical COVID-19 pneumonia and a novel somatic intronic 59-nucleotide deletion of ITPKB in a lymphoma patient, both of which were validated experimentally. BPHunter is publicly available from https://hgidsoft.rockefeller.edu/BPHunter and https://github.com/casanova-lab/BPHunter.

Keywords: branchpoint; disease genetics; intronic variant; software; splicing.
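
The abstract notes that eight known pathogenic variants create AG-dinucleotides between the BP and the acceptor site. A minimal sketch of that check for a single-nucleotide substitution follows. It is an illustration only, not BPHunter's implementation: the function names, the 0-based `offset` convention, and the assumption that `region` is the transcript-strand sequence lying strictly between the BP and the acceptor AG are all hypothetical.

```python
def ag_positions(seq):
    """Return the set of 0-based start indices of every 'AG' in seq."""
    return {i for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"}


def snv_creates_ag(region, offset, alt):
    """List start indices of AG dinucleotides that an SNV newly creates.

    region: transcript-strand sequence between the BP and the acceptor
            site (hypothetical input convention, both ends excluded).
    offset: 0-based position of the substituted nucleotide in region.
    alt:    the alternative nucleotide (a single character).
    """
    region = region.upper()
    alt = alt.upper()
    if len(alt) != 1:
        raise ValueError("alt must be a single nucleotide")
    if not 0 <= offset < len(region):
        raise ValueError("offset lies outside the region")
    mutated = region[:offset] + alt + region[offset + 1:]
    # Only AGs present after the substitution but absent before count as new.
    return sorted(ag_positions(mutated) - ag_positions(region))


# Toy example with a made-up sequence: C>G at offset 4 creates an AG at index 3.
new_ags = snv_creates_ag("TTCACTTTCC", 4, "G")
```

An empty list means the substitution does not introduce a new AG in the region; insertions and deletions would need a separate comparison and are not handled here.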

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Schematics of a splicing event, the two types of spliceosome, and typical consequences of a BP variant. (A) Schematic of the biological processes of transcription, splicing, and translation from DNA to pre-mRNA to mature mRNA and protein. In pre-mRNA splicing, the BP is first recognized, two exons are then joined, and the intervening intron is finally released as a circular lariat. (B) Schematic of the major (U2-type, on the left) and minor (U12-type, on the right) spliceosomes, with an illustration of the interaction between pre-mRNA sequence and U2/U12 snRNA. (C) Schematic of the potential molecular consequences of a BP variant, including complete/partial exon skipping and complete/partial intron retention. The percentages in parentheses refer to the observed fraction of each category among the published pathogenic BP variants (note that one variant could result in more than one missplicing consequence).
Fig. 2.
Integration of all BP datasets and mapping of BP onto introns. (A) The nucleotide composition displays the motif [−9, +3] of BP, where the locations of BP are marked with a red background. (B) BP were mapped to 3′-proximal introns. (C) Distance from the mapped BP to their corresponding 3′ss. (D) The number of BP mapped to each 3′-proximal intron.
Fig. 3.
Characterization of BP. (A) Nucleotide composition of BP, 5′ss, and 3′ss motifs in major and minor introns. (B) The distance from BP to 3′ss in major and minor introns. (C) The binding energy between BP motif and U2/U12 snRNA in major and minor introns (the lower the energy, the higher the binding affinity). (D) The proportion of each genomic position harboring population variants (Upper), and the MAF distribution of population variants (Lower). (E) The distribution of the conservation scores GERP (Left) and PhyloP (Right) in each genomic position. (F) The cross-compared conservation scores between BP and BP−2 positions.
Fig. 4.
Detection of pathogenic BP variants by BPHunter. (A) Timeline of the 48 published pathogenic BP variants (Left) and the distance to their 3′ss (Right). (B) Disrupted positions of BP. (C) Ranking of the disrupted BP in its 3′-proximal intron. (D) Type of disrupted BP. (E) Level of consensus of the disrupted BP motif (1: YTNAY, 2: YTNA, 3: TNA, and 4: YNA). (F) Number of data sources supporting the disrupted BP. (G) Population variation of the disrupted BP. (H) Conservation scores of the disrupted BP. (I) Missplicing scores and deleteriousness score of the pathogenic BP variants. (J) BPHunter scores of the pathogenic BP variants.
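
Panel E ranks the disrupted BP motif by consensus level (1: YTNAY, 2: YTNA, 3: TNA, 4: YNA). The sketch below shows one way such a level could be assigned. It assumes, hypothetically, that the A of each pattern is the BP and that the input is a 5-nt window covering BP−3 to BP+1; the function name and the `None` return for windows matching no pattern are illustrative choices, not taken from BPHunter.

```python
import re

# Patterns over a 5-nt window spanning BP-3..BP+1 (assumed alignment),
# ordered from most to least specific so the first match wins.
# Y = pyrimidine (C or T), N = any nucleotide.
_LEVELS = [
    (1, re.compile(r"[CT]T[ACGT]A[CT]")),        # YTNAY
    (2, re.compile(r"[CT]T[ACGT]A[ACGT]")),      # YTNA
    (3, re.compile(r"[ACGT]T[ACGT]A[ACGT]")),    # TNA
    (4, re.compile(r"[ACGT][CT][ACGT]A[ACGT]")), # YNA
]


def consensus_level(window):
    """Return the consensus level (1-4) of a 5-nt BP window, or None."""
    window = window.upper().replace("U", "T")  # accept RNA or DNA letters
    if len(window) != 5:
        raise ValueError("expected a 5-nt window covering BP-3..BP+1")
    for level, pattern in _LEVELS:
        if pattern.fullmatch(window):
            return level
    return None


# Toy windows (made-up sequences), one per level.
levels = [consensus_level(w) for w in ("CTGAC", "CTGAG", "GTGAG", "GCGAG")]
```

Because each pattern is a relaxation of the one before it, testing in order from level 1 to level 4 returns the strictest level a window satisfies.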
Fig. 5.
Detection and validation of a germline heterozygous STAT2 variant which disrupts BP and splicing in a life-threatening COVID-19 patient. (A) Detection of a heterozygous STAT2 variant disrupting BP. (B) STAT2 transcripts and proportions from exon trapping assay in COS-7 cells. (C) STAT2 transcripts and proportions from RNA extracted from whole blood. (D) Ratio of the GTEx-annotated transcripts among the total transcripts. (E) Estimation of STAT2 canonical transcripts based on RT-qPCR, measuring STAT2 mRNA levels in whole blood using two probes spanning intron 5 (probe 5-6) and intron 6 (probe 6-7).
Fig. 6.
Detection and validation of a private somatic intronic deletion of ITPKB in a lymphoma patient. (A) Detection of a somatic intronic 59-nt deletion of ITPKB that removed all three BP sites in intron 3. (B) RNA-seq read alignment on ITPKB showing intron retention in the lymphoma patient carrying this variant but not in patients without this variant. (C) Sashimi plot from the RNA-seq data.
