Nucleic Acids Res. 2023 Dec 11;51(22):12069-12075.
doi: 10.1093/nar/gkad970.

Exploiting public databases of genomic variation to quantify evolutionary constraint on the branch point sequence in 30 plant and animal species


Adéla Nosková et al. Nucleic Acids Res.

Abstract

The branch point sequence is a degenerate intronic heptamer required for the assembly of the spliceosome during pre-mRNA splicing. Disruption of this motif may promote alternative splicing and eventually cause phenotype variation. Despite its functional relevance, the branch point sequence is not included in most genome annotations. Here, we predict branch point sequences in 30 plant and animal species and attempt to quantify their evolutionary constraints using public variant databases. We find an implausible variant distribution in the databases from 16 of 30 examined species. Comparative analysis of variants from whole-genome sequencing shows that variants submitted from exome sequencing or false positive variants are widespread in public databases and cause these irregularities. We then investigate evolutionary constraint with largely unbiased public variant databases in 14 species and find that the fourth and sixth position of the branch point sequence are more constrained than coding nucleotides. Our findings show that public variant databases should be scrutinized for possible biases before they qualify to analyze evolutionary constraint.
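The idea of quantifying constraint at each position of the heptamer relative to genome-wide variability can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the single-chromosome coordinate model, and the toy numbers are all assumptions made for illustration. A relative value below 1 would suggest fewer variants than the genome-wide average at that position.

```python
def per_position_variability(snp_positions, motif_starts, motif_len=7):
    """Fraction of motifs carrying a SNP at each motif offset.

    snp_positions: iterable of 0-based SNP coordinates (hypothetical input).
    motif_starts:  0-based start coordinates of predicted heptamers.
    """
    snps = set(snp_positions)
    counts = [0] * motif_len
    for start in motif_starts:
        for offset in range(motif_len):
            if start + offset in snps:
                counts[offset] += 1
    return [c / len(motif_starts) for c in counts]


def relative_variability(per_position, total_snps, genome_length):
    """Scale per-position variability by the genome-wide SNP rate."""
    genome_rate = total_snps / genome_length
    return [v / genome_rate for v in per_position]


# Toy data (invented): three predicted heptamers and four SNPs.
snps = [10, 13, 25, 103]
motifs = [10, 20, 100]
per_pos = per_position_variability(snps, motifs)
rel = relative_variability(per_pos, total_snps=len(snps), genome_length=200)
for offset, value in enumerate(rel, start=1):
    print(f"position {offset}: {value:.2f}x genome-wide rate")
```

In a real analysis the SNP set would come from a variant database and the motif coordinates from a branch point predictor; biased databases, as the abstract notes, would distort exactly this ratio.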


Figures

Graphical Abstract
Figure 1.
Variation in genomic features quantified using raw and filtered public bovine variant databases. Variability of nine bovine genomic features (A, D, G), as well as nucleotide-wise constraint in and around the splice sites (B, E, H) and predicted branch point sequences (C, F, I) using raw (top panels) and filtered (middle and bottom panels) variant databases. Constraint was quantified relative to average genome-wide variability using all 89 118 442 SNPs (top panels), a subset of 57 875 698 SNPs that excluded variants submitted only by the COFACTOR_GENOMICS_CFG20140112 project (middle panels), and a subset of 34 551 781 SNPs that were each submitted at least twice (bottom panels). Red and blue lines denote average genome-wide and exome variability, respectively.
Figure 2.
Variation in pig, sheep, and goat genomic features quantified through variants from whole-genome sequencing and public databases. Variability of the nine features of the pig (A), sheep (B) and goat genomes (C). Nucleotide-wise variation relative to average genome-wide variability in and around splice sites (D) and branch point sequence (E).
Figure 3.
Variation in genomic features across 11 species quantified using public variant databases. Boxplots of nucleotide-wise variability relative to average genome-wide variability in and around splice sites (A) and predicted branch point sequences (C). Violin plots of variability in nine genomic features (B); means and medians are indicated with black and white circles, respectively.
