Nucleic Acids Res. 2018 Jan 4;46(D1):D213-D217.
doi: 10.1093/nar/gkx997.

APPRIS 2017: principal isoforms for multiple gene sets

Jose Manuel Rodriguez et al. Nucleic Acids Res.

Abstract

The APPRIS database (http://appris-tools.org) uses protein structural and functional features and information from cross-species conservation to annotate splice isoforms in protein-coding genes. APPRIS selects a single protein isoform, the 'principal' isoform, as the reference for each gene based on these annotations. A single main splice isoform reflects the biological reality for most protein-coding genes, and APPRIS principal isoforms are the best predictors of these main protein isoforms. Here, we present the updates to the database. New developments include the addition of three new species (chimpanzee, Drosophila melanogaster and Caenorhabditis elegans), the expansion of APPRIS to cover the RefSeq gene set and the UniProtKB proteome for six species, and refinements in the core methods that make up the annotation pipeline. In addition, APPRIS now provides a measure of reliability for individual principal isoforms and updates with each release of the GENCODE/Ensembl and RefSeq reference sets. The individual GENCODE/Ensembl, RefSeq and UniProtKB reference gene sets for six organisms have been merged to produce common sets of splice variants.

Figures

Figure 1.
Bar plots showing the percentage of genes with each final APPRIS annotation for the human (A) and mouse (B) gene sets housed in the database. APPRIS identifies a principal isoform (Pn) for each gene; principal isoforms are tagged with a number from 1 to 5, with 1 being the most reliable. Isoforms in genes with a unique protein representative (single CDS) are automatically categorized as P1. The APPRIS database annotates the protein-coding genes in all three public sets: GENCODE, RefSeq and UniProtKB. In addition, we established a common gene set (Intersection) from the GENCODE, RefSeq and UniProtKB reference sets.
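
The tagging rule and the per-tag percentages described in this caption can be illustrated with a minimal Python sketch. It is not the APPRIS implementation: the function names, the input layout and the `pipeline_tag` argument are assumptions made for illustration only, and the only rules taken from the caption are that tags run from P1 to P5 and that genes with a single CDS are automatically P1.

```python
from collections import Counter

# Reliability tags named in the caption, from most to least reliable.
TAGS = ("P1", "P2", "P3", "P4", "P5")


def assign_tag(cds_sequences, pipeline_tag):
    """Return the reliability tag for one gene.

    cds_sequences: protein sequences of the gene's coding variants.
    pipeline_tag:  tag produced by some upstream annotation step
                   (hypothetical input, not defined by the source).
    """
    # A gene with a single unique protein (single CDS) is automatically P1.
    if len(set(cds_sequences)) == 1:
        return "P1"
    return pipeline_tag


def tag_percentages(gene_tags):
    """Percentage of genes carrying each tag, as plotted in a bar chart."""
    total = len(gene_tags)
    if total == 0:
        return {tag: 0.0 for tag in TAGS}
    counts = Counter(gene_tags.values())
    return {tag: 100.0 * counts.get(tag, 0) / total for tag in TAGS}


# Toy data (invented for illustration, not real annotations).
genes = {
    "geneA": assign_tag(["MKT"], "P3"),         # single CDS
    "geneB": assign_tag(["MKT", "MKV"], "P2"),  # two distinct CDS
    "geneC": assign_tag(["AAA", "AAA"], "P5"),  # identical CDS collapse to one
    "geneD": assign_tag(["A", "B"], "P2"),
}
percentages = tag_percentages(genes)
```

With this toy input, half of the genes fall under P1 and half under P2. The real percentages are those reported in the figure.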
Figure 2.
APPRIS annotations for the gene GRIFIN. (A) APPRIS results for the three protein-coding variants from the gene reference sets GENCODE (Ensembl), RefSeq and UniProtKB. APPRIS chooses isoform ENST00000614228+A4D1Z8, which belongs to Ensembl and UniProtKB, as the principal isoform (highlighted in green). The selection is based on the 3D structure, the functional domains and conservation in related species. (B) Alignment for a section of the Pfam galectin family of proteins. The red arrow shows where 8 extra residues in the RefSeq variants would disrupt a region of the galectin functional domain of GRIFIN. (C) The 3D structure of 4LBJ (human galectin-3 CRD), which has 29% identity with variant ENST00000614228+A4D1Z8. The galectins are a family of proteins defined by their binding specificity for β-galactoside sugars (displayed as light yellow spheres). The red arrow shows where the 8 extra residues would have to insert into the structure, breaking a β-sheet.
