PEPPI: a peptidomic database of human protein isoforms for proteomics experiments
- PMID: 20946618
- PMCID: PMC3026381
- DOI: 10.1186/1471-2105-11-S6-S7
Abstract
Background: The generation of protein isoforms through alternative splicing, genetic polymorphism, and posttranslational modification is an essential source of molecular diversity in eukaryotic cells. Previous studies have shown that protein isoforms play critical roles in disease diagnosis, risk assessment, sub-typing, prognosis, and prediction of treatment outcome. Understanding the types, presence, and abundance of protein isoforms under different cellular and physiological conditions is a major task in functional proteomics, and may pave the way to the discovery of molecular biomarkers for human diseases. In tandem mass spectrometry (MS/MS) based proteomics analysis, mass spectrometry (MS) search software identifies peptide peaks that exactly match protein sequence records in a proteomics database. However, because protein isoforms are sparsely annotated and poorly covered in proteomics databases, high-throughput identification of protein isoforms, particularly those arising from alternative splicing and genetic polymorphism, has not been possible.
Results: We present the PEPtidomics Protein Isoform Database (PEPPI, http://bio.informatics.iupui.edu/peppi), a comprehensive database of computationally synthesized human peptides that can identify protein isoforms derived from either alternatively spliced mRNA transcripts or SNP variations. We collected genome, pre-mRNA alternative splicing, and SNP information from Ensembl. We then synthesized in silico isoform transcripts that cover all exons and all theoretically possible exon and intron junctions, as well as all their variations derived from known SNPs. With three case studies, we further demonstrated that the database can help researchers discover and characterize new protein isoform biomarkers from experimental proteomics data.
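The in silico synthesis of junction sequences and their SNP variants can be illustrated with a minimal Python sketch. It is not the PEPPI pipeline: the function names `junction_sequences` and `apply_snp`, the `flank` parameter, and the toy exon strings are all hypothetical, and the sketch covers only forward exon-exon junctions, not the exon-intron junctions the database also includes.

```python
from itertools import combinations


def junction_sequences(exons, flank=3):
    """Yield (i, j, seq) for every forward exon pair i < j.

    seq joins the last `flank` bases of exon i to the first `flank`
    bases of exon j, i.e. the sequence spanning a hypothetical
    splice junction between the two exons.
    """
    for i, j in combinations(range(len(exons)), 2):
        yield i, j, exons[i][-flank:] + exons[j][:flank]


def apply_snp(seq, pos, alt):
    """Return seq with the base at 0-based position `pos` replaced by `alt`."""
    return seq[:pos] + alt + seq[pos + 1:]


# Toy exons (assumed data, not taken from Ensembl).
exons = ["ATGGCC", "AAAGGG", "TTTCCC"]
junctions = list(junction_sequences(exons, flank=3))

# One SNP variant of the first junction sequence (C -> T at position 2).
variant = apply_snp(junctions[0][2], 2, "T")
```

For n exons this enumerates n(n-1)/2 forward junctions, which shows why a database of all theoretically possible junctions grows quickly with exon count.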
Conclusions: We developed a new tool for the proteomics community to characterize protein isoforms from MS-based proteomics experiments. By cataloguing each peptide configuration in the PEPPI database, users can study genetic variations and alternative splicing events at the proteome level. They can also batch-download peptide sequences in FASTA format to search against MS/MS spectra derived from human samples. The database can help generate novel hypotheses on molecular risk factors and molecular mechanisms of complex diseases, leading to the identification of potentially highly specific protein isoform biomarkers.
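As a rough illustration of the FASTA format mentioned above, the following Python sketch serializes peptide records into FASTA text that an MS/MS search engine could load as a sequence database. The helper `to_fasta` and the header string are hypothetical; the actual header layout of PEPPI downloads is not described here.

```python
def to_fasta(records, width=60):
    """Format (header, sequence) pairs as FASTA text.

    Each record becomes a '>' header line followed by the sequence
    wrapped at `width` characters per line.
    """
    lines = []
    for header, seq in records:
        lines.append(f">{header}")
        lines.extend(seq[k:k + width] for k in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Hypothetical peptide record; identifier and sequence are made up.
fasta_text = to_fasta([("pep1 example junction peptide", "ACDEFGHIK")])
```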