Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2016 Oct 21;11(1):55.
doi: 10.1186/s13062-016-0159-9.

IPC - Isoelectric Point Calculator

Affiliations

IPC - Isoelectric Point Calculator

Lukasz P Kozlowski. Biol Direct. .

Abstract

Background: Accurate estimation of the isoelectric point (pI) based on the amino acid sequence is useful for many analytical biochemistry and proteomics techniques such as 2-D polyacrylamide gel electrophoresis, or capillary isoelectric focusing used in combination with high-throughput mass spectrometry. Additionally, pI estimation can be helpful during protein crystallization trials.

Results: Here, I present the Isoelectric Point Calculator (IPC), a web service and a standalone program for the accurate estimation of protein and peptide pI using different sets of dissociation constant (pKa) values, including two new computationally optimized pKa sets. According to the presented benchmarks, the newly developed IPC pKa sets outperform previous algorithms by at least 14.9 % for proteins and 0.9 % for peptides (on average, 22.1 % and 59.6 %, respectively), which corresponds to an average error of the pI estimation equal to 0.87 and 0.25 pH units for proteins and peptides, respectively. Moreover, the prediction of pI using the IPC pKa's leads to fewer outliers, i.e., predictions affected by errors greater than a given threshold.

Conclusions: The IPC service is freely available at http://isoelectric.ovh.org Peptide and protein datasets used in the study and the precalculated pI for the PDB and some of the most frequently used proteomes are available for large-scale analysis and future development.

Reviewers: This article was reviewed by Frank Eisenhaber and Zoltán Gáspári.

Keywords: Isoelectric point; Proteomics; pKa dissociation constant.

PubMed Disclaimer

Figures

Fig. 1
Fig. 1
Correlation of the experimental versus the theoretical isoelectric points for two protein datasets. Data for SWISS-2DPAGE (left panel) and PIP-DB (right panel) are calculated using the EMBOSS pKa set. Outliers are defined as MSE > 3 and are marked in red. Plots correspond to datasets as presented by the authors before cleaning and the removal of duplicates (duplicates are defined as records that have the same sequence but are referred to as separate records in the database). In both databases, the authors reported multiple pI values from different experiments for the same sequences in separate records. In such cases for the current analysis, the average pI was used. The solid line represents the linear regression after removal of the outliers
Fig. 2
Fig. 2
Correlation of the experimental versus theoretical isoelectric points calculated using different pKa sets. Data for the main protein dataset (merged dataset created from SWISS-2DPAGE and PIP-DB). R2 – Pearson correlation before the removal of outliers. R2corr – Pearson correlation after the removal of outliers. Additionally, the linear regression models fitted to predictions with outliers (magenta line) and without outliers (blue line) are shown. Outliers (marked in magenta) are defined as pI predictions with MSE > 3 in comparison to the experimental pI. Other predictions are represented as heat maps according to the density of points. The numbers of outliers for both the training and testing set are shown together. For brevity, only six pKa sets are shown
Fig. 3
Fig. 3
Histograms of the isoelectric points of proteins. Top and middle panels are calculated using the IPC_protein pKa set (in 0.25 pH unit intervals) and represents pI distribution in the SwissProt database, human proteome, Escherichia coli and extreme halophilic archaeon Natrialba magadii. Bottom two panels presents the isoelectric points of the yeast proteome (6,721 proteins) calculated using the EMBOSS pKa set (as presented in the Saccharomyces Genome Database [40]) and the IPC_protein pKa set for comparison
Fig. 4
Fig. 4
Exemplary output of the IPC calculator for the Mycoplasma genitalium G37 proteome (476 proteins). The scatter plot with the predicted isoelectric points versus molecular weight for all proteins is presented at the top. Then, for individual proteins, pI predictions based on different pKa sets are presented alongside the molecular weight and amino acid composition

References

    1. O’Farrell PH. High resolution two-dimensional electrophoresis of proteins. J Biol Chem. 1975;250(10):4007–4021. - PMC - PubMed
    1. Klose J. Protein mapping by combined isoelectric focusing and electrophoresis of mouse tissues. Humangenetik. 1975;26(3):231–243. - PubMed
    1. Righetti PG, Castagna A, Herbert B, Reymond F, Rossier JS. Prefractionation techniques in proteome analysis. Proteomics. 2003;3(8):1397–1407. doi: 10.1002/pmic.200300472. - DOI - PubMed
    1. Heller M, Ye M, Michel PE, Morier P, Stalder D, Jünger MA, Aebersold R, Reymond F, Rossier JS. Added value for tandem mass spectrometry shotgun proteomics data validation through isoelectric focusing of peptides. J Proteome Res. 2005;4(6):2273–2282. doi: 10.1021/pr050193v. - DOI - PubMed
    1. Pace CN, Grimsley GR, Scholtz JM. Protein ionizable groups: pK values and their contribution to protein stability and solubility. J Biol Chem. 2009;284(20):13285–13289. doi: 10.1074/jbc.R800080200. - DOI - PMC - PubMed

MeSH terms