Sci Rep. 2013 Nov 26:3:3333.
doi: 10.1038/srep03333.

Soluble expression of proteins correlates with a lack of positively-charged surface

Pedro Chan et al. Sci Rep. 2013.

Abstract

Prediction of protein solubility is gaining importance with the growing use of protein molecules as therapeutics, and ongoing requirements for high-level expression. We have investigated protein surface features that correlate with insolubility. Non-polar surface patches associate to some degree with insolubility, but this is far exceeded by the association with positively-charged patches. Negatively-charged patches do not separate insoluble/soluble subsets. The separation of soluble and insoluble subsets by positive charge clustering (area under the ROC curve of 0.85) has a striking parallel with the separation that delineates nucleic acid-binding proteins, although most proteins in the insoluble dataset are not known to bind nucleic acid. Additionally, these basic patches are enriched for arginine, relative to lysine. The results are discussed in the context of expression systems and downstream processing, contributing to a view of protein solubility in which the molecular interactions of charged groups are far from equivalent.

Conflict of interest statement

The authors declare no competing financial interests.

Figures

Figure 1. Cumulative fractions of soluble (SOL) and insoluble (INS) protein datasets, upon calculation of particular features.
(a) Net charge, predicted at pH 7.0. (b) Grid points within the largest positive (pos) and largest negative (neg) contours of electrostatic potential. (c) Maximum net positive charge in a geometric patch (13 Å radius). (d) The maximum ratio (for each protein) of non-polar to polar patch SASA. (e) Largest positive patch contours are re-plotted, now as a ratio to a 3000 grid point threshold, alongside calculations with DNA-binding and non-DNA-binding datasets. (f) Separation according to the geometrical patch with the largest Arg content.
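
Panel (a) relies on a net charge predicted at pH 7.0. As a rough illustration of how such a value can be estimated from sequence alone, the sketch below applies the Henderson-Hasselbalch relation to each ionisable group. The pKa values, the function name `net_charge`, and the treatment of termini are assumptions chosen for illustration; they are not taken from the paper, whose exact procedure is not given in this record.

```python
# Approximate pKa values for ionisable groups. These are generic
# textbook-style numbers assumed for illustration, not the paper's values.
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.3, "C": 8.3, "Y": 10.1}
PKA_N_TERM = 9.0
PKA_C_TERM = 3.1


def net_charge(sequence, ph=7.0):
    """Estimate the net charge of a one-letter-code sequence at a given pH."""

    def protonated_fraction(pka):
        # Henderson-Hasselbalch: fraction of the group carrying its proton.
        return 1.0 / (1.0 + 10.0 ** (ph - pka))

    # N-terminus contributes +1 when protonated.
    charge = protonated_fraction(PKA_N_TERM)
    # C-terminus contributes -1 when deprotonated.
    charge -= 1.0 - protonated_fraction(PKA_C_TERM)

    for residue in sequence.upper():
        if residue in PKA_POSITIVE:
            charge += protonated_fraction(PKA_POSITIVE[residue])
        elif residue in PKA_NEGATIVE:
            charge -= 1.0 - protonated_fraction(PKA_NEGATIVE[residue])
        # All other residues are treated as uncharged.
    return charge


if __name__ == "__main__":
    # Hypothetical peptide, used only to exercise the function.
    print(round(net_charge("MKRRDEGH"), 2))
```

A sequence-only estimate like this ignores the structural environment of each group, which is why the patch and potential calculations in panels (b) and (c) carry information that net charge does not.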
Figure 2. ROC plots for insoluble and soluble subset separation.
(a) ROC plot (AUC = 0.85) showing separation by positive potential. TPR is true positive rate and FPR false positive rate. (b) ROC plot (AUC = 0.62) quantifying the separation by non-polar to polar surface ratio (13 Å radius patch).
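
For readers less familiar with ROC analysis, the following minimal sketch shows how TPR/FPR points and an AUC can be computed from per-protein scores. It assumes that insoluble proteins are the "positive" class and that a larger score (for example, a larger positive-potential patch) predicts insolubility; the function names and the example scores are hypothetical, not data from the paper.

```python
def roc_points(positive_scores, negative_scores):
    """Return (FPR, TPR) pairs obtained by sweeping a threshold over all scores."""
    thresholds = sorted(set(positive_scores) | set(negative_scores), reverse=True)
    points = [(0.0, 0.0)]
    for threshold in thresholds:
        # A protein is called positive when its score reaches the threshold.
        tpr = sum(s >= threshold for s in positive_scores) / len(positive_scores)
        fpr = sum(s >= threshold for s in negative_scores) / len(negative_scores)
        points.append((fpr, tpr))
    return points


def roc_auc(positive_scores, negative_scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win, which matches the area under the ROC curve.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


if __name__ == "__main__":
    # Hypothetical patch-size scores, invented for illustration only.
    insoluble = [3200, 4100, 2800, 5000]
    soluble = [1500, 2900, 1200, 800]
    print(roc_points(insoluble, soluble))
    print(roc_auc(insoluble, soluble))
```

An AUC of 0.5 corresponds to no separation and 1.0 to perfect separation, so the reported values of 0.85 and 0.62 indicate strong and weak discrimination respectively.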
Figure 3. Weak interactions and association in a crowded environment.
(a) Two species interact with an energy of 15 kJ/mole. Concentrations are varied (0 to 4 mM) for protein interacting sites (horizontally) and NA interacting sites (vertically). The heat map shows the proportion of interacting protein sites that are complexed (scale bar under the map). See text for more detail. (b) A hypothetical scheme is drawn in which protein-NA interactions are mediated by charge interactions (upper left), followed by partial unfolding concomitant with NA base – protein interactions (upper right), then protein-protein association through non-polar interactions (lower right), and finally dissociation of protein from NA (lower left).
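
The kind of calculation behind panel (a) can be sketched with a simple 1:1 binding equilibrium. The version below is a minimal illustration under stated assumptions, not the authors' model: it treats the 15 kJ/mol as a favourable standard free energy relative to a 1 M standard state, assumes a temperature of 298.15 K, and uses hypothetical function names.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol K)


def dissociation_constant(energy_kj_per_mol, temperature_k=298.15):
    """Kd in mol/L for a favourable interaction energy (1 M standard state assumed)."""
    return math.exp(-energy_kj_per_mol / (R_KJ_PER_MOL_K * temperature_k))


def fraction_protein_bound(protein_molar, na_molar, kd_molar):
    """Fraction of protein sites complexed at equilibrium for 1:1 binding."""
    if protein_molar <= 0.0:
        return 0.0
    total = protein_molar + na_molar + kd_molar
    # Physical root of the mass-action quadratic for the complex concentration.
    complex_molar = (total - math.sqrt(total * total - 4.0 * protein_molar * na_molar)) / 2.0
    return complex_molar / protein_molar


if __name__ == "__main__":
    kd = dissociation_constant(15.0)
    # Coarse grid over 0 to 4 mM for both site concentrations, as in the caption.
    for na_mm in (4.0, 2.0, 0.0):
        row = [fraction_protein_bound(p_mm * 1e-3, na_mm * 1e-3, kd) for p_mm in (1.0, 2.0, 4.0)]
        print(na_mm, [round(f, 2) for f in row])
```

Under these assumptions the dissociation constant comes out in the low millimolar range, so an interaction this weak only complexes an appreciable share of sites when concentrations reach the millimolar levels typical of a crowded environment.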
