Predicting protein crystallization propensity from protein sequence
- PMID: 20177794
- PMCID: PMC3366497
- DOI: 10.1007/s10969-010-9080-0
Abstract
The high-throughput structure determination pipelines developed by structural genomics programs offer a unique opportunity for data mining. One important question is how protein properties derived from the primary sequence correlate with a protein's propensity to yield X-ray quality crystals (crystallizability) and 3D X-ray structures. A set of protein properties was computed for over 1,300 proteins that expressed well but were insoluble, and for approximately 720 unique proteins that resulted in X-ray structures. The correlation of the protein's isoelectric point and grand average of hydropathy (GRAVY) with crystallizability was analyzed for full-length and domain constructs of protein targets. In a second step, several additional properties that can be calculated from the protein sequence were added and evaluated. Using statistical analyses, we identified a set of attributes that correlate with a protein's propensity to crystallize and implemented a Support Vector Machine (SVM) classifier based on them. We have also created applications that analyze query sequences, provide optimal boundary information for them, and visualize the data. These tools are available via the web site http://bioinformatics.anl.gov/cgi-bin/tools/pdpredictor
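As an illustration of the two sequence-derived properties named in the abstract, the sketch below computes GRAVY and an approximate isoelectric point for a sequence. It is not the authors' implementation. GRAVY uses the standard Kyte-Doolittle hydropathy scale; the pKa values and the function names (`gravy`, `net_charge`, `isoelectric_point`) are assumptions chosen for this example, and a different pKa set will shift the estimated pI.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Assumed pKa values (one commonly used set; not taken from the paper).
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
PKA_N_TERM = 8.6
PKA_C_TERM = 3.6


def gravy(seq):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def net_charge(seq, ph):
    """Net charge at a given pH from the Henderson-Hasselbalch equation."""
    seq = seq.upper()
    positive = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))
    negative = 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))
    for aa, pka in PKA_POSITIVE.items():
        positive += seq.count(aa) / (1.0 + 10 ** (ph - pka))
    for aa, pka in PKA_NEGATIVE.items():
        negative += seq.count(aa) / (1.0 + 10 ** (pka - ph))
    return positive - negative


def isoelectric_point(seq, tol=1e-4):
    """Find the pH where the net charge is zero by bisection on [0, 14]."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        # Net charge decreases monotonically as pH rises.
        if net_charge(seq, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    example = "MKTAYIAKQRQISFVKSHFSRQ"  # hypothetical query sequence
    print("GRAVY:", round(gravy(example), 3))
    print("pI:", round(isoelectric_point(example), 2))
```

Values like these could serve as two of the input features for a classifier such as the SVM described above; the full feature set used in the study is not listed in this abstract.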
Similar articles
- PredPPCrys: accurate prediction of sequence cloning, protein production, purification and crystallization propensity from protein sequences using multi-step heterogeneous feature fusion and selection. PLoS One. 2014 Aug 22;9(8):e105902. doi: 10.1371/journal.pone.0105902. eCollection 2014. PMID: 25148528. Free PMC article.
- SPINE: an integrated tracking database and data mining approach for identifying feasible targets in high-throughput structural proteomics. Nucleic Acids Res. 2001 Jul 1;29(13):2884-98. doi: 10.1093/nar/29.13.2884. PMID: 11433035. Free PMC article.
- Crysalis: an integrated server for computational analysis and design of protein crystallization. Sci Rep. 2016 Feb 24;6:21383. doi: 10.1038/srep21383. PMID: 26906024. Free PMC article.
- Protein crystallizability. Methods Mol Biol. 2010;609:385-400. doi: 10.1007/978-1-60327-241-4_22. PMID: 20221931. Review.
- Critical evaluation of bioinformatics tools for the prediction of protein crystallization propensity. Brief Bioinform. 2018 Sep 28;19(5):838-852. doi: 10.1093/bib/bbx018. PMID: 28334201. Free PMC article. Review.
Cited by
- Computational modeling of cyclotides as antimicrobial agents against Neisseria gonorrhoeae PorB porin protein: integration of docking, immune, and molecular dynamics simulations. Front Chem. 2024 Nov 25;12:1493165. doi: 10.3389/fchem.2024.1493165. eCollection 2024. PMID: 39659871. Free PMC article.
- High-throughput protein purification and quality assessment for crystallization. Methods. 2011 Sep;55(1):12-28. doi: 10.1016/j.ymeth.2011.07.010. Epub 2011 Aug 31. PMID: 21907284. Free PMC article. Review.
- The "Sticky Patch" Model of Crystallization and Modification of Proteins for Enhanced Crystallizability. Methods Mol Biol. 2017;1607:77-115. doi: 10.1007/978-1-4939-7000-1_4. PMID: 28573570. Free PMC article. Review.
- Analysis of crystallization data in the Protein Data Bank. Acta Crystallogr F Struct Biol Commun. 2015 Oct;71(Pt 10):1228-34. doi: 10.1107/S2053230X15014892. Epub 2015 Sep 23. PMID: 26457511. Free PMC article.
- Databases, Repositories, and Other Data Resources in Structural Biology. Methods Mol Biol. 2017;1607:643-665. doi: 10.1007/978-1-4939-7000-1_27. PMID: 28573593. Free PMC article. Review.