A two-stage approach for improved prediction of residue contact maps
- PMID: 16573808
- PMCID: PMC1484494
- DOI: 10.1186/1471-2105-7-180
Abstract
Background: Protein topology representations such as residue contact maps are an important intermediate step towards ab initio prediction of protein structure. Although there have been improvements in recent years, the problem of accurately predicting residue contact maps from primary sequences is still largely unsolved. Among the reasons for this are the unbalanced nature of the problem (with far fewer examples of contacts than non-contacts), the formidable challenge of capturing long-range interactions in the maps, and the intrinsic difficulty of mapping one-dimensional input sequences into two-dimensional output maps. To alleviate these problems and achieve improved contact map predictions, in this paper we split the task into two stages: the prediction of a map's principal eigenvector (PE) from the primary sequence, and the reconstruction of the contact map from the PE and primary sequence. Predicting the PE from the primary sequence consists of mapping a vector into a vector. This task is less complex than mapping vectors directly into two-dimensional matrices, since the size of the problem is drastically reduced and so is the length scale of the interactions that need to be learned.
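To make the notion of a map's principal eigenvector concrete, the following is a minimal sketch, not the authors' implementation. It builds a binary contact map from toy coordinates and extracts the PE by shifted power iteration. The 8 Å distance threshold, the exclusion of self-contacts, the toy straight-chain coordinates, and the function names `contact_map` and `principal_eigenvector` are all assumptions made for illustration.

```python
import math

def contact_map(coords, threshold=8.0):
    """Binary contact map: 1 if two distinct residues are closer than
    `threshold` (assumed here to be 8 Angstrom), else 0."""
    n = len(coords)
    return [[1 if i != j and math.dist(coords[i], coords[j]) < threshold else 0
             for j in range(n)]
            for i in range(n)]

def principal_eigenvector(cmap, iters=500):
    """Eigenvector of the largest eigenvalue of a symmetric, connected
    contact map, found by power iteration on (A + I). The identity shift
    leaves the eigenvectors unchanged but avoids oscillation when the
    contact graph is bipartite."""
    n = len(cmap)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = [v[i] + sum(cmap[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Toy example (hypothetical data): six residues on a straight line, 3.8 A apart,
# so each residue contacts its neighbours at sequence separation 1 and 2.
coords = [(3.8 * i, 0.0, 0.0) for i in range(6)]
cmap = contact_map(coords)
pe = principal_eigenvector(cmap)
print([round(x, 3) for x in pe])
```

An N x N map is summarised here by a vector of N non-negative components, one per residue, which is what makes the first stage a vector-to-vector problem. Residues with many contacts (the middle of the toy chain) receive larger components than residues with few (the ends).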
Results: We develop architectures composed of ensembles of two-layered bidirectional recurrent neural networks to classify the components of the PE into 2, 3 and 4 classes from protein primary sequence, predicted secondary structure, and hydrophobicity interaction scales. Our predictor, tested on a non-redundant set of 2171 proteins, achieves classification performances of up to 72.6%, 16% above a baseline statistical predictor. We design a system for the prediction of contact maps from the predicted PE. Our results show that predicting maps through the PE yields sizeable gains, especially for long-range contacts, which are particularly critical for accurate protein 3D reconstruction. The final predictor's accuracy on a non-redundant set of 327 targets is 35.4% and 19.8% for minimum contact separations of 12 and 24, respectively, when the top length/5 contacts are selected. On the 11 CASP6 Novel Fold targets we achieve similar accuracies (36.5% and 19.7%). This compares favourably with the best automated predictors at CASP6.
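The accuracy figures above depend on how the "top length/5" contacts are selected. The following is a hedged sketch of one common reading of that measure, not the authors' evaluation code: rank all residue pairs whose sequence separation is at least the minimum, keep the length/5 highest-scoring pairs, and report the fraction that are true contacts. The function name `top_contact_accuracy`, the use of `|i - j| >= min_sep` as the separation rule, and the toy data are assumptions.

```python
def top_contact_accuracy(scores, true_map, min_sep, fraction=5):
    """Fraction of true contacts among the top (length / fraction) scored
    pairs with sequence separation of at least `min_sep`."""
    n = len(true_map)
    # Candidate pairs (i < j) that satisfy the minimum separation.
    pairs = [(scores[i][j], i, j)
             for i in range(n)
             for j in range(i + min_sep, n)]
    pairs.sort(key=lambda t: t[0], reverse=True)
    top = pairs[:max(1, n // fraction)]
    if not top:
        return 0.0
    hits = sum(true_map[i][j] for _, i, j in top)
    return hits / len(top)

# Toy example (hypothetical data): 10 residues, so length/5 = 2 pairs are kept.
n = 10
true_map = [[0] * n for _ in range(n)]
scores = [[0.0] * n for _ in range(n)]
for i, j in [(0, 1), (0, 5), (2, 8)]:      # true contacts
    true_map[i][j] = true_map[j][i] = 1
for (i, j), s in {(0, 1): 1.0, (0, 5): 0.9, (1, 7): 0.8, (2, 8): 0.7}.items():
    scores[i][j] = scores[j][i] = s        # predicted contact scores

accuracy = top_contact_accuracy(scores, true_map, min_sep=3)
print(accuracy)  # 0.5
```

In the toy data the highest-scoring pair (0, 1) is excluded by the separation filter, so the two selected pairs are (0, 5), a true contact, and (1, 7), a false positive.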
Conclusion: Our final system for contact map prediction achieves state-of-the-art performance and may provide valuable constraints for improved ab initio prediction of protein structures. A suite of predictors of structural features, including the PE and PE-based contact maps, is available at http://distill.ucd.ie.
Similar articles
- Ab initio and template-based prediction of multi-class distance maps by two-dimensional recursive neural networks. BMC Struct Biol. 2009 Jan 30;9:5. doi: 10.1186/1472-6807-9-5. PMID: 19183478 Free PMC article.
- Toward an accurate prediction of inter-residue distances in proteins using 2D recursive neural networks. BMC Bioinformatics. 2014 Jan 10;15:6. doi: 10.1186/1471-2105-15-6. PMID: 24410833 Free PMC article.
- Distill: a suite of web servers for the prediction of one-, two- and three-dimensional structural features of proteins. BMC Bioinformatics. 2006 Sep 5;7:402. doi: 10.1186/1471-2105-7-402. PMID: 16953874 Free PMC article.
- Importance of Inter-residue Contacts for Understanding Protein Folding and Unfolding Rates, Remote Homology, and Drug Design. Mol Biotechnol. 2025 Mar;67(3):862-884. doi: 10.1007/s12033-024-01119-4. Epub 2024 Mar 18. PMID: 38498284 Review.
- Protein Structure Prediction: Challenges, Advances, and the Shift of Research Paradigms. Genomics Proteomics Bioinformatics. 2023 Oct;21(5):913-925. doi: 10.1016/j.gpb.2022.11.014. Epub 2023 Mar 30. PMID: 37001856 Free PMC article. Review.
Cited by
- Evaluation of residue-residue contact prediction in CASP10. Proteins. 2014 Feb;82 Suppl 2(0 2):138-53. doi: 10.1002/prot.24340. Epub 2013 Aug 31. PMID: 23760879 Free PMC article.
- Evaluation of residue-residue contact predictions in CASP9. Proteins. 2011;79 Suppl 10(Suppl 10):119-25. doi: 10.1002/prot.23160. Epub 2011 Sep 17. PMID: 21928322 Free PMC article.
- Predicting protein contact map using evolutionary and physical constraints by integer programming. Bioinformatics. 2013 Jul 1;29(13):i266-73. doi: 10.1093/bioinformatics/btt211. PMID: 23812992 Free PMC article.
- SABERTOOTH: protein structural alignment based on a vectorial structure representation. BMC Bioinformatics. 2007 Oct 31;8:425. doi: 10.1186/1471-2105-8-425. PMID: 17974011 Free PMC article.
- High quality protein sequence alignment by combining structural profile prediction and profile alignment using SABER-TOOTH. BMC Bioinformatics. 2010 May 14;11:251. doi: 10.1186/1471-2105-11-251. PMID: 20470364 Free PMC article.