Optimal data collection for correlated mutation analysis

Haim Ashkenazy¹, Ron Unger, Yossef Kliger

PMID: 18655065
DOI: 10.1002/prot.22168

Haim Ashkenazy et al. Proteins. 2009 Feb 15;74(3):545-55.

Affiliation

¹ Compugen LTD, Tel Aviv 69512, Israel.

Abstract

The main objective of correlated mutation analysis (CMA) is to predict intraprotein residue-residue interactions from sequence alone. Despite considerable progress in algorithms and computer capabilities, the performance of CMA methods remains quite low. Here we examine whether, and to what extent, the quality of CMA methods depends on the sequences that are included in the multiple sequence alignment (MSA). The results revealed a strong correlation between the number of homologs in an MSA and CMA prediction strength. Furthermore, although many current methods include only orthologs in the MSA, we found that it is beneficial to include both orthologs and paralogs. Remarkably, even remote homologs contribute to the improved accuracy. Based on our findings, we put forward an automated data collection procedure with a minimum coverage of 50% between the query protein and its orthologs and paralogs. This procedure improves accuracy even in the absence of manual curation. In this era of massive sequencing and exploding sequence data, our results suggest that correlated mutation-based methods have not reached their inherent performance limitations and that the role of CMA in structural biology is far from being fulfilled.
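
As a rough illustration of the two ideas in the abstract, the Python sketch below first keeps only aligned homologs that cover at least 50% of the query, then scores pairs of alignment columns for co-variation. Several things here are assumptions and not taken from the paper: coverage is defined as the fraction of non-gap query positions aligned to a residue in the homolog, mutual information is used as a stand-in co-variation score (the abstract does not say which CMA scores were evaluated), and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2

GAP = "-"


def coverage(query: str, homolog: str) -> float:
    """Fraction of non-gap query positions aligned to a residue in the homolog.

    Assumed definition of coverage; both sequences are rows of the same MSA.
    """
    query_positions = [i for i, c in enumerate(query) if c != GAP]
    if not query_positions:
        return 0.0
    covered = sum(1 for i in query_positions if homolog[i] != GAP)
    return covered / len(query_positions)


def filter_by_coverage(query, homologs, min_coverage=0.5):
    """Keep homologs (orthologs and paralogs alike) that meet the threshold."""
    return [h for h in homologs if coverage(query, h) >= min_coverage]


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    Used here as a generic co-variation score, not as the paper's method.
    Sequences with a gap in either column are skipped.
    """
    pairs = [(s[i], s[j]) for s in msa if s[i] != GAP and s[j] != GAP]
    n = len(pairs)
    if n == 0:
        return 0.0
    count_i = Counter(a for a, _ in pairs)
    count_j = Counter(b for _, b in pairs)
    count_ij = Counter(pairs)
    return sum(
        (c / n) * log2(c * n / (count_i[a] * count_j[b]))
        for (a, b), c in count_ij.items()
    )


def rank_column_pairs(msa):
    """Return all column pairs sorted from most to least co-varying."""
    length = len(msa[0])
    scores = {
        (i, j): mutual_information(msa, i, j)
        for i, j in combinations(range(length), 2)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy alignment: the first row is the query, the rest are candidate homologs.
    query = "ACDE"
    candidates = ["ACDE", "A---", "GC-E", "GTDE"]
    msa = [query] + filter_by_coverage(query, candidates)
    for (i, j), score in rank_column_pairs(msa):
        print(f"columns {i},{j}: MI = {score:.3f}")
```

On a toy alignment like this the scores carry no biological meaning. The sketch only shows where a coverage filter would sit relative to the scoring step, and why a larger filtered alignment gives the column statistics more sequences to work with.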
