RAPHAEL: recognition, periodicity and insertion assignment of solenoid protein structures
- PMID: 22962341
- DOI: 10.1093/bioinformatics/bts550
Abstract
Motivation: Repeat proteins form a distinct class of structures where folding is greatly simplified. Several classes have been defined, with solenoid repeats of periodicity between ca. 5 and 40 residues being the most challenging to detect. Such proteins evolve quickly and their periodicity may rapidly become hidden at the sequence level. From a structural point of view, finding solenoids may be complicated by the presence of insertions or multiple domains. To the best of our knowledge, no automated methods are available to characterize solenoid repeats from structure.
Results: Here we introduce RAPHAEL, a novel method for the detection of solenoids in protein structures. It addresses three problems of increasing difficulty: (1) recognition of solenoid domains, (2) determination of their periodicity and (3) assignment of insertions. RAPHAEL uses a geometric approach mimicking manual classification, producing several numeric parameters that are optimized for maximum performance. The resulting method correctly classifies 89.5% of solenoid proteins and 97.2% of non-solenoid proteins. RAPHAEL periodicities have a Spearman correlation coefficient of 0.877 against the manually established ones. A baseline algorithm for insertion detection in identified solenoids has a Q2 value of 79.8%, suggesting room for further improvement. RAPHAEL finds 1931 highly confident repeat structures not previously annotated as solenoids in the Protein Data Bank records.
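The two evaluation measures quoted above can be illustrated with a minimal sketch. It is not RAPHAEL's code: the function names (`average_ranks`, `spearman`, `q2`) and the toy inputs are hypothetical, and it assumes that the Q(2) value (commonly written Q2) is the usual two-state per-residue accuracy, i.e. the fraction of residues whose repeat/insertion label matches the reference annotation.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def q2(predicted, reference):
    """Two-state accuracy: fraction of positions with matching labels."""
    if len(predicted) != len(reference):
        raise ValueError("label sequences must have the same length")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


# Hypothetical toy data, not taken from the paper.
manual_periods = [22, 24, 28, 33]      # manually curated periodicities
predicted_periods = [23, 24, 27, 35]   # periodicities from a predictor
rho = spearman(manual_periods, predicted_periods)

# 'R' = residue in a repeat unit, 'I' = residue in an insertion.
accuracy = q2("RRIIR", "RRIRR")
```

Because Spearman correlation depends only on rank order, a predictor that preserves the ordering of periodicities scores highly even when individual values are off by a residue or two, whereas Q2 penalizes every mislabeled residue equally.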