The influence of gapped positions in multiple sequence alignments on secondary structure prediction methods

V A Simossis¹, J Heringa

PMID: 15556476
DOI: 10.1016/j.compbiolchem.2004.09.005

Comput Biol Chem. 2004 Dec;28(5-6):351-66.

Affiliation

¹ Bioinformatics Section, Faculty of Sciences, Vrije Universiteit, De Boelelaan 1081A, 1081 HV Amsterdam, The Netherlands.

Abstract

All currently leading protein secondary structure prediction methods use a multiple protein sequence alignment to predict the secondary structure of the top sequence. In most of these methods, alignment positions that show a gap in the top sequence are deleted prior to prediction, which shrinks the alignment and discards position-specific information. In this paper we investigate the effect of this loss of information on secondary structure prediction accuracy. To this end, we have designed SymSSP, an algorithm that post-processes the predicted secondary structure of all sequences in a multiple sequence alignment by (i) making use of the alignment's evolutionary information and (ii) re-introducing most of the information that would otherwise be lost. The post-processed information is then passed to a new dynamic programming routine that produces an optimally segmented consensus secondary structure for each sequence in the multiple alignment. We have tested our method on the state-of-the-art secondary structure prediction methods PHD, PROFsec, SSPro2 and JNET using the HOMSTRAD database of reference alignments. Our consensus-deriving dynamic programming strategy consistently improves the segmentation quality of the predictions more than the commonly used majority voting technique. In addition, we have applied several weighting schemes from the literature to our novel consensus-deriving dynamic programming routine. Finally, we have investigated the level of noise that prediction errors introduce into the consensus and show that, for all four tested prediction methods, predictions at the edges of helices and strands are wrong half of the time.
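
The two baseline operations the abstract refers to, deleting alignment columns that are gapped in the top sequence and deriving a consensus by majority voting, can be illustrated with a minimal Python sketch. This is not SymSSP and not the authors' dynamic programming routine; the function names, the toy alignment, the H/E/C state alphabet, and the tie-breaking rule are all assumptions made for illustration only.

```python
from collections import Counter

GAP = "-"  # assumed gap symbol for this sketch


def drop_top_gap_columns(alignment):
    """Remove every column in which the top (first) sequence has a gap."""
    top = alignment[0]
    keep = [i for i, ch in enumerate(top) if ch != GAP]
    return ["".join(seq[i] for i in keep) for seq in alignment]


def majority_vote(predictions):
    """Per-column majority vote over aligned secondary structure strings.

    Gap characters do not vote. Ties are broken in favour of coil ('C'),
    an arbitrary choice made for this sketch.
    """
    consensus = []
    for column in zip(*predictions):
        votes = Counter(ch for ch in column if ch != GAP)
        if not votes:
            consensus.append(GAP)
            continue
        best = max(votes.values())
        winners = [state for state, n in votes.items() if n == best]
        consensus.append(winners[0] if len(winners) == 1 else "C")
    return "".join(consensus)


# Hypothetical toy alignment; the first row is the "top" sequence.
alignment = [
    "MK-LV--AG",
    "MKTLVQ-AG",
    "MR-LIQEAG",
]
trimmed = drop_top_gap_columns(alignment)

# Hypothetical per-sequence predictions, mapped back onto the alignment.
predictions = [
    "HH-HH--CC",
    "HHHHEE-CC",
    "HE-EEECCC",
]
consensus = majority_vote(predictions)

print(trimmed)
print(consensus)
```

In the toy alignment, three of the nine columns disappear once positions gapped in the top sequence are dropped, which is the shrinking the abstract describes. The vote treats each column independently, so nothing in it enforces sensible segment boundaries; that is the aspect the abstract's dynamic programming routine is said to address.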
