Predicting the accuracy of multiple sequence alignment algorithms by using computational intelligent techniques

Nucleic Acids Res. 2013 Jan 7;41(1):e26.

doi: 10.1093/nar/gks919. Epub 2012 Oct 11.

Francisco M Ortuño¹, Olga Valenzuela, Hector Pomares, Fernando Rojas, Javier P Florido, Jose M Urquiza, Ignacio Rojas

PMID: 23066102
PMCID: PMC3592395
DOI: 10.1093/nar/gks919

Francisco M Ortuño et al. Nucleic Acids Res. 2013.

Affiliation

¹ Department of Computer Architecture and Computer Technology, University of Granada, 18071 Granada, Spain. fortuno@atc.ugr.es

Abstract

Multiple sequence alignments (MSAs) are among the most studied tools in bioinformatics because they underpin tasks such as structure prediction, biological function analysis and next-generation sequencing. However, current MSA algorithms do not always provide consistent solutions, and alignment becomes increasingly difficult for sequences with low similarity. The accuracy of these algorithms is known to depend strongly on specific features of the sequences being aligned. Many MSA tools have been designed recently, but it is not possible to know in advance which one is most suitable for a particular set of sequences. In this work, we analyze some of the most widely used algorithms in the literature and their dependence on several features. A novel intelligent algorithm based on least squares support vector machines (LS-SVM) is then developed to predict how accurate each alignment could be, given those features. The algorithm is applied to a dataset of 2180 MSAs. The proposed system first estimates the accuracy of the possible alignments; the most promising methodologies are then selected to align each set of sequences. Since only one selected algorithm is run, the computational time does not increase excessively.
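The predict-then-select idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes one LS-SVM regressor per alignment tool, an RBF kernel, and arbitrary values for the hyperparameters `gamma` and `sigma`. The tool names and the three-column feature vectors are hypothetical, and the training data are synthetic. The regressor itself uses the standard LS-SVM formulation, in which training reduces to solving a single linear system.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    # Squared Euclidean distance between every row of A and every row of B.
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    # LS-SVM regression solves the linear system
    #   [0   1^T         ] [b    ]   [0]
    #   [1   K + I/gamma ] [alpha] = [y]
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return {"X": X, "b": sol[0], "alpha": sol[1:], "sigma": sigma}

def lssvm_predict(model, X_new):
    K = rbf_kernel(X_new, model["X"], model["sigma"])
    return K @ model["alpha"] + model["b"]

def select_aligner(models, features):
    # Predict an accuracy score for each tool and return the highest-scoring one.
    x = np.atleast_2d(features)
    predicted = {name: float(lssvm_predict(m, x)[0]) for name, m in models.items()}
    return max(predicted, key=predicted.get), predicted

# Synthetic demonstration: two hypothetical tools whose accuracy depends
# in opposite ways on the first feature.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(40, 3))
scores = {"tool_a": 0.9 - 0.5 * X[:, 0], "tool_b": 0.4 + 0.5 * X[:, 0]}
models = {name: lssvm_fit(X, y) for name, y in scores.items()}
best, predicted = select_aligner(models, np.array([0.1, 0.5, 0.5]))
print(best, predicted)
```

Only the tool returned by `select_aligner` would then be run on the sequences, which is why the selection step adds little computational cost compared with running every aligner.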

Figures

**Figure 1.**
PAcAlCI scheme. The architecture is divided into four modules: input dataset, feature extraction, feature selection and LS-SVM prediction.

**Figure 2.**
Evolution of the MRE as the number of features increases progressively, in ascending order of relevance. Training and test errors are shown.

**Figure 3.**
Distribution of relative errors for training and test sets. The corresponding LS-SVM prediction was performed using 10 features.

**Figure 4.**
Distribution of relative errors for training and test sets. Low accuracies were previously filtered to improve the LS-SVM prediction, avoiding prediction with high errors ().

**Figure 5.**
Intersection of suitable and predicted methodologies (Venn diagrams) corresponding to the four alignments whose accuracies are shown in Table 3.
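
The error curves described for Figure 2 can be approximated with a short sketch. It assumes that MRE stands for mean relative error, computed as the mean of |predicted − true| / |true|, which this page does not state explicitly. It also assumes that a feature ranking is already available, and it uses an ordinary least-squares model as a stand-in regressor rather than the LS-SVM of the paper. The function names are hypothetical.

```python
import numpy as np

def mean_relative_error(y_true, y_pred):
    # Assumed definition of MRE: mean of |pred - true| / |true|.
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_pred - y_true) / np.abs(y_true)))

def mre_curve(X_train, y_train, X_test, y_test, ranked_features):
    # Add features one at a time in the given ranking order and record
    # (number of features, training MRE, test MRE) at each step.
    curve = []
    for k in range(1, len(ranked_features) + 1):
        cols = list(ranked_features[:k])
        A_train = np.column_stack([X_train[:, cols], np.ones(len(X_train))])
        A_test = np.column_stack([X_test[:, cols], np.ones(len(X_test))])
        # Stand-in regressor: linear least squares with an intercept column.
        w, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
        curve.append((k,
                      mean_relative_error(y_train, A_train @ w),
                      mean_relative_error(y_test, A_test @ w)))
    return curve
```

Plotting the second and third elements of each tuple against the first gives training and test curves of the kind the caption describes.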
