Automatic extraction of reliable regions from multiple sequence alignments
- PMID: 17570868
- PMCID: PMC1892097
- DOI: 10.1186/1471-2105-8-S5-S9
Abstract
Background: High-quality multiple alignments are crucial for transferring annotation from one genome to another. Multiple alignment methods strive for ever-increasing average accuracy on benchmark sets, while the accuracy of individual alignments is often overlooked.
Results: We previously developed a method to automatically assess the accuracy and overall difficulty of multiple alignments, based on a per-residue comparison between alternative alignments of the same sequences. Here we present a key extension to this method: an algorithm that extracts similarly aligned regions from several alignments and merges them into a new consensus alignment.
Conclusion: We demonstrate that the fraction of correctly aligned residues within the resulting alignments is increased by 25-100 percent compared to the original input alignments, as only the most reliably aligned parts are considered.
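The idea of keeping only regions on which several alignments agree can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a deliberately strict rule (a column is kept only if every input alignment contains a column placing exactly the same residues together), and the function names `column_signatures` and `shared_columns` are hypothetical names chosen for this illustration.

```python
from typing import List, Optional, Tuple

# A column signature: for each sequence, the index of the residue placed in
# that column, or None where the sequence has a gap.
Column = Tuple[Optional[int], ...]


def column_signatures(alignment: List[str]) -> List[Column]:
    """Return one signature per column of a gapped alignment ('-' is a gap)."""
    counters = [0] * len(alignment)  # next residue index for each sequence
    signatures: List[Column] = []
    for col in range(len(alignment[0])):
        sig: List[Optional[int]] = []
        for row, seq in enumerate(alignment):
            if seq[col] == "-":
                sig.append(None)
            else:
                sig.append(counters[row])
                counters[row] += 1
        signatures.append(tuple(sig))
    return signatures


def shared_columns(alignments: List[List[str]]) -> List[str]:
    """Keep the columns of the first alignment that occur in all the others.

    Assumption for this sketch: agreement means an identical column
    signature in every alignment; all-gap columns are discarded.
    """
    reference = alignments[0]
    ref_sigs = column_signatures(reference)
    others = [set(column_signatures(a)) for a in alignments[1:]]
    keep = [
        i
        for i, sig in enumerate(ref_sigs)
        if any(r is not None for r in sig) and all(sig in s for s in others)
    ]
    return ["".join(seq[i] for i in keep) for seq in reference]


# Two alternative alignments of the same two sequences (ACGT and AGCT).
aln_a = ["ACG-T", "A-GCT"]
aln_b = ["ACG-T", "AG-CT"]
print(shared_columns([aln_a, aln_b]))  # ['A-T', 'ACT']
```

Only the columns on which both alignments agree survive, so the result is shorter than either input. This mirrors the trade-off described in the conclusion, where accuracy rises because only the most reliably aligned parts are considered.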
Similar articles
- Automatic assessment of alignment quality. Nucleic Acids Res. 2005 Dec 16;33(22):7120-8. doi: 10.1093/nar/gki1020. Print 2005. PMID: 16361270. Free PMC article.
- OXBench: a benchmark for evaluation of protein multiple sequence alignment accuracy. BMC Bioinformatics. 2003 Oct 10;4:47. doi: 10.1186/1471-2105-4-47. PMID: 14552658. Free PMC article.
- Evaluation of PSI-BLAST alignment accuracy in comparison to structural alignments. Protein Sci. 2000 Nov;9(11):2278-84. doi: 10.1110/ps.9.11.2278. PMID: 11152139. Free PMC article.
- [Comparative analysis of primary structure of nucleic acids and proteins]. Mol Biol (Mosk). 2004 Jan-Feb;38(1):92-103. PMID: 15042839. Review. Russian.
- Who watches the watchmen? An appraisal of benchmarks for multiple sequence alignment. Methods Mol Biol. 2014;1079:59-73. doi: 10.1007/978-1-62703-646-7_4. PMID: 24170395. Review.
Cited by
- AlignMiner: a Web-based tool for detection of divergent regions in multiple sequence alignments of conserved sequences. Algorithms Mol Biol. 2010 Jun 2;5:24. doi: 10.1186/1748-7188-5-24. PMID: 20525162. Free PMC article.
- Reproducing the manual annotation of multiple sequence alignments using a SVM classifier. Bioinformatics. 2009 Dec 1;25(23):3093-8. doi: 10.1093/bioinformatics/btp552. Epub 2009 Sep 21. PMID: 19770262. Free PMC article.
- H2r: identification of evolutionary important residues by means of an entropy based analysis of multiple sequence alignments. BMC Bioinformatics. 2008 Mar 18;9:151. doi: 10.1186/1471-2105-9-151. PMID: 18366663. Free PMC article.