Comparative assessment of methods for aligning multiple genome sequences
- PMID: 20495551
- DOI: 10.1038/nbt.1637
Abstract
Multiple sequence alignment is a difficult computational problem. There have been compelling pleas for methods to assess whole-genome multiple sequence alignments and compare the alignments produced by different tools. We assess the four ENCODE alignments, each of which aligns 28 vertebrates on 554 Mbp of total input sequence. We measure the level of agreement among the alignments and compare their coverage and accuracy. We find a disturbing lack of agreement among the alignments not only in species distant from human, but even in mouse, a well-studied model organism. Overall, the assessment shows that Pecan produces the most accurate or nearly most accurate alignment in all species and genomic location categories, while still providing coverage comparable to or better than that of the other alignments in the placental mammals. Our assessment reveals that constructing accurate whole-genome multiple sequence alignments remains a significant challenge, particularly for noncoding regions and distantly related species.
Similar articles
- How accurately is ncRNA aligned within whole-genome multiple alignments? BMC Bioinformatics. 2007 Oct 26;8:417. doi: 10.1186/1471-2105-8-417. PMID: 17963514. Free PMC article.
- Multiple whole-genome alignments without a reference organism. Genome Res. 2009 Apr;19(4):682-9. doi: 10.1101/gr.081778.108. Epub 2009 Jan 28. PMID: 19176791. Free PMC article.
- Increased alignment sensitivity improves the usage of genome alignments for comparative gene annotation. Nucleic Acids Res. 2017 Aug 21;45(14):8369-8377. doi: 10.1093/nar/gkx554. PMID: 28645144. Free PMC article.
- Differences between pair-wise and multi-sequence alignment methods affect vertebrate genome comparisons. Trends Genet. 2006 Apr;22(4):187-93. doi: 10.1016/j.tig.2006.02.005. Epub 2006 Feb 24. PMID: 16499991. Review.
- Computation and analysis of genomic multi-sequence alignments. Annu Rev Genomics Hum Genet. 2007;8:193-213. doi: 10.1146/annurev.genom.8.080706.092300. PMID: 17489682. Review.
Cited by
- Comparative analysis of the primate X-inactivation center region and reconstruction of the ancestral primate XIST locus. Genome Res. 2011 Jun;21(6):850-62. doi: 10.1101/gr.111849.110. Epub 2011 Apr 25. PMID: 21518738. Free PMC article.
- GALA: a computational framework for de novo chromosome-by-chromosome assembly with long reads. Nat Commun. 2023 Jan 13;14(1):204. doi: 10.1038/s41467-022-35670-y. PMID: 36639368. Free PMC article.
- Identifying functional single nucleotide polymorphisms in the human CArGome. Physiol Genomics. 2011 Sep 22;43(18):1038-48. doi: 10.1152/physiolgenomics.00098.2011. Epub 2011 Jul 19. PMID: 21771879. Free PMC article.
- Systematic discovery of conservation states for single-nucleotide annotation of the human genome. Commun Biol. 2019 Jul 2;2:248. doi: 10.1038/s42003-019-0488-1. eCollection 2019. PMID: 31286065. Free PMC article.
- Coordinate systems for supergenomes. Algorithms Mol Biol. 2018 Sep 24;13:15. doi: 10.1186/s13015-018-0133-4. eCollection 2018. PMID: 30258487. Free PMC article.