MAVID: constrained ancestral alignment of multiple sequences
- PMID: 15060012
- PMCID: PMC383315
- DOI: 10.1101/gr.1960404
Abstract
We describe a new global multiple-alignment program capable of aligning a large number of genomic regions. Our progressive-alignment approach incorporates the following ideas: maximum-likelihood inference of ancestral sequences, automatic guide-tree construction, protein-based anchoring of ab-initio gene predictions, and constraints derived from a global homology map of the sequences. We have implemented these ideas in the MAVID program, which is able to accurately align multiple genomic regions up to megabases long. MAVID is able to effectively align divergent sequences, as well as incomplete unfinished sequences. We demonstrate the capabilities of the program on the benchmark CFTR region, which consists of 1.8 Mb of human sequence and 20 orthologous regions in marsupials, birds, fish, and mammals. Finally, we describe two large MAVID alignments, an alignment of all the available HIV genomes and a multiple alignment of the entire human, mouse, and rat genomes.
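As an illustration of the maximum-likelihood ancestral inference mentioned above, the following is a minimal sketch of a single progressive-alignment step: given two already-aligned child sequences, it picks the most likely ancestral base for each column. It is not MAVID's implementation. The Jukes-Cantor substitution model, the treatment of gaps as missing data, the branch lengths, and all function names are assumptions made for this example.

```python
import math

BASES = "ACGT"

def jc_prob(same, t):
    """Jukes-Cantor probability of observing the same (or a specific different)
    base after a branch of length t (expected substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if same else 0.25 - 0.25 * e

def leaf_likelihood(base):
    """Partial likelihood vector for an observed character.
    Assumption: gaps and unknown characters are treated as missing data."""
    if base not in BASES:
        return [1.0] * 4
    return [1.0 if b == base else 0.0 for b in BASES]

def ancestor_likelihood(left, t_left, right, t_right):
    """One Felsenstein pruning step: combine the partial likelihood
    vectors of two children into the vector for their parent."""
    out = []
    for i in range(4):
        l = sum(jc_prob(i == j, t_left) * left[j] for j in range(4))
        r = sum(jc_prob(i == j, t_right) * right[j] for j in range(4))
        out.append(l * r)
    return out

def ml_ancestral_base(left, t_left, right, t_right):
    """Return the base that maximizes the likelihood at the parent node."""
    lik = ancestor_likelihood(left, t_left, right, t_right)
    best = max(range(4), key=lik.__getitem__)
    return BASES[best]

def ml_ancestral_sequence(aligned_a, t_a, aligned_b, t_b):
    """Infer an ancestral sequence column by column from two aligned rows."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned rows must have equal length")
    return "".join(
        ml_ancestral_base(leaf_likelihood(a), t_a, leaf_likelihood(b), t_b)
        for a, b in zip(aligned_a, aligned_b)
    )

if __name__ == "__main__":
    # Hypothetical toy input: two aligned rows with different branch lengths.
    print(ml_ancestral_sequence("AC-T", 0.1, "GCGT", 0.5))
```

In a progressive scheme, the inferred ancestor would then stand in for its two children when aligning against the next node up the guide tree. In this sketch, a mismatched column resolves toward the child on the shorter branch, and a gapped column takes the base observed in the other child.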
Web site references
- http://www.nisc.nih.gov/; NIH Intramural Sequencing Center.
- http://hiv-web.lanl.gov/; LANL HIV Databases.
- http://baboon.math.berkeley.edu/mavid/; The MAVID Web server.
- http://baboon.math.berkeley.edu/mavid/data/; Supplemental Data.
- http://hanuman.math.berkeley.edu/kbrowser/; K-BROWSER.