BMC Bioinformatics. 2006 Oct 5;7:433.
doi: 10.1186/1471-2105-7-433.

M-GCAT: interactively and efficiently constructing large-scale multiple genome comparison frameworks in closely related species

Todd J Treangen et al. BMC Bioinformatics. 2006.

Abstract

Background: Due to recent advances in whole genome shotgun sequencing and assembly technologies, the financial cost of decoding an organism's DNA has been drastically reduced, resulting in an explosion of genomic sequencing projects. This increase in related genomic data will allow for in-depth studies of evolution in closely related species through multiple whole genome comparisons.

Results: To facilitate such comparisons, we present an interactive multiple genome comparison and alignment tool, M-GCAT, that can efficiently construct multiple genome comparison frameworks in closely related species. M-GCAT is able to compare and identify highly conserved regions in up to 20 closely related bacterial species in minutes on a standard computer, and as many as 90 (containing 75 cloned genomes from a set of 15 published enterobacterial genomes) in an hour. M-GCAT also incorporates a novel comparative genomics data visualization interface allowing the user to globally and locally examine and inspect the conserved regions and gene annotations.

Conclusion: M-GCAT is an interactive comparative genomics tool well suited for quickly generating multiple genome comparison frameworks and alignments among closely related species. M-GCAT is freely available for download for academic and non-commercial use at: http://alggen.lsi.upc.es/recerca/align/mgcat/intro-mgcat.html.

Figures

Figure 1
An approximate phylogeny of genome comparison tools over the past 30 years. Tracing the growth in related global genome comparison tools over the past 30 years.
Figure 2
The M-GCAT parameter page. The M-GCAT user interface parameter page. When M-GCAT is started, this is displayed to allow the user to select the input sequences, modify the main parameters, and load previously saved M-GCAT comparisons.
Figure 3
The M-GCAT genome comparison workspace. The M-GCAT genome comparison workspace showing the multi-MUMs, multi-MUM clusters, global multiple alignment, gene map, and an orthologous gene between four complete bacterial sequences: Yersinia pestis biovar Mediaevalis, Yersinia pestis CO92, and Yersinia pseudotuberculosis IP32953. The visual results for this comparison quickly show that these sequences are highly similar: except for a few smaller regions, there is high sequence identity across all genomes. The green vertical rectangles represent multi-MUM clusters, and the inverted green vertical rectangles indicate regions containing large-scale rearrangements. The highlighted (light green) multi-MUM cluster is an example of a region that was aligned among all genomes. In the gene map window, genes are drawn as horizontal rectangles, and all genes annotated in the corresponding PTT file are displayed. The genes are color coded by function, and a legend is provided at the bottom for quick reference when analyzing the genomes. The vertical lines between the genes represent the multi-MUMs found during comparison.
Figure 4
Multiple genome comparison framework for 15 microbial genomes. The M-GCAT results of a comparison showing the global alignment framework constructed for the 15 enterobacterial genomes used in sequence set 19. There are 1218 multi-MUM clusters displayed, covering approximately 54.9% of the total genomic sequence. The region highlighted in green and indicated with the black arrow is one of the 1218 regions found to be highly conserved among the 15 closely related species.
Figure 5
Analysis of multiple genome comparison framework efficiency and memory usage. This experiment was run exclusively on a 2 GHz Pentium M processor with 2 GB of main memory, running Windows XP Professional. Memory usage is reported as the peak memory usage during the comparison. Time is reported as total CPU time.
