Nucleic Acids Res. 2006 Jul 19;34(12):3533-45.
doi: 10.1093/nar/gkl471. Print 2006.

AGMIAL: implementing an annotation strategy for prokaryote genomes as a distributed system

K Bryson et al. Nucleic Acids Res. 2006.

Abstract

We have implemented a genome annotation system for prokaryotes called AGMIAL. Our approach embodies a number of key principles. First, expert manual annotators are seen as a critical component of the overall system; user interfaces were cyclically refined to satisfy their needs. Second, the overall process should be orchestrated in terms of a global annotation strategy; this facilitates coordination between a team of annotators and automatic data analysis. Third, the annotation strategy should allow progressive and incremental annotation from a time when only a few draft contigs are available, to when a final finished assembly is produced. The overall architecture employed is modular and extensible, being based on the W3 standard Web services framework. Specialized modules interact with two independent core modules that are used to annotate, respectively, genomic and protein sequences. AGMIAL is currently being used by several INRA laboratories to analyze genomes of bacteria relevant to the food-processing industry, and is distributed under an open source license.

Figures

Figure 1
Views of the DNA sequence at different scales. The upper part of the figure represents an atlas view of the genome obtained with CGView. One can zoom in on a particular region of this map, for instance on the area containing the mgsA gene, which encodes a methylglyoxal synthase that belongs to the glycolytic pathway. Clicking on this gene opens the MuGeN interface showing its genomic context (genome map frame). It is possible to zoom in on this representation to see the DNA sequence and its translation in the six reading frames (sequence frame). The green symbol represents the ribosome binding site (RBS); the gene sequence is colored as in the previous view. The navigation window allows one to move along the genome by entering a range of base numbers, by looking for a feature with a particular qualifier, or by specifying a DNA or protein motif to be searched for in the current window or in the complete genome. The window at the lower left of the figure shows the gene editor. Most fields are filled automatically, in particular the gene annotation qualifiers, since CDS annotation is generally performed in PAM and then updated in CAM (see Figure 4). Clicking on the link to PAM, indicated by the magnifying glass, leads the annotator to the PAM interface shown in Figure 2. For clarity, the Artemis interface is not shown in the figure.
Figure 2
The right part of the figure shows the results of the different bioinformatic methods applied to the sequence of MgsA. Not all result sections are shown here. In the ‘homology’ section, checking the boxes to the left of the homologous sequences and then clicking on the link to Jalview, below, shows the multiple alignment of the selected sequences and the corresponding phylogenetic tree. The left part of the figure shows the annotation window where the protein annotation is performed. Information entered in this section is forwarded to CAM; the system always ensures that both managers are synchronized. The bottom of this window shows the annotation history. The link to CAM at the top leads the annotator back to the CAM interface (MuGeN interface, see Figure 1). The link to PAREO (our relational version of the KEGG database) near the ‘EC number’ box leads the user to the KEGG interface shown in Figure 3.
Figure 3
This figure shows both glycolysis and pyruvate metabolism pathways for Lactobacillus plantarum and L.sakei. As indicated in the legend at the top of the figure, enzymes found only in L.plantarum are colored in red and enzymes found only in L.sakei are colored in purple. Enzymes found in both organisms are colored in green. The magnifying glasses indicate the role of MgsA in these pathways. This enzyme appears to be involved in a methylglyoxal bypass (reversible reaction) of glycolysis in L.sakei. The figure clearly illustrates the major difference in glycolysis between L.plantarum and L.sakei. The bottom right box shows the product of the reaction catalyzed by MgsA. A detailed account of the L.sakei energy production pathways contributing to meat adaptation can be found in ref. (38).
Figure 4
Dashed arrows represent automatic processes between managers; solid arrows represent human interaction with the managers. Graphic interfaces are described in Figures 1 and 2.
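
The sequence frame described in the Figure 1 caption displays the DNA sequence together with its translation in the six reading frames. As an illustration of that idea only, the sketch below computes a six-frame translation in Python. It is not taken from AGMIAL or MuGeN: the function names are hypothetical, the standard genetic code is assumed, and incomplete trailing codons are simply dropped.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
# Assumption: the standard code is adequate for this illustration.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from position 0; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna: str) -> dict:
    """Map frame labels '+1'..'+3' and '-1'..'-3' to protein strings."""
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for label, protein in six_frame_translation("ATGGCCTAA").items():
        print(label, protein)
```

Stop codons appear as `*` in the output, so open reading frames can be spotted by eye as runs between asterisks, much as in a six-frame sequence display.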
