Brief Bioinform. 2013 Jan;14(1):1-12.
doi: 10.1093/bib/bbs007. Epub 2012 Mar 9.

The automatic annotation of bacterial genomes

Emily J Richardson et al. Brief Bioinform. 2013 Jan.

Abstract

With the development of ultra-high-throughput technologies, the cost of sequencing bacterial genomes has been vastly reduced. As more genomes are sequenced, less time can be spent manually annotating those genomes, resulting in an increased reliance on automatic annotation pipelines. However, automatic pipelines can produce inaccurate genome annotation, and their results often require manual curation. Here, we discuss the automatic and manual annotation of bacterial genomes, identify common problems introduced by the current genome annotation process and suggest potential solutions.

Figures

Figure 1: A generic process for bacterial genome annotation.
Figure 2: The six different models present across 17 RefSeq entries for Salmonella species for the eutM/eutN locus. Green indicates normal gene/CDS features, lighter grey indicates gene features annotated as pseudogenes. (A) A single intact gene of 690 bp; (B) a single pseudogene of 690 bp; (C) two short intact genes ∼300 bp in length; (D) one pseudogene and one intact gene, each ∼300 bp in length; (E) two pseudogenes, each 300 bp in length; and (F) two intact genes with the order reversed.
Figure 3: A syntenic block of genes showing inconsistent gene name annotations in E. coli K12 MG1655 and E. coli O157:H7 Sakai.
Figure 4: A diagram displaying the processes that can lead to, and define, orthologs and paralogs. Gene duplication and speciation events create complex evolutionary relationships between genes.

