Genome Biol. 2007;8(1):102.
doi: 10.1186/gb-2007-8-1-102.

Genome re-annotation: a wiki solution?

Steven L Salzberg. Genome Biol. 2007.

Abstract

The annotation of most genomes becomes outdated over time, owing in part to our ever-improving knowledge of genomes and in part to improvements in bioinformatics software. Unfortunately, annotation is rarely if ever updated and resources to support routine reannotation are scarce. Wiki software, which would allow many scientists to edit each genome's annotation, offers one possible solution.

Figures

Figure 1
Overview of sequencing and annotation for a whole-genome shotgun project, for example, sequencing a bacterial genome. First (a), genomic DNA is purified, broken into short fragments and cloned into E. coli. The cloned fragments are then sequenced from both ends on an automated sequencing machine. The resulting sequences (shown in (b) as they appear on the sequencing machine display) are then assembled using a complex software program that identifies overlaps into (c) large, contiguous sequences representing the chromosomes from the original DNA. Gaps are filled until the genome is complete. (d) Annotation begins with the execution of several gene-finding programs, such as Glimmer, which identifies protein-coding genes, tRNAScan, which identifies tRNAs, and other programs for other genome features. (e) These initial predictions are used as the basis for BLAST searches against large protein databases, which identify related proteins based on sequence similarity. Translated (Blastx) searches are then used to scan the databases to detect any proteins that match the DNA regions in between predicted genes. Customized annotation programs are used to decide what name and function to assign to each protein, leading to (f) the final annotated genome.
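
Step (e) of the caption mentions searching the DNA regions in between predicted genes. As a rough illustration of how those regions could be derived from gene-finder output, here is a minimal Python sketch. It is not the pipeline's actual code: the function name `intergenic_regions`, the 1-based inclusive coordinate convention, and the `min_length` cutoff are all assumptions made for this example, and the chromosome is treated as linear for simplicity.

```python
from typing import List, Tuple

# Hypothetical convention for this sketch: 1-based, inclusive (start, end).
Interval = Tuple[int, int]


def intergenic_regions(
    genes: List[Interval], genome_length: int, min_length: int = 1
) -> List[Interval]:
    """Return the stretches of a linear sequence not covered by any gene.

    Overlapping or nested gene predictions are handled by tracking the
    furthest position covered so far.
    """
    regions: List[Interval] = []
    cursor = 1  # first position not yet covered by a gene
    for start, end in sorted(genes):
        # A gap exists if this gene starts beyond the covered prefix.
        if start - cursor >= min_length:
            regions.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    # Trailing region after the last gene, if long enough.
    if genome_length - cursor + 1 >= min_length:
        regions.append((cursor, genome_length))
    return regions


if __name__ == "__main__":
    predicted = [(100, 200), (150, 300), (500, 600)]  # made-up coordinates
    for region in intergenic_regions(predicted, genome_length=1000):
        print(region)
```

Each returned interval could then be extracted from the genome sequence and submitted to a translated search. A real bacterial chromosome is usually circular, so a production version would also need to join the first and last regions across the origin.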
