Genome re-annotation: a wiki solution?

Steven L Salzberg¹

Affiliation

¹ Center for Bioinformatics and Computational Biology and Department of Computer Science, 3125 Biomolecular Sciences Building, University of Maryland, College Park, MD 20742, USA. salzberg@umiacs.umd.edu.

PMID: 17274839
PMCID: PMC1839116
DOI: 10.1186/gb-2007-8-1-102

Steven L Salzberg. Genome Biol. 2007;8(1):102. doi: 10.1186/gb-2007-8-1-102.

Abstract

The annotation of most genomes becomes outdated over time, owing in part to our ever-improving knowledge of genomes and in part to improvements in bioinformatics software. Unfortunately, annotation is rarely if ever updated and resources to support routine reannotation are scarce. Wiki software, which would allow many scientists to edit each genome's annotation, offers one possible solution.

Figures

**Figure 1**
Overview of sequencing and annotation for a whole-genome shotgun project, for example, sequencing a bacterial genome. First **(a)**, genomic DNA is purified, broken into short fragments and cloned into *E. coli*. The cloned fragments are then sequenced from both ends on an automated sequencing machine. The resulting sequences (shown in **(b)** as they appear on the sequencing machine display) are then assembled using a complex software program that identifies overlaps into **(c)** large, contiguous sequences representing the chromosomes from the original DNA. Gaps are filled until the genome is complete. **(d)** Annotation begins with the execution of several gene-finding programs, such as Glimmer, which identifies protein-coding genes, tRNAScan, which identifies tRNAs, and other programs for other genome features. **(e)** These initial predictions are used as the basis for BLAST searches against large protein databases, which identify related proteins based on sequence similarity. Translated (Blastx) searches are then used to scan the databases to detect any proteins that match the DNA regions in between predicted genes. Customized annotation programs are used to decide what name and function to assign to each protein, leading to **(f)** the final annotated genome.
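The overlap-based assembly step in panels (b) and (c) can be illustrated with a minimal sketch. It is a toy, not the assembler used in any real project: it assumes error-free reads on a single strand, and the function names and the `min_overlap` parameter are hypothetical choices made for this example only.

```python
from typing import Optional


def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Return the length of the longest suffix of `left` that is also a
    prefix of `right`, or 0 if it is shorter than `min_overlap`."""
    max_len = min(len(left), len(right))
    # Try the longest candidate overlap first and shrink until one matches.
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge_reads(left: str, right: str, min_overlap: int = 3) -> Optional[str]:
    """Join two reads into one contiguous sequence if they overlap;
    return None when no sufficiently long overlap exists."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        return None
    # Keep `left` whole and append only the non-overlapping tail of `right`.
    return left + right[size:]


if __name__ == "__main__":
    # Two hypothetical reads sharing the four bases "GTAC".
    print(merge_reads("ATGCGTAC", "GTACCTGA"))
```

Real assemblers must also handle sequencing errors, reverse-complement reads, repeats, and the paired-end information mentioned in the caption, none of which this sketch attempts.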
