Genome Biol. 2001;2(1):INTERACTIONS0001.
doi: 10.1186/gb-2000-2-1-interactions0001. Epub 2000 Dec 29.

Genome sequences and great expectations

I Iliopoulos et al. Genome Biol. 2001.

Abstract

To assess how automatic function assignment will contribute to genome annotation in the next five years, we have performed an analysis of 31 available genome sequences. An emerging pattern is that function can be predicted for almost two-thirds of the 73,500 genes that were analyzed. Despite progress in computational biology, there will always be a great need for large-scale experimental determination of protein function.

Figures

Figure 1
A summary of the annotation levels for 31 genomes. Annotations for all genomes (73,500 unique genes and 134,000 annotations in total, an approximately twofold annotation coverage) are available on the World Wide Web at the European Bioinformatics Institute Computational Genomics Group Services page [15], under 'GeneQuiz'. The total computation required 2,400 CPU-hours on a 16-CPU SGI Power Challenge and 68 GB of storage. Results for other genomes will be made available at the same URL as they are completed.

(a) Information snapshot for 31 entire genomes and a eukaryotic chromosome (Plasmodium falciparum, chromosome 2). For species (and strain) name abbreviations, please refer to the website [15]. Bacteria are shown in black, Archaea in red and Eukarya in blue. Percentages are shown for proteins with homologs of known structure (pink) or function (blue), hypothetical proteins (dark brown) and unique proteins (light brown). Species are sorted according to the sum of structure and function information; the horizontal line represents the average of known/predicted functions across species. Diamonds (bottom panel) represent the percentage increase in new findings over the original (or public database) annotations (except for Drosophila melanogaster, for which such a comparison is not currently possible). The range of this percentage, from 0 to 20, is indicated in brackets.

(b) An 'information clock' for the genome of Haemophilus influenzae, showing the relative levels of annotation over time, reflecting a general increase of information in the public databases. Colours are used as in (a).

References

    1. Butler D, Smaglik P. Draft data leave geneticists with a mountain still to climb. Nature. 2000;405:984–985. doi: 10.1038/35016703.
    2. Fleischmann RD, Adams MD, White O, Clayton RA, Kirkness EF, Kerlavage AR, Bult CJ, Tomb JF, Dougherty BA, Merrick JM, et al. Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science. 1995;269:496–512.
    3. Casari G, Andrade MA, Bork P, Boyle J, Daruvar A, Ouzounis C, Schneider R, Tamames J, Valencia A, Sander C. Challenging times for bioinformatics. Nature. 1995;376:647–648.
    4. Kyrpides NC. Genomes OnLine Database (GOLD 1.0): a monitor of complete and ongoing genome projects worldwide. Bioinformatics. 1999;15:773–774. doi: 10.1093/bioinformatics/15.9.773.
    5. Andrade MA, Brown NP, Leroy C, Hoersch S, de Daruvar A, Reich C, Franchini A, Tamames J, Valencia A, Ouzounis C, Sander C. Automated genome sequence analysis and annotation. Bioinformatics. 1999;15:391–412. doi: 10.1093/bioinformatics/15.5.391.
