Nucleic Acids Res. 2000 Jan 1;28(1):60-4.
doi: 10.1093/nar/28.1.60.

EcoGene: a genome sequence database for Escherichia coli K-12

K E Rudd. Nucleic Acids Res.

Abstract

The EcoGene database provides a set of gene and protein sequences derived from the genome sequence of Escherichia coli K-12. EcoGene is a source of re-annotated sequences for the SWISS-PROT and Colibri databases. EcoGene is used for genetic and physical map compilations in collaboration with the Coli Genetic Stock Center. The EcoGene12 release includes 4293 genes. EcoGene12 differs from the GenBank annotation of the complete genome sequence in several ways, including (i) the revision of 706 predicted or confirmed gene start sites, (ii) the correction or hypothetical reconstruction of 61 frameshifts caused by either sequence error or mutation, (iii) the reconstruction of 14 protein sequences interrupted by the insertion of IS elements, and (iv) predictions that 92 genes are partially deleted gene fragments. A literature survey identified 717 proteins whose N-terminal amino acids have been verified by sequencing. A total of 12 446 cross-references to 6835 literature citations are provided. EcoGene is accessible at a new website: http://bmb.med.miami.edu/EcoGene/EcoWeb. Users can search and retrieve individual EcoGene GenePages, or they can download large datasets for incorporation into database management systems, facilitating various genome-scale computational and functional analyses.

Figures

Figure 1
Three tables of EcoGene data. (A) XREF12 contains cross-references of EcoGene EG accession numbers (EcoGene) and gene names (GN) to E.coli gene and protein record accession numbers in the SWISS-PROT, Coli Genetic Stock Center (CGSC) and GenBank databases, as well as to the University of Wisconsin ‘b’ numbers. (B) EGMAP12 contains genomic sequence and map locations for E.coli genes. EcoGene, EG accession numbers; GN, gene name; ORI, orientation of transcription; LeftEnd, the counterclockwise end of a gene; RightEnd, the clockwise end of a gene; CS, the centisome (= % = minute) map position of a gene, derived by dividing the LeftEnd basepair position by the length of the genome sequence (4 639 221 bp). (C) EGMAIN12 contains descriptive information about E.coli genes: the primary gene name (GN), the gene name mnemonic (MN), the gene description (GD), the gene type (GT; PROT or RNA), the gene product sequence length (LEN) and gene quality (GQ) fields.
Figure 2
The EcoWeb GenePage for the E.coli ptsA gene. The GenePage contains descriptive and genomic position information, hyperlinks to DNA and protein sequences and hyperlinks to gene records in other databases. The Gene Quality information indicates that a frameshift sequencing error has been corrected and that the N-terminus of the protein has been extended by 313 amino acids. The inset shows the Bibliography page for ptsA.
Figure 3
A portion of the EcoMap12 Adobe Acrobat PDF format genome map file. The format is identical to that from EcoMap10 in edition 10 of the E.coli linkage map (8). DNA sequence-derived restriction sites (top line to bottom line) are BamHI, HindIII, EcoRI, EcoRV, BglI, KpnI, PstI and PvuII. Also depicted are kilobase coordinates, centisomes, Kohara clone map positions, GenBank MG1655 genome sequence record alignments, gene positions and gene orientations.
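
The centisome (CS) field described in the Figure 1 caption can be illustrated with a short calculation. The following is a minimal Python sketch, not code from EcoGene itself: the function name `centisome` and the example coordinate are hypothetical, and the multiplication by 100 is an assumption based on the caption's statement that a centisome equals a percentage of the genome length (4 639 221 bp).

```python
# Genome sequence length in base pairs, as stated in the Figure 1 caption.
GENOME_LENGTH_BP = 4_639_221


def centisome(left_end_bp: int, genome_length_bp: int = GENOME_LENGTH_BP) -> float:
    """Return a map position in centisomes for a LeftEnd coordinate.

    Assumption: the position is LeftEnd divided by the genome length,
    scaled to a percentage (0-100), since the caption equates
    centisomes with percent.
    """
    if not 1 <= left_end_bp <= genome_length_bp:
        raise ValueError("LeftEnd must lie within the genome sequence")
    return 100.0 * left_end_bp / genome_length_bp


if __name__ == "__main__":
    # Hypothetical LeftEnd value, chosen only for illustration.
    example_left_end = 1_000_000
    print(f"LeftEnd {example_left_end} bp -> {centisome(example_left_end):.2f} CS")
```

Because the position is computed from LeftEnd alone, it depends only on the counterclockwise end of the gene, not on the direction of transcription.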

References

    1. Berlyn,M.B., Low,K.B. and Rudd,K.E. (1996) In Neidhardt,F.C., Curtiss,R., Ingraham,J.L., Lin,E.C.C., Low,K.B., Magasanik,B., Reznikoff,W.S., Riley,M., Schaechter,M. and Umbarger,H.E. (eds), Escherichia coli and Salmonella: Cellular and Molecular Biology. ASM Press, Washington, DC, Vol. 2, pp. 1715–1902.
    2. Rudd,K.E. (1993) ASM News, 59, 335–341.
    3. Rudd,K.E., Miller,W., Werner,C., Ostell,J., Tolstoshev,C. and Satterfield,S.G. (1991) Nucleic Acids Res., 19, 637–647.
    4. Borodovsky,M., Koonin,E.V. and Rudd,K.E. (1994) Trends Biochem. Sci., 19, 309–313.
    5. Rudd,K.E. and Schneider,T.D. (1992) In Miller,J. (ed.), A Short Course in Bacterial Genetics; a Laboratory Manual and Handbook for Escherichia coli and Related Bacteria. Cold Spring Harbor Press, Cold Spring Harbor, NY, pp. 17.19–17.45.
