Biochimie. 1985 May;67(5):437-43.
doi: 10.1016/s0300-9084(85)80261-3.

Computer generation and statistical analysis of a data bank of protein sequences translated from GenBank

J M Claverie et al. Biochimie. 1985 May.

Abstract

We describe PGtrans, a new and freely available protein sequence data bank (2625 sequences, 554198 amino acids). This data bank is routinely produced by automatic computer translation of the nucleotide sequence library GenBank. The information needed for the translation process (transcriptional orientation, location of coding regions, splice sites, and pertinent genetic code) is gathered by the translation program through an "intelligent" scanning of the documentary field of each GenBank entry. Inconsistencies resulting in unexpected termination codons are detected and reported, thus allowing the correction of data bank errors. PGtrans is intended as a tool for protein similarity searches. Its reasonable overall size (2 megabytes) makes it suitable for microcomputer environments. Up-to-date amino acid composition data and relative abundances of di-, tri-, and tetrapeptides in proteins of known sequence are presented and discussed.
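
The two computational steps named in the abstract, translating a coding region while flagging unexpected termination codons and tallying short peptide frequencies, can be illustrated with a minimal Python sketch. This is not the PGtrans program: it assumes the standard genetic code only, takes an already extracted and spliced coding sequence as input, and the function names `translate` and `kpeptide_counts` are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, laid out in the conventional T, C, A, G codon order.
# Assumption: PGtrans also handled other genetic codes; this sketch does not.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence in frame 0.

    Returns (protein, internal_stops), where internal_stops lists the
    codon indices of termination codons found before the last codon.
    A stop at the last codon is treated as the expected terminator.
    Unknown codons (for example, containing N) become 'X'.
    """
    cds = cds.upper().replace("U", "T")
    n_codons = len(cds) // 3
    protein = []
    internal_stops = []
    for i in range(n_codons):
        aa = CODON_TABLE.get(cds[3 * i:3 * i + 3], "X")
        if aa == "*":
            if i < n_codons - 1:
                internal_stops.append(i)  # unexpected: report for curation
            continue
        protein.append(aa)
    return "".join(protein), internal_stops


def kpeptide_counts(proteins, k):
    """Count every overlapping k-residue peptide across a set of proteins."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


if __name__ == "__main__":
    protein, stops = translate("ATGGCCATGGCCTAA")
    print(protein, stops)
    print(kpeptide_counts([protein], 2).most_common())
```

With `k=1` the same counter gives amino acid composition; relative abundances follow by dividing each count by the total number of windows.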
