Nucleic Acids Res. 2005 Jun 10;33(10):3390-400.
doi: 10.1093/nar/gki615. Print 2005.

Protein length in eukaryotic and prokaryotic proteomes

Luciano Brocchieri et al. Nucleic Acids Res.

Abstract

We analyzed length differences of eukaryotic, bacterial and archaeal proteins in relation to function, conservation and environmental factors. Comparing Eukaryotes and Prokaryotes, we found that the greater length of eukaryotic proteins is pervasive over all functional categories and involves the vast majority of protein families. The magnitude of these differences suggests that the evolution of eukaryotic proteins was influenced by processes of fusion of single-function proteins into extended multi-functional and multi-domain proteins. Comparing Bacteria and Archaea, we determined that the small but significant length difference observed between their proteins results from a combination of three factors: (i) bacterial proteomes include a greater proportion than archaeal proteomes of longer proteins involved in metabolism or cellular processes, (ii) within most functional classes, protein families unique to Bacteria are generally longer than protein families unique to Archaea and (iii) within the same protein family, homologs from Bacteria tend to be longer than the corresponding homologs from Archaea. These differences are interpreted with respect to evolutionary trends and prevailing environmental conditions within the two prokaryotic groups.

Figures

Figure 1
Relative median length of proteins within major functional classes in Eukaryotes (Euk), Bacteria (Bac) and Archaea (Arc). Lengths are normalized by the global median length within each phylum. Major functional classes follow the definition in COG (see also Table 3): Isp, information storage and processes; Cp, cellular processes; Me, metabolism; Pc, poorly characterized. N.C. signifies proteins not classified in the COG database.
Figure 2
Representation in eukaryotic and prokaryotic proteomes of proteins belonging to the major functional classes. Isp, information storage and processes; Cp, cellular processes; Me, metabolism; Pc, poorly characterized. N.C. signifies proteins not classified in the COG database.
Figure 3
Relation between the median length of genomic proteins included in the Pfam-A database of curated alignments and the optimal growth temperature (OGT) of the corresponding organism. Each point represents the median protein length within one bacterial (red) or archaeal (green) species.
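
The normalization described in the Figure 1 caption can be illustrated with a small sketch. It assumes that "relative median length" means the median length within a functional class divided by the global median length of the same group; the paper's exact procedure may differ. The function name `relative_median_lengths` and the toy lengths are hypothetical and are not data from the study.

```python
from statistics import median


def relative_median_lengths(proteins):
    """Return {functional_class: class median length / global median length}.

    `proteins` is an iterable of (functional_class, length) pairs for a single
    group (e.g. one domain of life). Assumed definition, not the paper's code.
    """
    proteins = list(proteins)
    # Global median over all proteins in the group, regardless of class.
    global_median = median(length for _, length in proteins)

    # Collect lengths per functional class.
    by_class = {}
    for functional_class, length in proteins:
        by_class.setdefault(functional_class, []).append(length)

    # Normalize each class median by the global median.
    return {
        functional_class: median(lengths) / global_median
        for functional_class, lengths in by_class.items()
    }


# Made-up lengths (in residues), labeled with the COG major-class abbreviations.
toy = [("Isp", 200), ("Isp", 300), ("Me", 400), ("Me", 500), ("Cp", 350)]
rel = relative_median_lengths(toy)
```

A value above 1 indicates a class whose proteins are typically longer than the group's overall median, and a value below 1 a class whose proteins are typically shorter.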
