Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
Comparative Study
. 1997 Dec 12;274(4):562-76.
doi: 10.1006/jmbi.1997.1412.

A structural census of genomes: comparing bacterial, eukaryotic, and archaeal genomes in terms of protein structure

Affiliations
Free article
Comparative Study

A structural census of genomes: comparing bacterial, eukaryotic, and archaeal genomes in terms of protein structure

M Gerstein. J Mol Biol. .
Free article

Abstract

Representative genomes from each of the three kingdoms of life are compared in terms of protein structure, in particular, those of Haemophilus influenzae (a bacteria), Methanococcus jannaschii (an archaeon), and yeast (a eukaryote). The comparison is in the form of a census (or comprehensive accounting) of the relative occurrence of secondary and tertiary structures in the genomes, which particular emphasis on patterns of supersecondary structure. Comparison of secondary structure shows that the three genomes have nearly the same overall secondary-structure content, although they differ markedly in amino acid composition. Comparison of super-secondary structure, using a novel "frequent-words" approach, shows that yeast has a preponderance of consecutive strands (e.g. beta-beta-beta patterns), Haemophilus, consecutive helices (alpha-alpha-alpha), and Methanococcus, alternating helix-strand structures (beta-alpha-beta). Yeast also has significantly more helical membrane proteins than the other two genomes, with most of the differences concentrated in proteins containing two transmembrane segments. Comparison of tertiary structure (by sequence matching and domain-level clustering) highlights the substantial duplication in each genome (approximately 30% to 50%), with the degree of duplication following similar patterns in all three. Many sequence families are shared among the genomes, with the degree of overlap between any two genomes being roughly similar. In total, the three genomes contain 148 of the approximately 300 known protein folds. Forty-five of these 148 that are present in all three genomes are especially enriched in mixed super-secondary structures (alpha/beta). Moreover, the five most common of these 45 (the "top-5") have a remarkably similar super-secondary structure architecture, containing a central sheet of parallel strands with helices packed onto at least one face and beta-alpha-beta connections between adjacent strands. These most basic molecular parts, which, presumably, were present in the last common ancestor to the three Kingdoms, include the TIM-barrel, Rossmann, flavodoxin, thiamin-binding, and P-loop-hydrolase folds.

PubMed Disclaimer

Publication types

LinkOut - more resources