GC content-independent amino acid patterns in bacteria and archaea
- PMID: 21780150
- DOI: 10.1002/jobm.201100067
Abstract
Every organism can be characterized by the amino acid composition of its proteome. Until now, it has been assumed that these compositions are determined by the GC content of the DNA or, in some cases, by extreme lifestyles such as thermophily or halophily. Here, we focused our analysis on eight amino acids, each of which is encoded by both GC-rich and AT-rich codons, to identify finer amino acid patterns beyond the GC dominance. We analyzed the amino acid composition of the conceptually translated proteomes of 1029 bacterial and archaeal strains with sequenced genomes. Using correspondence analysis, we found that phylogenetic groups within bacteria and archaea can generally be discriminated from other groups by their amino acid composition. In some cases, single organisms, e.g. Treponema pallidum strains or Mycoplasma penetrans, are characterized by extreme amino acid compositions. We assume that our data could provide a basis for a new approach to analyzing the evolution of bacterial and archaeal groups. Furthermore, for single organisms, detailed knowledge of the amino acid composition of the entire proteome encoded in the genome could lead to a better understanding, important for pharmaceutical or biotechnological applications. We recommend that information about amino acid compositions be provided in databases, comparable to the GC content of genomes.
Copyright © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
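As an illustration of the quantity the abstract refers to, the amino acid composition of a proteome can be computed as the relative frequency of each residue across all translated protein sequences. The following is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `amino_acid_composition` and the toy sequences are hypothetical, and the sketch covers all 20 standard amino acids rather than the eight analyzed in the study, which the abstract does not list.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(proteins):
    """Return the relative frequency of each standard amino acid.

    `proteins` is an iterable of protein sequences in one-letter code.
    Characters outside the 20 standard codes (stop markers such as '*',
    ambiguity codes such as 'X') are ignored.
    """
    counts = Counter()
    for seq in proteins:
        counts.update(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


# Hypothetical toy "proteome" for illustration only, not real data.
toy_proteome = ["MKKLA", "MGGA*"]
composition = amino_acid_composition(toy_proteome)
```

Applied to every conceptually translated proteome in a collection, such per-organism frequency vectors would form the rows of the table that a multivariate method like correspondence analysis takes as input.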