J Mol Biol. 1997 May 23;268(5):857-68.
doi: 10.1006/jmbi.1997.1003.

Protein evolution viewed through Escherichia coli protein sequences: introducing the notion of a structural segment of homology, the module

M Riley et al. J Mol Biol.

Abstract

Paralogous genes are genes that descend from a single progenitor gene that duplicated in an ancestor, each copy having diverged prior to speciation. With comprehensive information available on the functions of Escherichia coli proteins, analysis of sequence-related E. coli paralogous proteins can give information on the early ancestors of protein families now residing in many contemporary organisms, such as the enzymes of metabolism, some kinds of transport mechanisms and some kinds of regulatory mechanisms. In the first step, we have confirmed that E. coli contains a very high proportion of paralogous proteins. Next, we have defined two main classes of paralogous proteins. One class is formed of proteins which contain a unique structural segment homologous to a single set of related proteins. The other class corresponds to proteins which contain more than one structural segment of homology, each segment homologous to a different, unrelated set of proteins. We define such an independent structural segment of homology as a module. This modular structure (mean length equivalent to 209 amino acids) often corresponds to entire proteins, but there are also proteins that appear to be assembled from two or three modules having independent origins. Most multimodular proteins appear to have been formed early in their history; a minority appear to be relatively recent fusions of independent modules. Examining 1404 independent structural segments of homology, composed of both modules and entire proteins, we found that the segments of homology fell into 352 sequence-related groups or families. The majority of these families (ranging from 2 to 62 members) are functionally homogeneous. This strongly suggests that the 1404 present-day modules and proteins derive from a minimal set of 352 ancestral modules, each one being already of the same size and having a function similar to all members of its progeny.
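The grouping of segments of homology into sequence-related families can be illustrated with a minimal sketch. The abstract does not state how the families were assembled, so the single-linkage (union-find) grouping below is an assumption, as are the function name `group_into_families` and the toy segment identifiers; the pairwise homology links are taken as given rather than computed from sequences.

```python
from collections import defaultdict


def group_into_families(segments, homologous_pairs):
    """Group segments into families by single linkage (an assumed method).

    segments: iterable of hypothetical segment identifiers.
    homologous_pairs: iterable of (a, b) pairs already judged homologous.
    Returns a list of sets, one set per family.
    """
    parent = {s: s for s in segments}

    def find(x):
        # Follow parent links to the family representative,
        # shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in homologous_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two families

    families = defaultdict(set)
    for s in segments:
        families[find(s)].add(s)
    return list(families.values())


# Toy data (hypothetical): protein C carries two modules, C.1 and C.2,
# each homologous to a different, unrelated set of proteins.
segments = ["A.1", "B.1", "C.1", "C.2", "D.1"]
pairs = [("A.1", "B.1"), ("B.1", "C.1"), ("C.2", "D.1")]

families = group_into_families(segments, pairs)
family_sizes = sorted(len(f) for f in families)

# Mean family size implied by the figures in the abstract.
mean_family_size = 1404 / 352
```

In the toy data, the two modules of protein C fall into separate families, which is the situation the abstract describes for multimodular proteins. The figures reported in the abstract (1404 segments in 352 families) imply a mean of roughly four segments per family.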
