Comparative Study
J Struct Funct Genomics. 2003;4(2-3):67-78.
doi: 10.1023/a:1026113408773.

Multi-domain protein families and domain pairs: comparison with known structures and a random model of domain recombination

Gordana Apic et al. J Struct Funct Genomics. 2003.

Abstract

There is a limited repertoire of domain families in nature that are duplicated and combined in different ways to form the set of proteins in a genome. Most proteins in both prokaryote and eukaryote genomes consist of two or more domains, and we show that the family size distribution of multi-domain protein families follows a power law like that of individual families. Most domain pairs occur in four to six different domain architectures: in isolation and in combinations with different partners. We showed previously that within the set of all pairwise domain combinations, most small and medium-sized families are observed in combination with one or two other families, while a few large families are very versatile and combine with many different partners. Though this may appear to be a stochastic pattern, in which large families have more combination partners by virtue of their size, we establish here that all the domain families with more than three members in genomes are duplicated more frequently than would be expected by chance considering their number of neighbouring domains. This duplication of domain pairs is statistically significant for between one and three quarters of all families with seven or more members. For the majority of pairwise domain combinations, there is no known three-dimensional structure of the two domains together, and we term these novel combinations. Novel domain combinations are interesting and important targets for structural elucidation, as the geometry and interaction between the domains will help understand the function and evolution of multi-domain proteins. Of particular interest are those combinations that occur in the largest number of multi-domain proteins, and several of these frequent novel combinations contain DNA-binding domains.
