Domain insertions in protein structures

R Aroul-Selvam¹, Tim Hubbard, Rajkumar Sasidharan

PMID: 15099733
PMCID: PMC2665287
DOI: 10.1016/j.jmb.2004.03.039

R Aroul-Selvam et al. J Mol Biol. 2004 May 7;338(4):633-41.

Affiliation

¹ The Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

Abstract

Domains are the structural, functional or evolutionary units of proteins. Proteins can comprise a single domain or a combination of domains. In multi-domain proteins, the domains almost always occur end-to-end, i.e., one domain follows the C-terminal end of another domain. However, there are exceptions to this common pattern, where multi-domain proteins are formed by insertion of one domain (insert) into another domain (parent). Here, we provide a quantitative description of known insertions in the Protein Data Bank (PDB). We found that 9% of domain combinations observed in the non-redundant PDB are insertions. Although 90% of all insertions involve only one insert, proteins can clearly have multiple (nested, two-domain and three-domain) inserts. We also observed correlations between the structure and function of a domain and its tendency to be found as a parent or an insert. There is a bias in insert position towards the C terminus of parents. We observed that the atomic distance between the N and C terminus of an insert is significantly smaller than the N-to-C distance in a parent context or a single domain context. Insertions are always found to occur in loop regions of parent domains. Our observations regarding the relationship between domain insertions and the structure, function and evolution of proteins have implications for protein engineering.
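The N-to-C terminal distance mentioned in the abstract can be illustrated with a minimal sketch. It assumes the distance is measured between the Cα atoms of the first and last residue of a domain; the abstract does not state which atoms the authors used. The function name and the toy coordinates are hypothetical and do not come from any PDB entry.

```python
import math

def n_to_c_distance(ca_coords):
    """Distance between the first and last C-alpha positions of a domain.

    ca_coords: list of (x, y, z) tuples in angstroms, ordered from the
    N terminus to the C terminus. Using C-alpha atoms is an assumption.
    """
    if len(ca_coords) < 2:
        raise ValueError("need at least two residues")
    return math.dist(ca_coords[0], ca_coords[-1])

# Made-up coordinates for illustration only (not from a real structure).
toy_domain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.0, 4.0, 0.0)]
print(n_to_c_distance(toy_domain))  # 5.0
```

Computed per domain and grouped by context (insert, parent, single domain), values like this would give the kind of distributions the abstract compares.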

Figures

**Figure 1**
Domain insertion in *E. coli* enzyme RNA 3′-terminal phosphate cyclase (PDB 1qmhA). The *E. coli* enzyme RNA 3′-terminal phosphate cyclase consists of two domains, of which one is inserted within the other. The parent domain (residues 5–184, 280–338, coloured purple) consists of three repeated folding units; each unit has two α-helices and a four-stranded β-sheet. The folding unit resembles the C-terminal domain of bacterial translation initiation factor 3 (IF3). Between an α-helix and a β-strand of the third IF3-like repeat of the parent domain, there is a smaller inserted domain (residues 185–279, coloured red). Although the inserted domain has the same secondary structural elements as the parent domain, it has a different topology and a different fold. The insert resembles the fold observed in human thioredoxin. The figure was prepared using the program MOLSCRIPT.
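The residue ranges in this caption show what distinguishes an insertion from an end-to-end arrangement: the parent domain has residues on both sides of the insert. A minimal sketch of that check follows. The function name and the segment representation are hypothetical, and the test is a simplification rather than the authors' procedure (it does not handle nested or multiple inserts).

```python
def is_insertion(parent_segments, insert_segments):
    """Return True if the insert lies between parent residues on both sides.

    Each argument is a list of inclusive (start, end) residue ranges.
    """
    parent_start = min(start for start, _ in parent_segments)
    parent_end = max(end for _, end in parent_segments)
    insert_start = min(start for start, _ in insert_segments)
    insert_end = max(end for _, end in insert_segments)
    return parent_start < insert_start and insert_end < parent_end

# Ranges taken from the Figure 1 caption (PDB 1qmhA).
parent = [(5, 184), (280, 338)]
insert = [(185, 279)]
print(is_insertion(parent, insert))            # True

# Hypothetical end-to-end arrangement for contrast.
print(is_insertion([(1, 100)], [(101, 200)]))  # False
```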

**Figure 3**
(a) Domain length distribution for all domains in the non-redundant set of protein structures (PDB_90). (b) Domain length distribution for parent domains.

**Figure 4**
(a) Proportion of residues in parent and insert domains in parent-insert combinations. (b) Point of insertion in parent domain. Insert position is given as a fraction of total length of parent domain.
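The insert position described in panel (b) can be sketched as the number of parent residues preceding the insert divided by the total parent length. This definition is an assumption based on the caption wording; the function name is hypothetical, and the residue ranges are those given for PDB 1qmhA in Figure 1.

```python
def insertion_fraction(parent_segments, insert_start):
    """Fraction of parent residues that precede the insert.

    parent_segments: inclusive (start, end) residue ranges of the parent.
    insert_start: first residue number of the insert.
    """
    total = sum(end - start + 1 for start, end in parent_segments)
    before = sum(
        min(end, insert_start - 1) - start + 1
        for start, end in parent_segments
        if start < insert_start
    )
    return before / total

# 1qmhA: parent residues 5-184 and 280-338, insert begins at residue 185.
parent = [(5, 184), (280, 338)]
print(round(insertion_fraction(parent, 185), 3))  # 0.753
```

Under this definition a value above 0.5 places the insertion point in the C-terminal half of the parent, as in this example.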
