PLoS One. 2009;4(3):e4981.
doi: 10.1371/journal.pone.0004981. Epub 2009 Mar 31.

Length variations amongst protein domain superfamilies and consequences on structure and function

Sankaran Sandhya et al. PLoS One. 2009.

Abstract

Background: Related protein domains of a superfamily can be specified by proteins of diverse lengths. The structural and functional implications of indels in a domain scaffold have been examined.

Methodology: In this study, domain superfamilies with large length variations (more than 30% difference from average domain size, referred to as 'length-deviant' superfamilies) and 'length-rigid' domain superfamilies (<10% length difference from average domain size) were analyzed for the functional impact of such structural differences. Our delineated dataset, derived from an objective algorithm, enables us to address indel roles in the presence of peculiar structural repeats, functional variation and protein-protein interactions, and to examine the 'domain contexts' of proteins tolerant to large length variations. Amongst the top-10 length-deviant superfamilies analyzed, we found that 80% possess distant internal structural repeats and nearly half have acquired diverse biological functions. In general, length-deviant superfamilies have a higher chance than length-rigid superfamilies of possessing internal structural repeats. We also found that approximately 40% of length-deviant domains exist as multi-domain proteins involving interactions with domains from the same or other superfamilies. Indels, in diverse domain superfamilies, were found to participate in the accretion of structural and functional features amongst related domains. With specific examples, we discuss how indels are involved directly or indirectly in the generation of oligomerization interfaces, introduction of substrate specificity, and regulation of protein function and stability.
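The 30% and 10% thresholds above can be illustrated with a minimal Python sketch. It is not the authors' algorithm, which the abstract does not describe. The function names, the "intermediate" label, and the rule that a superfamily is classified by its most deviant member are all assumptions made for illustration.

```python
from statistics import mean


def length_deviations(lengths):
    """Return |length - mean| / mean for each member of one superfamily."""
    avg = mean(lengths)
    return [abs(n - avg) / avg for n in lengths]


def classify_superfamily(lengths, deviant_cutoff=0.30, rigid_cutoff=0.10):
    """Label a superfamily from its members' domain lengths.

    Assumption (not stated in the abstract): the label is decided by the
    member that deviates most from the superfamily's mean domain size.
    """
    worst = max(length_deviations(lengths))
    if worst > deviant_cutoff:
        return "length-deviant"
    if worst < rigid_cutoff:
        return "length-rigid"
    # Hypothetical label for superfamilies between the two thresholds.
    return "intermediate"


if __name__ == "__main__":
    # Illustrative lengths only; real values would come from domain records.
    print(classify_superfamily([120, 321]))
    print(classify_superfamily([100, 102, 98]))
```

Classifying by the most deviant member is only one possible reading; an average deviation per superfamily would be an equally plausible choice and could give different labels.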

Conclusions: Our data suggest a multitude of roles for indels that are specialized for domain members of different domain superfamilies. These specialist roles and the trends we observe in the extent of length variation could influence decision making in the modeling of new superfamily members. Likewise, the observed limits of length variation, specific to each domain superfamily, would be particularly relevant in the choice of alignment length search filters commonly applied in protein sequence analysis.
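As a rough sketch of how superfamily-specific length limits might feed an alignment length filter, the Python snippet below accepts a hit only if its aligned length falls within the range observed for known members. The function names and the `slack` parameter are hypothetical and do not correspond to any particular search tool.

```python
def length_bounds(member_lengths):
    """Shortest and longest domain lengths observed in a superfamily."""
    return min(member_lengths), max(member_lengths)


def passes_length_filter(aligned_length, member_lengths, slack=0.0):
    """Check a hit's aligned length against the superfamily's observed range.

    `slack` (hypothetical) widens the range by a fraction on both sides,
    e.g. 0.1 allows hits 10% shorter or longer than any known member.
    """
    lo, hi = length_bounds(member_lengths)
    return lo * (1 - slack) <= aligned_length <= hi * (1 + slack)


if __name__ == "__main__":
    # Illustrative lengths for a superfamily with a dwarf and a giant member.
    known = [120, 321]
    print(passes_length_filter(150, known))
    print(passes_length_filter(100, known))
```

A fixed global cutoff would reject valid hits in length-deviant superfamilies and admit spurious ones in length-rigid superfamilies, which is why the bounds here are derived per superfamily.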

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Distributions of domain length variations in members of the 353 multi-membered PASS2 domain superfamilies.
For every member, the degree of length variation was calculated as the ratio of the difference between the member's length and the mean domain size of its superfamily to that mean domain size.
Figure 2. Members of the Cytochrome C- like domain superfamily (a–e) show two-fold length variation.
Additional residues contribute to differences in the lengths of loops around the substrate-binding site. Cytochrome-C552 (1c52–: Thermus thermophilus) acquires two β-strands that further protect the bound heme (not shown) from solvent. All structures preserve the hydrophobic pocket involving at least three helices (shown in golden yellow) surrounding the heme group (not shown).
Figure 3. Domain members of the viral protein domain superfamily.
Single subunits of the adenovirus type 5 hexon (1rux) and P3 of the bacteriophage (1hx6) each contain two viral jelly roll domains (1ruxa1, 1ruxa2 and 1hx6a1, 1hx6a2 respectively). All four members conserve the structural scaffold of the viral jelly roll (in green). The nature of the structural variations acquired by each domain (in brown) differs, and loop lengths vary extensively even within a subunit. Additionally, residues in adenovirus (three-fold difference in length) form a subdomain involved in more extensive subunit interactions.
Figure 4. Structures of the giant and dwarf domain members of the PLD domain superfamily.
Endonuclease (1byra-) and Phospholipase D (1f0i), the dwarf and giant domains of the PLD-like superfamily, adopt different oligomeric states. Phospholipase D, a pseudo-dimer (1f0ia1: (256) and 1f0ia2: (240)), shows a duplication of the core domain of Endonuclease (1byra), which is a functional dimer. The PLD domain of endonuclease represents the minimum structural scaffold for acting on the phospho-diester bond of a substrate. The core conserved strands in either structure are highlighted in green. In endonuclease, residues from two HKD motifs (in red, ball and stick), one from each protomer, interact with the substrate. Phospholipase D has two copies of the motif and also shows some additional structures that protect the active site from solvent and move it deeper into the protein. The active sites involve similar residues that lie in similar structural contexts (in ball and stick).
Figure 5. Domain members of the SAM domain-like superfamily.
FtsJ (1ej0a-), catechol-O-methyltransferase (1vid–), VP39 (3mag–) and PRMT3 (1f3la-) show insertions that do not affect the common core structural scaffold (in green). Residues that interact with the AdoMet cofactor (ball and stick representation, in red) and others that interact with the different substrates (not shown) are spatially proximate, and their locations are conserved across the different members. In VP39 (3mag–), a large 100-residue insert in the C-terminus appears to shield the core scaffold. In PRMT3 (1f3la-), the truncated SAM domain acquires a large barrel-like extension at the C-terminus. This subdomain-like indel contributes some residues to substrate binding and may adopt an auto-regulatory role by interacting with AdoMet-binding residues of the neighboring subunit during dimer formation.
Figure 6. Lysozyme-like superfamily.
Structures of lytic murein transglycosylase B (1qusa-, 321 residues) and insect lysozyme (1iiza-, 120 residues) show a well-conserved lysozyme-like fold (in green). Lytic murein transglycosylase acquires two additional N- and C-terminal subdomain-like structures that are implicated in membrane interactions (highlighted in faint pink).
Figure 7. Actin-like ATPase domain superfamily.
Superposed structures of acetate kinase (242 residues, in gold) and actin alpha1 (142 residues, in blue) show that the giant member acquires longer helices. The additional helical insert observed in acetate kinase forms a closed loop that brings residues that interact with the substrate close to the Mg2+ ion binding site. In other dwarf members of the superfamily, the same residues are involved in both ion-binding and catalysis, thus obviating the need for such extra structural elements. The lower panel shows a graphical projection of the alignments. Large differences in length contribute to insertions of different structural elements in either protein (helix: red; strand: blue; coil: green; indels: magenta).
