Comparative Study
Genome Res. 2002 Nov;12(11):1642-50.
doi: 10.1101/gr.520702.

Signatures of domain shuffling in the human genome

Henrik Kaessmann et al. Genome Res. 2002 Nov.

Abstract

To elucidate the role of exon shuffling in shaping the complexity of the human genome/proteome, we have systematically analyzed intron phase distributions in the coding sequence of human protein domains. We found that introns at the boundaries of domains show high excess of symmetrical phase combinations (i.e., 0-0, 1-1, and 2-2), whereas nonboundary introns show no excess symmetry. This suggests that exon shuffling has primarily involved rearrangement of structural and functional domains as a whole. Furthermore, we found that domains flanked by phase 1 introns have dramatically expanded in the human genome due to domain shuffling and that 1-1 symmetrical domains and domain families are nonrandomly distributed with respect to their age. The predominance and extracellular location of 1-1 symmetrical domains among domains specific to metazoans suggests that they are associated with the rise of multicellularity. On the other hand, 0-0 symmetrical domains tend to be over-represented among ancient protein domains that are shared between the eukaryotic and prokaryotic kingdoms, which is compatible with the suggestion of primordial domain shuffling in the progenote. To see whether the human data reflect general genomic patterns of metazoans, similar analyses were done for the nematode Caenorhabditis elegans. Although the C. elegans data generally concur with the human patterns, we identified fewer intron-bounded domains in this organism, consistent with the lower complexity of C. elegans genes. [The following individuals kindly provided reagents, samples, or unpublished information as indicated in the paper: Z. Gu and R. Stevens.]
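The notion of a symmetrical phase combination can be made concrete with a short sketch. An intron's phase is the number of coding bases preceding it, modulo 3 (phase 0 falls between codons, phase 1 after the first base of a codon, phase 2 after the second). A domain flanked by introns of equal phase spans a multiple of three bases, so it can be inserted into an intron of that phase without shifting the reading frame. The code below is an illustration only, not the authors' pipeline: the function names, the toy data, and the choice of an independence-based expectation for the number of symmetrical pairs are all assumptions made here.

```python
from collections import Counter


def intron_phase(coding_bases_before: int) -> int:
    """Phase of an intron, given the number of coding bases upstream of it."""
    return coding_bases_before % 3


def is_symmetrical(phase_5prime: int, phase_3prime: int) -> bool:
    """True for the 0-0, 1-1, and 2-2 combinations."""
    return phase_5prime == phase_3prime


def symmetry_counts(pairs):
    """Return (observed, expected) numbers of symmetrical intron pairs.

    `pairs` is a list of (5' phase, 3' phase) tuples, one per domain.
    The expectation assumes (hypothetically) that the phases of the two
    flanking introns are independent of each other.
    """
    n = len(pairs)
    observed = sum(1 for a, b in pairs if is_symmetrical(a, b))
    upstream = Counter(a for a, _ in pairs)
    downstream = Counter(b for _, b in pairs)
    expected = sum(upstream[p] * downstream[p] for p in range(3)) / n
    return observed, expected


# Toy, made-up data: ten domains, mostly 1-1 symmetrical.
toy_pairs = [(1, 1)] * 6 + [(0, 0)] * 2 + [(0, 1), (2, 0)]
observed, expected = symmetry_counts(toy_pairs)
print(f"observed symmetrical: {observed}, expected under independence: {expected:.1f}")
```

An observed count well above the expected one is the kind of excess the abstract reports for introns at domain boundaries; a real analysis would also need a significance test and actual gene annotations.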

Figures

Figure 1
Illustration of a class 1 domain (A) and a class 2 domain (B) encoded by four exons. The different types of introns and exons in the coding sequence of class 2 domains (as used in the analysis of intron phases) are indicated (see text for details).
Figure 2
Intron-phase combinations of the discoidin domain in 10 human genes representing 7 gene families (A–G). Boxes represent the discoidin domain (a) and its neighboring domains (b–i) (not drawn to scale). Numbers indicate the phase class of the introns. Phase 1 introns found at the boundaries of the discoidin domain are shown in boldface. The seven different Ensembl gene families (http://www.ensembl.org) are as follows: (A) endothelial and muscle cell-derived neuropilin-like protein, (B) lactadherin milk fat globule EGF factor, (C) neuropilin precursor vascular endothelial cell growth factor, (D) carboxypeptidase H, (E) contactin-associated protein like, (F) discoidin domain receptor, and (G) coagulation factor VIII precursor. The letters within the boxes refer to the following Pfam signatures: discoidin domain (a), EGF-like domain (b), CUB domain (c), MAM domain (d), zinc carboxypeptidase (e), laminin G type domain (f), fibrinogen carboxy-terminal globular domain (g), protein kinase domain (h), and multicopper oxidase (i).
