1997 Jul 22;94(15):7698-703.
doi: 10.1073/pnas.94.15.7698.

Origin of genes

W Gilbert et al. Proc Natl Acad Sci U S A. 1997.

Abstract

We discuss two tests of the hypothesis that the first genes were assembled from exons. The hypothesis of exon shuffling in the progenote predicts that intron phases will be correlated so that exons will be an integer number of codons and predicts that the exons will be correlated with compact regions of polypeptide chain. These predictions have been tested on ancient conserved proteins (proteins without introns in prokaryotes but with introns in eukaryotes) and hold with high statistical significance. We conclude that introns are correlated with compact features of proteins 15-, 22-, or 30-amino acid residues long, as was predicted by "The Exon Theory of Genes."
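
The phase prediction can be made concrete with a small sketch. It assumes the usual convention that an intron's phase is its position within a codon (0 between codons, 1 after the first base, 2 after the second). The function names and offsets are illustrative and are not taken from the paper. Under that convention, an internal exon has a length that is a multiple of three, that is, an integer number of codons, exactly when the introns on either side share a phase.

```python
def intron_phase(coding_offset: int) -> int:
    """Phase of an intron that interrupts the coding sequence after
    `coding_offset` bases (assumed convention: 0, 1, or 2)."""
    return coding_offset % 3


def internal_exons(intron_offsets):
    """Return (start, end, upstream_phase, downstream_phase) for each exon
    bounded by two consecutive introns. Offsets are hypothetical positions
    in the spliced coding sequence."""
    offsets = sorted(intron_offsets)
    return [(a, b, intron_phase(a), intron_phase(b))
            for a, b in zip(offsets, offsets[1:])]


def is_symmetric(exon) -> bool:
    """True when both flanking introns share a phase, so the exon length
    is a multiple of three."""
    _start, _end, upstream, downstream = exon
    return upstream == downstream


if __name__ == "__main__":
    # Hypothetical intron offsets, chosen only to show both outcomes.
    for exon in internal_exons([30, 96, 100]):
        start, end, _, _ = exon
        print(start, end, (end - start) % 3 == 0, is_symmetric(exon))
```

Correlated phases, in this sketch, simply mean that symmetric exons occur more often than independent phase choices would produce.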

Figures

Figure 1
Go plot for horse hemoglobin. The black spots represent pairs of amino acids whose α-carbons are separated by 28 Å or more. The five large triangles correspond to modules. Boundary regions (BR) are defined by the overlap of these triangles.
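
A Go plot of this kind can be sketched from α-carbon coordinates alone. The snippet below is a minimal illustration, not the authors' program: it marks every residue pair whose α-carbons lie at least a threshold distance apart (28 Å in the caption). The function name and the toy coordinates are assumptions, and the step that identifies the triangular modules and their boundary regions is not shown.

```python
import math


def go_plot(ca_coords, threshold=28.0):
    """Return an n x n boolean matrix where entry [i][j] is True when the
    alpha-carbons of residues i and j are `threshold` angstroms or more
    apart (the "black spots" of the plot)."""
    n = len(ca_coords)
    return [
        [math.dist(ca_coords[i], ca_coords[j]) >= threshold for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Toy input: a hypothetical fully extended chain with 3.8 angstroms
    # between consecutive alpha-carbons, not a real protein structure.
    coords = [(3.8 * i, 0.0, 0.0) for i in range(10)]
    for row in go_plot(coords):
        print("".join("#" if far else "." for far in row))
```

In a real structure the unmarked region near the diagonal breaks into blocks of residues that all lie close to one another, which is what the triangles in the figure outline.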
Figure 2
χ² distribution for the matching of intron positions to the boundary regions of 32 ancient proteins as a function of module diameter. The 570 intron positions were drawn from version 90 of GenBank. There are three major peaks of significance around module diameters of 21, 28, and 33 Å.
Figure 3
Lengths of predicted modules for the peaks of significance around 21, 28, and 33 Å. The three peaks correspond to distributions centered around 15, 22, and 30 amino acid residues in length.
Figure 4
The same analysis shown in Fig. 2 (dashed line) was repeated using a database of intron positions based on GenBank version 96 (662 intron positions, continuous line). The peaks around 21, 28, and 33 Å now reach χ² values around 19, 15, and 13, respectively.
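
The abstract and captions do not state the exact statistic, so the following is only a sketch of one way a χ² value of this kind could be computed: a one-degree-of-freedom goodness-of-fit test comparing how many intron positions fall in boundary regions with the number expected if introns were placed at random. The function name, the null model, and the numbers in the usage line are hypothetical and are not data from the paper.

```python
def chi_square_boundary(n_introns, n_in_boundary, boundary_fraction):
    """Chi-square statistic (1 degree of freedom) for intron positions
    inside versus outside boundary regions.

    Assumed null model: each intron lands in a boundary region with
    probability `boundary_fraction`, the fraction of sequence positions
    that lie in boundary regions.
    """
    if not 0.0 < boundary_fraction < 1.0:
        raise ValueError("boundary_fraction must be strictly between 0 and 1")
    expected_in = n_introns * boundary_fraction
    expected_out = n_introns - expected_in
    observed_out = n_introns - n_in_boundary
    return ((n_in_boundary - expected_in) ** 2 / expected_in
            + (observed_out - expected_out) ** 2 / expected_out)


if __name__ == "__main__":
    # Made-up counts for illustration only.
    print(chi_square_boundary(n_introns=100, n_in_boundary=40,
                              boundary_fraction=0.3))
```

Repeating such a calculation for module definitions built at each diameter would give a curve of χ² against diameter, which is the general shape of analysis the captions for Figs. 2 and 4 describe.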
