Review
J Biol Chem. 2014 May 23;289(21):14490-7.
doi: 10.1074/jbc.R114.548255. Epub 2014 Apr 2.

Enigmatic distribution, evolution, and function of inteins

Olga Novikova et al. J Biol Chem.

Abstract

Inteins are mobile genetic elements capable of self-splicing post-translationally. They exist in all three domains of life including in viruses and bacteriophage, where they have a sporadic distribution even among very closely related species. In this review, we address this anomalous distribution from the point of view of the evolution of the host species as well as the intrinsic features of the inteins that contribute to their genetic mobility. We also discuss the incidence of inteins in functionally important sites of their host proteins. Finally, we describe instances of conditional protein splicing. These latter observations lead us to the hypothesis that some inteins have adapted to become sensors that play regulatory roles within their host protein, to the advantage of the organism in which they reside.

Keywords: Bioinformatics; Conditional Splicing; DNA Enzymes; Intein Gain and Loss; Intein Localization; Invasion; Microbiology; Molecular Evolution; Protein Splicing; Splicing.

Figures

FIGURE 1.
Protein splicing and types of inteins. A, schematic mechanism of protein splicing. The intein is shown in red, flanked by the N-extein (EN) and the C-extein (EC). The four steps of protein splicing are designated by four arrows and are described by Mills et al. (2). The first residue of the intein and that of EC are usually a cysteine, serine, or threonine. They are shown here as Cys1 and Cys+1. B, protein trans-splicing by split inteins. The two halves of the split intein come together via a zipper-like interface. Splicing then proceeds as in A. C, an example of conditional protein splicing. A disulfide bridge, shown between cysteine residues at −3 of EN and at the first residue of the intein (Cys1), prevents protein splicing. In the presence of a reducing reagent, the disulfide bond is broken to release the trapped Cys1 and splicing proceeds. D, inteins containing a homing endonuclease (HEN) are naturally occurring mobile genetic elements. After transcription and translation, the precursor undergoes protein splicing, forming the ligated exteins and an intein carrying the HEN domain (green). The HEN recognizes and cleaves its cognate intein-less homing site in DNA. The double-strand break is repaired by the cellular double-strand break repair (DSBR) machinery using the intein-containing allele as a template, resulting in two intein-containing alleles.
FIGURE 2.
Distribution of inteins in the biosphere. The tree of life is shown after Gribaldo and Brochier-Armanet (93) with minor changes. Thirteen phyla are shown for Eubacteria and three are shown for Archaea. Three kingdoms are indicated for Eukaryota: metazoa or animals, fungi, and viridiplantae or plants (with emphasis on green algae, the Chlorophyta). The additional basal branch is dedicated to all other eukaryotes. Viruses and bacteriophage are not included in the tree and are shown separately at the bottom. In the shaded panels, red sectors show the percentage of Eubacteria, Archaea, Eukaryota, or viruses among all species (left) or inteins within InBase (right), for a total of 100% in each case. On the far right, the total number of inteins submitted to InBase is shown for each group as red circles, with the area of the circles corresponding to the number of inteins. The NCBI column shows results of a preliminary assessment for the intein occurrence in the NCBI Gene database. N/A = not available.
FIGURE 3.
Distribution of inteins in the proteome. A, intein concentration in proteins of DNA metabolism. Inteins tend to invade polymerases (Pols), helicases (HELICs), topoisomerases (TOPOs), and ribonucleotide reductases (RNRs). Inteins occur in 27% of Pols, HELICs, TOPOs, and RNRs. The fraction of the inteins found in these proteins is shown for the three domains of life and viruses/bacteriophage. The numbers reflect data available in the NCBI Protein database. The total number of proteins with inteins is indicated on the top, represented by a black circle; the number and proportion (%) of Pols, HELICs, TOPOs, and RNRs with inteins are shown on the bottom, represented by a red circle; the number of proteins with an intein inserted into the P-loop is indicated on the side, represented by a gray circle. The area of each circle is proportional to the corresponding number. B, inteins in P-loops of some proteins. The P-loop is a conserved motif commonly found in ATP-binding domains, with the consensus amino acid sequence GXXGXGK(T/S) (X = any amino acid residue). The loop is a hot spot for intein insertion in some host proteins, such as the RadA/RecA recombinases, the replicative helicases DnaB and MCM, and the DNA clamp loader, RFC. The P-loop sequence is shown with the heights of the letters corresponding to the degree of sequence conservation; the most conserved residues are in red. Numbers in the dark gray circles on top of the protein representations correspond to the numbers of species/strains with an intein insertion at that position in the P-loop.
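The P-loop consensus quoted in the Figure 3 legend, GXXGXGK(T/S), can be written as a simple pattern. The following is a minimal sketch, not a method used by the authors: the function name `find_p_loops` and the example sequence are hypothetical and invented for illustration. A plain pattern match of this kind only flags candidate P-loop sites in a protein sequence; it says nothing about whether an intein is inserted there.

```python
import re

# Consensus from the Figure 3 legend: GXXGXGK(T/S), where X is any residue.
# The pattern sits inside a lookahead so overlapping candidates are also found.
P_LOOP = re.compile(r"(?=(G[A-Z]{2}G[A-Z]GK[TS]))")


def find_p_loops(protein_seq):
    """Return (1-based start position, matched motif) for each consensus match.

    Hypothetical helper for illustration; input is a one-letter amino acid string.
    """
    seq = protein_seq.upper()
    return [(m.start() + 1, m.group(1)) for m in P_LOOP.finditer(seq)]


# Invented sequence, not taken from any real protein record.
example = "MKTLAGPPGSGKTTLAQ"
print(find_p_loops(example))  # [(6, 'GPPGSGKT')]
```

Positions are reported 1-based, following the usual convention for residue numbering in protein sequences.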

References

    1. Paulus H. (2001) Inteins as enzymes. Bioorg. Chem. 29, 119–129
    2. Mills K. V., Johnson M. A., Perler F. B. (2014) Protein splicing: how inteins escape from precursor proteins. J. Biol. Chem. 289, 14498–14505
    3. Volkmann G., Mootz H. D. (2013) Recent progress in intein research: from mechanism to directed evolution and applications. Cell. Mol. Life Sci. 70, 1185–1206
    4. Muralidharan V., Muir T. W. (2006) Protein ligation: an enabling technology for the biophysical analysis of proteins. Nat. Methods 3, 429–438
    5. Xu M. Q., Evans T. C., Jr. (2005) Recent advances in protein splicing: manipulating proteins in vitro and in vivo. Curr. Opin. Biotechnol. 16, 440–446
