mBio. 2016 Oct 4;7(5):e01537-16.
doi: 10.1128/mBio.01537-16.

Mycobacteriophages as Incubators for Intein Dissemination and Evolution

Danielle S Kelley et al. mBio. 2016.

Abstract

Inteins are self-splicing protein elements that are mobile at the DNA level and are sporadically distributed across microbial genomes. Inteins appear to be horizontally transferred, and it has been speculated that phages may play a role in intein distribution. Our attention turns to mycobacteriophages, which infect mycobacteria, where both phage and host harbor inteins. Using bioinformatics, mycobacteriophage genomes were mined for inteins. This study reveals that these mobile elements are present across multiple mycobacteriophage clusters and are pervasive in certain genes, like the large terminase subunit TerL and a RecB-like nuclease, with the majority of intein-containing genes being phage specific. Strikingly, despite this phage specificity, inteins localize to functional motifs shared with bacteria, such that intein-containing genes have similar roles, like hydrolase activity and nucleic acid binding, indicating a global commonality among intein-hosting proteins. Additionally, there are multiple insertion points within active centers, implying independent invasion events, with regulatory implications. Several phage inteins were shown to be splicing competent and to encode functional homing endonucleases, important for mobility. Further, bioinformatic analysis supports the potential for phages as facilitators of intein movement among mycobacteria and related genera. Analysis of catalytic intein residues finds the highly conserved penultimate histidine inconsistently maintained among mycobacteriophages. Biochemical characterization of a noncanonical phage intein shows that this residue influences precursor accumulation, suggesting that splicing has been tuned in phages to modulate generation of important proteins. Together, this work expands our understanding of phage-based intein dissemination and evolution and implies that phages provide a context for evolution of splicing-based regulation.

Importance: Inteins are mobile protein splicing elements found in critical genes across all domains of life. Mycobacterial inteins are of particular interest because of their occurrence in pathogenic species, such as Mycobacterium tuberculosis and Mycobacterium leprae, which harbor inteins in important proteins. We have discovered a similarity in activities of intein-containing proteins among mycobacteriophages and their intein-rich actinobacterial hosts, with implications for both posttranslational regulation by inteins and phages participating in horizontal intein transfer. Our demonstration of multiple insertion points within active centers of phage proteins implies independent invasion events, indicating the importance of intein maintenance at specific functional sites. The variable conservation of a catalytic splicing residue, leading to profoundly altered splicing rates, points to the regulatory potential of inteins and to mycobacteriophages playing a role in intein evolution. Collectively, these results suggest inteins as posttranslational regulators and mycobacteriophages as both vehicles for intein distribution and incubators for intein evolution.

Figures

FIG 1
Overview of intein distribution in mycobacteriophages. (A) Distribution of inteins among mycobacteriophage clusters. The number of intein-positive genomes (red, value below the circle) was compared to the total number of sequenced phage genomes (gray, value above the circle) in a cluster. “Other” includes clusters D, G to I, K to Z, and singletons. (B) Distribution of inteins does not correlate with genome size. The vertical axis represents genome sizes (black) and frequency of inteins (red; number of inteins per 100 protein-coding sequences) in corresponding phages on the horizontal axis. Mycobacteriophage genomes (365 had protein-coding sequence numbers available) are organized by cluster. Coefficient of determination, R² = 0.10. (C) Functional genomics of intein-containing proteins. Results for Gene Ontology (GO) term enrichment analysis of dominant functional categories of mycobacteriophage proteins with inteins are compared to those for bacterial intein-containing proteins. GO term enrichment is shown for the 181 mycobacteriophage intein-containing proteins that gave GO terms (red) and the 1,047 bacterial intein-containing proteins analyzed previously (gray) (5). Dominant GO terms are shown: Nucl bind, nucleic acid binding (GO:0003676); hydrolase (GO:0016787); Trans, transferase (GO:0016740); and Ox/Red, oxidoreductase (GO:0016491). The percentages of the associated proteins are indicated above the bars. (D) Intein distribution by host protein and phage clusters. Each square represents one intein. (E) An overview of intein-containing proteins indicates the intein insertion site relative to protein domains (arrow). Intein insertion sites for TerL and Pham3880 are shown in Fig. 2A.
Abbreviations: TerL1, large terminase subunit terminase_1; TerL6, terminase_6; Pham3880, terminase-like; RDF, recombination directionality factor; TOPRIM, topoisomerase-primase; NT, DNA nucleotidyltransferase; PORT, portal protein; TdS, thymidylate synthase; HEL, helicase; RecB, RecB-like exonuclease; DNMT-1/2, DNA methyltransferase; Metallophos, metallophosphoesterase domain; Nuc-transf, nucleotidyltransferase domain; DEXDc and HELICc, domains associated with DEAD-like helicases; PD-(D/E)XK, nuclease domain; N6_N4_Mtase, DNA methylase; aa, amino acids.
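The two quantities in panel B, intein frequency (inteins per 100 protein-coding sequences) and the coefficient of determination, can be illustrated with a minimal Python sketch. The function names and all numbers in the usage lines are hypothetical and are not data from the study. The sketch also assumes a simple linear fit, for which R² equals the squared Pearson correlation; the caption does not state how the reported value was computed.

```python
from math import fsum


def inteins_per_100_cds(n_inteins, n_cds):
    """Intein frequency as defined in the caption: inteins per 100 coding sequences."""
    if n_cds <= 0:
        raise ValueError("n_cds must be positive")
    return 100.0 * n_inteins / n_cds


def r_squared(xs, ys):
    """R^2 of a simple linear fit of ys on xs (squared Pearson correlation).

    Assumption: a one-variable least-squares line; this may differ from the
    procedure used in the original analysis.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = fsum(xs) / n
    mean_y = fsum(ys) / n
    sxx = fsum((x - mean_x) ** 2 for x in xs)
    syy = fsum((y - mean_y) ** 2 for y in ys)
    sxy = fsum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when either series is constant")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical genomes: (genome size in bp, intein count, number of CDS).
genomes = [(50_000, 1, 90), (60_000, 0, 100), (70_000, 2, 110), (155_000, 1, 230)]
sizes = [size for size, _, _ in genomes]
freqs = [inteins_per_100_cds(n, cds) for _, n, cds in genomes]
print(round(r_squared(sizes, freqs), 2))
```

A value near 0, such as the reported 0.10, means that genome size explains little of the variation in intein frequency.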
FIG 2
TerL inteins concentrate in the ATPase domain near functional motifs. (A) Overview of intein insertion sites among three types of terminase-like proteins shows that all localize to the ATPase domain. TerL1 has five unique intein insertions, a to e, while TerL6 proteins and Pham3880 each have single insertions, f and g, respectively (red arrows). Values in red squares indicate the number of inteins at each site. P-loop, cyan; Walker B motif (WB), green; C-motif, blue. (B) ATPase structure models of three terminase-like proteins. Insertion sites are shown as red spheres, and motif coloring corresponds to panel A. Models are represented as follows: TerL1, Minerva gp9 (residues 1 to 242); TerL6, Chandler gp6 (residues 59 to 309); Pham3880, ScottMcG gp245 (residues 107 to 317). Full structure models are in Fig. S1 in the supplemental material. (C) Intein insertions mapped onto a TerL pentamer structure. The intein insertion sites were mapped on a solved TerL ATPase domain structure from the virus P74-26 (PDB 4ZNL) (34). Intein insertions are shown once at each site as red spheres and indicated in red on the other monomers. (D) TerL phylogenetic tree. Maximum-likelihood (ML) tree for intein-containing and related intein-free mycobacteriophage and actinobacterial prophage TerLs was constructed. Intein-containing phages and actinobacterial prophages from K. rhizophila and M. smegmatis are indicated in red. Gray shading indicates the bacterial prophage and mycobacteriophage inteins that were compared by protein pairwise alignment, with the percent identity indicated (green). Values for significant external nodes higher than 75% are shown. T4 gp17 and RB49 gp17 are used as an outgroup. Scale indicates the number of substitutions per site. Mycobacteriophage TerL intein insertions (a to f) are indicated.
FIG 3
TerL inteins are splicing competent and can have active endonucleases. (A) MIG reporter system. The intein of interest was cloned between maltose binding protein (MBP) and GFP. Precursor (P) and ligated exteins (LE) are visualized by in-gel fluorescence. (B) TerL inteins are able to splice. Five representative TerL inteins cloned into the MIG reporter were assayed for splicing. Inteins are indicated by the insertion site letter. All five inteins investigated spliced quickly, primarily resulting in LE. Representative inteins are as follows: TerL1-b, BAKA gp6; TerL1-c, Bethlehem gp10; TerL1-e, Gaia gp2; TerL6-f, Chandler gp6; Pham3880-g, ScottMcG gp245. The numbers indicate the marker size in kDa. (C) TerL inteins have endonuclease activity. BAKA TerL1-b and Bethlehem TerL1-c inteins were tested for endonuclease activity against an inteinless TerL sequence from related phages, Courthouse and Solon, respectively. Cleavage products (black arrowheads) for both are ~1.3 kb and 0.4 kb. The DNA substrate was mixed with buffer, lysate with overexpressed unrelated TerL intein, or lysate with overexpressed related TerL intein. The numbers indicate the marker size in kb. (D) Sequence identity at TerL intein insertion sites. Sequence flanking the TerL intein insertion site (20 nucleotides up- and downstream) for each phage pair was analyzed, with high sequence identity among pairs (BAKA-Courthouse, 75%; Bethlehem-Solon, 100%). Conservation is shown by shades of gray. Data are representative of at least three independent experiments.
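The comparison in panel D (20 nucleotides up- and downstream of the insertion site, scored as percent identity) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, the sequences are synthetic placeholders rather than phage TerL sequences, and the two windows are assumed to be already aligned without gaps.

```python
def flanking_window(seq, site, insert_len=0, flank=20):
    """Return `flank` nt upstream plus `flank` nt downstream of an insertion site.

    `site` is the 0-based index where the insertion starts; `insert_len` is the
    length of the inserted element (0 for an intein-free sequence).
    """
    start = site - flank
    end = site + insert_len + flank
    if start < 0 or end > len(seq):
        raise ValueError("sequence too short for the requested flanks")
    return seq[start:site] + seq[site + insert_len:end]


def percent_identity(seq_a, seq_b):
    """Percent of identical positions between two equal-length, ungapped sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Synthetic placeholder sequences (hypothetical, not from any phage genome).
inteinless_a = "ACGT" * 15
inteinless_b = "ACGA" * 15
site = 30  # hypothetical insertion coordinate

window_a = flanking_window(inteinless_a, site)
window_b = flanking_window(inteinless_b, site)
print(percent_identity(window_a, window_b))
```

With a 40-nucleotide window, each mismatch lowers the identity by 2.5 percentage points, so a value of 75% corresponds to 10 mismatched positions.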
FIG 4
Putative horizontal transfer of inteins. (A) Evidence of common ancestry among phage and mycobacterial inteins. Phylogenetic analysis (ML) of class 3 mycobacteriophage/mycobacterial inteins (left) and their ATPase-containing exteins (right), excluding RecB. The intein tree shows two examples of supported clustering (red), including mycobacteriophage TerL1-c/e and mycobacterial DnaB-b inteins, indicating a common ancestor. The exteins group independently from their inteins. Inteins were aligned based on splicing blocks; exteins were aligned based on the ATPase domain. Full trees for class 1 and 3 inteins are in Fig. S2 in the supplemental material. (B) Putative horizontal transfer of TdS inteins. Phylogenetic analyses of TdS inteins (left) and TdS proteins (right), some with inteins. Incongruence in clustering of the two trees implies horizontal intein transfer (red). The presence of an intein is indicated by its insertion site a or c. For both panels, trees are unrooted and values for significant external nodes higher than 75% are shown. Scale indicates the number of substitutions per site. Genus abbreviations are as follows: M, Mycobacterium; S, Streptomyces; K, Kitasatospora; Rh, Rhodococcus; N, Nocardia; Mi, Microbacterium; G, Gordonia.
FIG 5
Lack of penultimate residue conservation among mycobacteriophages is modulatory. (A) Differences between class 1 and 3 inteins. Residues of interest in the splicing blocks for each class are boxed. Class 1 inteins initiate splicing using the first cysteine (1; yellow), which acts as a nucleophile and attacks the preceding amide bond (red arrow). In contrast, class 3 inteins use an internal cysteine in block F (yellow) to initiate splicing (red arrow). Both pathways then proceed to completion (black arrows), resulting in excised intein and ligated exteins. The full mechanism for both classes can be found in Fig. S4 in the supplemental material. (B) Disparity of class 1 intein residue. Logos for class 1 blocks of mycobacteriophage and actinobacterial inteins show key residues (colored). The 1 (block A1) and +1 (block G8) residues are marked. The variation of the penultimate His (block G6) is highlighted (blue arrow; shading). (C) Conservation of class 3 intein residues. Comparison between phage and bacterial sequence logos is similar to that in panel B. The class 3 WCT triplet is indicated by red arrows and shading. (D) Mutation of the RDF penultimate residue to His leads to precursor accumulation. The Bethlehem gp51 RDF intein, with an R157W endonuclease-inactivating mutation, was cloned into the MIG reporter construct (RDF Parental) and the penultimate Gly mutated to the canonical His (RDF G316H). Splicing levels were compared, showing a dramatic increase in P accumulation with the G316H mutant relative to Parental. The numbers indicate the marker size in kDa. (E) Splicing of MIG RDF Parental and G316H over time. MIG RDF Parental and G316H lysates were allowed to splice over time. While RDF Parental has faint visible P, it is primarily processed to LE by time zero. In contrast, the RDF G316H mutant is able to slowly splice over time. There are also higher bands that correspond to disulfide-bonded precursor conformers (C). The numbers indicate the marker size in kDa. (F) Quantitation of MIG RDF splicing. The splicing of RDF Parental and G316H over time was quantitated, and the ratios of P+C and LE were plotted. The faint P band visible for RDF Parental was not above background during quantitation. Data are representative of at least three independent experiments.

References

    1. Volkmann G, Mootz HD. 2013. Recent progress in intein research: from mechanism to directed evolution and applications. Cell Mol Life Sci 70:1185–1206. doi: 10.1007/s00018-012-1120-4.
    2. Paulus H. 2000. Protein splicing and related forms of protein autoprocessing. Annu Rev Biochem 69:447–496. doi: 10.1146/annurev.biochem.69.1.447.
    3. Hirata R, Ohsumi Y, Nakano A, Kawasaki H, Suzuki K, Anraku Y. 1990. Molecular structure of a gene, VMA1, encoding the catalytic subunit of H(+)-translocating adenosine triphosphatase from vacuolar membranes of Saccharomyces cerevisiae. J Biol Chem 265:6726–6733.
    4. Kane PM, Yamashiro CT, Wolczyk DF, Neff N, Goebl M, Stevens TH. 1990. Protein splicing converts the yeast TFP1 gene product to the 69-kD subunit of the vacuolar H(+)-adenosine triphosphatase. Science 250:651–657. doi: 10.1126/science.2146742.
    5. Novikova O, Jayachandran P, Kelley DS, Morton Z, Merwin S, Topilina NI, Belfort M. 2016. Intein clustering suggests functional importance in different domains of life. Mol Biol Evol 33:783–799. doi: 10.1093/molbev/msv271.