Biol Direct. 2013 Jun 15:8:15.
doi: 10.1186/1745-6150-8-15.

Comprehensive analysis of the HEPN superfamily: identification of novel roles in intra-genomic conflicts, defense, pathogenesis and RNA processing

Vivek Anantharaman et al. Biol Direct. 2013.

Abstract

Background: The major role of enzymatic toxins that target nucleic acids in biological conflicts at all levels has become increasingly apparent, thanks in large part to advances in comparative genomics. Typically, toxins evolve rapidly, which hampers the identification of these proteins by sequence analysis. Here we analyze an unexpectedly widespread superfamily of toxin domains, most of which possess RNase activity.

Results: The HEPN superfamily comprises all-α-helical domains that were first identified as being associated with DNA polymerase β-type nucleotidyltransferases in prokaryotes and animal Sacsin proteins. Using sensitive sequence and structure comparison methods, we vastly extend the HEPN superfamily by identifying numerous novel families and by detecting diverged HEPN domains in several known protein families. The new HEPN families include the RNase LS and LsoA catalytic domains, KEN domains (e.g. RNaseL and Ire1) and the RNase domains of RloC and PrrC. The majority of HEPN domains contain conserved motifs that constitute a metal-independent endoRNase active site. Some HEPN domains lacking this motif probably function as non-catalytic RNA-binding domains, such as in the case of the mannitol repressor MtlR. Our analysis shows that HEPN domains function as toxins that are shared by numerous systems implicated in intra-genomic, inter-genomic and intra-organismal conflicts across the three domains of cellular life. In prokaryotes, HEPN domains are essential components of numerous toxin-antitoxin (TA) and abortive infection (Abi) systems and in addition are tightly associated with many restriction-modification (R-M) and CRISPR-Cas systems, and occasionally with other defense systems such as Pgl and Ter. We present evidence of multiple modes of action of HEPN domains in these systems, which include direct attack on viral RNAs (e.g. LsoA and RNase LS) in conjunction with other RNase domains (e.g. a novel RNase H fold domain, NamA), suicidal or dormancy-inducing attack on self RNAs (R-M systems and possibly CRISPR-Cas systems), and suicidal attack coupled with direct interaction with phage components (Abi systems). These findings are compatible with the hypothesis of a coupling between pathogen-targeting (immunity) and self-directed (programmed cell death and dormancy induction) responses in the evolution of robust antiviral strategies.
We propose that altruistic cell suicide mediated by HEPN domains and other functionally similar RNases was essential for the evolution of kin and group selection and cell cooperation. HEPN domains were repeatedly acquired by eukaryotes and incorporated into several core functions, such as endonucleolytic processing of the 5.8S-25S/28S rRNA precursor (Las1), a novel ER membrane-associated RNA degradation system (C6orf70), and sensing of unprocessed transcripts at the nuclear periphery (Swt1). Multiple lines of evidence suggest that, as in prokaryotes, HEPN proteins were recruited to antiviral, antitransposon and apoptotic systems, or to the RNA-level response to unfolded proteins (Sacsin and KEN domains), in several groups of eukaryotes.

Conclusions: Extensive sequence and structure comparisons reveal an unexpectedly broad presence of the HEPN domain in an enormous variety of defense and stress response systems across the tree of life. In addition, HEPN domains have been recruited to perform essential functions, in particular in eukaryotic rRNA processing. These findings are expected to stimulate experiments that could shed light on diverse cellular processes across the three domains of life.

Reviewers: This article was reviewed by Martijn Huynen, Igor Zhulin and Nick Grishin.

Figures

Figure 1
Multiple alignment of the HEPN superfamily. The multiple sequence alignment includes the conserved blocks based on the MUSCLE alignment [45], which was corrected manually on the basis of HHpred [46] and PSI-BLAST results [47]. Due to the low similarity, the alignment of helices 1, 2.1 and 4 should be considered tentative. Secondary structure, which is a consensus between the proteins with solved structures, is shown above the alignment; ‘H’ indicates α-helix. The sequences are denoted by their GI numbers and species names. The HEPN family to which each sequence belongs is indicated after the species name. Positions of the first and the last residues of the aligned region in the corresponding protein are indicated for each sequence. The PDB identifiers for proteins with solved structures are indicated on the right. The numbers (of amino acid residues) within the alignment represent poorly conserved inserts that are not shown. The coloring is based on the consensus shown underneath the alignment; ‘h’ indicates hydrophobic residues (WFYMLIVACTH), ‘p’ indicates polar residues (EDKRNQHTS), ‘s’ indicates small residues (ACDGNPSTV). Predicted catalytic amino acids are shown by reverse shading. The GI number and species name are underlined if the HEPN domain has lost the conserved Rx4-6H motif.
Figure 2
Structural diversity of HEPN domains. A member of each of the seven HEPN families with solved crystal structures is rendered as a cartoon; labels provide HEPN family name and PDB ID. Equivalent core helices are colored the same across all structures while labeled in the order observed from the N-terminus to the C-terminus to highlight circular permutations. In the canonical configuration, helix-1 (H1) and helix-2 (H2) from the first α-hairpin are colored green and blue, respectively and helix-3 (H3) and helix-4 (H4) from the second α-hairpin are colored cyan and yellow, respectively. The conserved insert region found between helix-2 and helix-3 in the canonical configuration is colored and labeled in light grey in each cartoon. The kink and further distortions are labeled in yellow. Conserved active site residues are rendered as ball and sticks and colored and labeled in red. Note the structural reorganization of HEPN domain in the Csx1 family. The distinctive β-hairpins of this family are colored and labeled in brown and the zinc ion found in the vicinity of the active site residues is rendered as a sphere and colored in purple.
Figure 3
A domain architecture and gene-neighborhood network showing the manifold functional connections of the HEPN domain. The graphs were rendered using the Cytoscape program [73]. The network is an ordered graph with the cyan edges representing the connection between adjacent domains combined in the same polypeptides and the gold edges representing the context in the gene neighborhood. (A) The “force-directed” network was derived using the spring-embedded layout utilizing the Kamada–Kawai algorithm, which works well for graphs with 50–100 nodes [74]. The natural clustering of the functional categories emerging from this algorithm is indicated with labels. (B) The nodes of the network arranged by function. (C) Condensed network, where the domain belonging to a given functional category has been collapsed into that category name. (D) A domain architecture graph of HEPN and the various N-terminal domains which co-occur with other defense-related domains, showing the interchangeability of HEPN and the defense-related domains.
Figure 4
Selected domain architectures of HEPN proteins. The domains are not drawn to scale. Domain architectures are labeled with a representative gene name, the Genbank identifier (gi) number, and the species name separated by semicolons. The labels of eukaryotes are colored green. The generic functional categories are shown in red letters. Uncharacterized globular domains of limited phyletic spread are shown with a grey rectangle. Domain names of most domains follow the Pfam database or literature [78] (also see Additional file 1). Non-standard domain abbreviations: Ank – Ankyrin; CARF – CRISPR/Cas-associated Rossmann fold domain; PlipaseD – Phospholipase D; Taminase – Transglutaminase; TM – transmembrane helix; Helical – Helical domain.
Figure 5
Selected gene-neighborhoods of HEPN genes. The gene neighborhood data for some of the genes encoding HEPN domain-containing proteins are depicted using arrows. The HEPN gene is marked with an asterisk. The direction of the arrow is the direction of transcription of the gene. The gene name, Genbank identifier (gi), and the species name of the starred gene are shown next to the operon. The multi-gene modules that always co-occur are boxed. The cartoon representations of the genes are not drawn to scale. The depicted operons are typically representative of a type of operon found in a range of diverse organisms. Domain names of most domains follow the Pfam database or literature [78] (also see Additional file 1). Non-standard abbreviations: RM_TRD, restriction-modification target recognition domain.
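
The Figure 1 legend names a conserved Rx4-6H motif and three consensus residue classes (‘h’, ‘p’, ‘s’). As a rough illustration only, and not the authors' detection method (which relied on MUSCLE, HHpred and PSI-BLAST profile searches), the sketch below assumes that Rx4-6H can be read as an arginine, then any 4 to 6 residues, then a histidine. The function names and the toy sequence are hypothetical; the residue sets are copied from the legend.

```python
import re

# Residue classes quoted from the Figure 1 legend.
CLASSES = {
    "h": set("WFYMLIVACTH"),  # hydrophobic
    "p": set("EDKRNQHTS"),    # polar
    "s": set("ACDGNPSTV"),    # small
}

# Assumed reading of Rx4-6H: Arg, then 4 to 6 residues of any kind, then His.
# The lookahead reports one candidate per Arg position, so candidates
# that overlap each other are not skipped.
MOTIF = re.compile(r"(?=(R.{4,6}H))")


def find_rx4_6h(seq):
    """Return (0-based start, matched text) for each candidate motif."""
    return [(m.start(), m.group(1)) for m in MOTIF.finditer(seq.upper())]


def residue_classes(residue):
    """Return the legend classes ('h', 'p', 's') that contain a residue."""
    return sorted(k for k, members in CLASSES.items() if residue.upper() in members)


if __name__ == "__main__":
    toy = "MKRAAAAHLL"  # hypothetical sequence, not taken from the alignment
    print(find_rx4_6h(toy))
    print(residue_classes("H"))
```

A plain pattern match like this will also flag chance occurrences of R and H at the right spacing, so it can at best pre-screen sequences; it says nothing about whether the residues sit in the structural context shown in Figure 2.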
