Protein Sci. 2022 Dec;31(12):e4483.
doi: 10.1002/pro.4483.

Analysis of insertions and extensions in the functional evolution of the ribonucleotide reductase family

Audrey A Burnim et al. Protein Sci. 2022 Dec.

Abstract

Ribonucleotide reductases (RNRs) are used by all free-living organisms and many viruses to catalyze an essential step in the de novo biosynthesis of DNA precursors. RNRs are remarkably diverse in primary sequence and cofactor requirement, while sharing a conserved fold and radical-based mechanism for nucleotide reduction. In this work, we expand on our recent phylogenetic inference of the entire RNR family and describe the evolutionary relatedness of insertions and extensions around the structurally homologous catalytic barrel. Using evo-velocity and sequence similarity network (SSN) analyses, we show that the N-terminal regulatory motif known as the ATP-cone domain was likely inherited from an ancestral RNR. By combining SSN analysis with AlphaFold2 predictions, we also show that the C-terminal extensions of class II RNRs can contain folded domains that share homology with an Fe-S cluster assembly protein. Finally, using sequence analysis and AlphaFold2, we show that the sequence motif of a catalytically essential insertion known as the finger loop is tightly coupled to the catalytic mechanism. Based on these results, we propose an evolutionary model for the diversification of the RNR family.

Keywords: allostery; evo-velocity; evolution; structure prediction.

Figures

FIGURE 1
Phylogeny of the ribonucleotide reductase (RNR) family. (a) RNRs share a common barrel topology composed of two halves, each consisting of a 5‐stranded β‐sheet (βA‐βE and βF‐βJ), that are connected by the so‐called “finger loop” (yellow), which contains a conserved cysteine that has been shown to be the site of the catalytically essential thiyl radical in all biochemically characterized RNRs. RNRs are diversified by N‐ and C‐terminal extensions and insertions between the secondary structure elements in the catalytic barrel. Loops 1–3 (dark green, red, blue) are involved in allosteric regulation, and the gray secondary structure elements are involved in substrate binding (starred). (b) Previously published unrooted phylogeny of 6,779 RNR α sequences. Host organisms of structurally characterized RNRs are shown in circles. In clockwise order, class I: Fj, Flavobacterium johnsoniae; Au, Actinobacillus ureae; Ec, Escherichia coli; Ng, Neisseria gonorrhoeae; Bs, Bacillus subtilis; St, Salmonella typhimurium; Aa, Aquifex aeolicus; Pa, Pseudomonas aeruginosa; Sc, Saccharomyces cerevisiae; Hs, Homo sapiens; class II: Ll, Lactobacillus leichmannii; Tm, Thermotoga maritima; Rs, Rhodobacter sphaeroides; and class III: T4, Bacteriophage T4; Tm, Thermotoga maritima. The exact sequences for organisms in dashed circles are not in the tree; however, their approximate locations are represented by sequences with ≥80% sequence identity. (c) The same phylogeny shown in panel (b), rooted at the midpoint and collapsed according to major clades. Gene‐based nomenclature is shown.
FIGURE 2
The ATP‐cone domain likely has a single origin. (a) A sequence similarity network (SSN) of ATP‐cone domain sequences isolated from 2,653 α sequences in our dataset. Each node is a single ATP‐cone sequence, and each connecting edge indicates an alignment score greater than 20.5. Circular nodes correspond to the N‐terminal ATP‐cones. Diamond nodes with gray outlines correspond to the second cones in multiple‐ATP‐cone domains, while hexagons with gray outlines correspond to the third. Gray‐filled nodes are not represented on the circumference of the tree in panel (b) for clarity. (b) Phylogenetic tree from Figure 1, rooted at the midpoint and labeled with color strips corresponding to colors of the major SSN clusters in panel (a). The outer color strips correspond to N‐terminal ATP‐cones (circular nodes in SSN), and the inner color strips correspond to the second copies in multiple‐ATP‐cones (diamond nodes in SSN). Structurally characterized ATP‐cone containing sequences are labeled by host organism: Ec, Escherichia coli; Aa, Aquifex aeolicus; Pa, Pseudomonas aeruginosa; Sc, Saccharomyces cerevisiae; Hs, Homo sapiens. The exact sequence for Pa (dashed circle) is not on the tree; however, its approximate location is represented by sequences with 88% sequence identity. (c–e) Evo‐velocity analysis of isolated ATP‐cone sequences. ESM‐1b embedded ribonucleotide reductase (RNR) sequences projected onto a two‐dimensional vector field plot, where the horizontal and vertical axes are UMAP 1 and 2, respectively. Each colored point in the plot corresponds to an ATP‐cone sequence. (c) Vector field plot colored by the classification of RNRs from which the ATP‐cone sequence is pruned: class I (blue), class II (yellow), class III (red). (d) Vector field plot colored by pseudotime, a proxy for phylogenetic depth. Indigo (pseudotime = 0) represents ancestral sequences, and yellow (pseudotime = 1) represents sequences that have diverged the most from ancestral sequences. (e) Vector field plot colored by SSN coloring shown in panel (a).
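The edge rule in panel (a) of the Figure 2 caption (an edge wherever the alignment score exceeds 20.5) can be illustrated with a minimal sketch. It is not the authors' pipeline: the sequence names and pairwise scores are invented for illustration, and the helper names `build_ssn` and `ssn_clusters` are hypothetical. Only the 20.5 cutoff comes from the caption.

```python
SCORE_CUTOFF = 20.5  # edge threshold quoted in the Figure 2 caption


def build_ssn(pairwise_scores, cutoff=SCORE_CUTOFF):
    """Build an undirected graph: one node per sequence, one edge per
    pair whose alignment score is strictly greater than the cutoff."""
    graph = {}
    for (seq_a, seq_b), score in pairwise_scores.items():
        # Register both nodes so that isolated sequences are kept.
        graph.setdefault(seq_a, set())
        graph.setdefault(seq_b, set())
        if score > cutoff:
            graph[seq_a].add(seq_b)
            graph[seq_b].add(seq_a)
    return graph


def ssn_clusters(graph):
    """Return connected components (SSN clusters), largest first."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node] - seen)
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Hypothetical scores, for illustration only.
example_scores = {
    ("cone_A", "cone_B"): 35.0,
    ("cone_B", "cone_C"): 22.1,
    ("cone_A", "cone_C"): 12.0,
    ("cone_C", "cone_D"): 20.5,  # equal to the cutoff, so no edge
    ("cone_D", "cone_E"): 8.0,
}
example_clusters = ssn_clusters(build_ssn(example_scores))
```

With these invented scores, `cone_A`, `cone_B` and `cone_C` fall into one cluster, while `cone_D` and `cone_E` remain singletons. Raising the cutoff splits clusters and lowering it merges them, so the cluster colors in an SSN depend on the chosen threshold.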
FIGURE 3
Topology of the class I ribonucleotide reductase clade provides a parsimonious model for the loss and gain of ATP‐cones. (a) Collapsed view of the class I clade pruned from the tree in Figure 1b. The tree is labeled with SH‐aLRT/UFboot2 supports on important nodes. Well‐supported branches (SH‐aLRT ≥ 80% and UFboot ≥ 95%) are shown as light green nodes; nodes with lower support are black. The branches are clustered and numbered for reference in the text. The pie charts to the right of each clade show the distribution of superkingdom, where white corresponds to sequences that are unannotated in the January 2022 release of UniProt. (b) The expanded class I clade, colored as in panel (a). In Group 3 (NrdAq/An/Ay/Az), branches shown in light blue correspond to putative class Ic α sequences; otherwise, they are colored in dark blue. Branches in NrdE colored in magenta are putative class Ie sequences. Tips of branches are marked by the number of ATP‐cones on the sequence, if present: one domain (yellow squares), two domains (blue triangles), and three domains (magenta stars). Class I sequences discussed in the text are indicated by circled organism IDs in clockwise order: Hs, Homo sapiens; Fj, Flavobacterium johnsoniae; Au, Actinobacillus ureae; Cb, Clostridium botulinum; Ec, Escherichia coli; Bs, Bacillus subtilis; Sp, Streptococcus pyogenes; Aeu, Aerococcus urinae; St, Salmonella typhimurium; Aa, Aquifex aeolicus; Ct, Chlamydia trachomatis; Pa, Pseudomonas aeruginosa; Sc, Saccharomyces cerevisiae. Organism IDs in dashed circles are not on the tree but their locations are approximated by sequences with ≥80% sequence identity. Classes II, III, and Ø are shown collapsed at the root. Structurally characterized class I α subunits are shown in their active dimeric forms or inhibited higher oligomer forms. The ATP‐cone domain, where present, is colored in gold.
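The support criterion in panel (a) of the Figure 3 caption (SH-aLRT ≥ 80% and UFboot ≥ 95%) amounts to a simple two-threshold test. The sketch below assumes node labels are stored as an "SH-aLRT/UFBoot" string such as `"92.3/98"`; that label format and the function names are assumptions for illustration, not details stated in the source.

```python
SH_ALRT_MIN = 80.0  # threshold from the Figure 3 caption
UFBOOT_MIN = 95.0   # threshold from the Figure 3 caption


def parse_support_label(label):
    """Split an assumed 'SH-aLRT/UFBoot' node label into two floats."""
    sh_alrt, ufboot = label.split("/")
    return float(sh_alrt), float(ufboot)


def is_well_supported(label):
    """True only when both support values meet their thresholds."""
    sh_alrt, ufboot = parse_support_label(label)
    return sh_alrt >= SH_ALRT_MIN and ufboot >= UFBOOT_MIN


def node_color(label):
    """Mirror the caption's color scheme for a single node."""
    return "light green" if is_well_supported(label) else "black"
```

For example, `node_color("92.3/98")` gives `"light green"`, whereas `node_color("85/90")` gives `"black"` because the bootstrap value falls below 95 even though SH-aLRT passes.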
FIGURE 4
Diversity of the class II sequences is found in the C‐terminal tail. (a) Class II clade pruned from tree in Figure 1b. Sequences predicted to represent monomeric class II ribonucleotide reductases (RNRs) are shown as peach color strips in the innermost ring of the circumference. The middle color strips correspond to the length of the C‐terminal tails, where light blue is short and black is long, as depicted in panel (b). The outer color strips correspond to the SSN clusters shown in panel (c). (b) The C‐terminal length distribution is bimodal. (c) An SSN of the C‐terminal sequences. Only the largest seven clusters with the most sequences are shown for clarity. (d) Structure prediction of a representative C‐terminal sequence (Thermoaerobacter marianensis) from Cluster 2 (pink) in panel (c). The left domain shows structural homology to the IscU protein. The right domain is a zinc finger. (e) Crystal structure of IscU protein (PDB: 7c8m). (f) Glycyl radical domain of the class III RNR (PDB: 1h79). In panels (d–f), sulfur and iron are colored in yellow and brown, respectively. In the structure shown in panel (f), the glycine associated with radical formation was mutated to an alanine and the Cα carbon is shown as a red sphere.
FIGURE 5
Class III diversity is driven by active‐site and finger loop motifs. (a) Class III clade pruned from the tree in Figure 1b. The innermost color strip defines the two major subclassifications of class III α subunits based on the number of cysteines in the finger‐loop motif: single‐cysteine (black) and double‐cysteine (orange). The middle color strips are green if the finger‐loop motif begins with a methionine. The outer color strips are blue if the active site contains a glutamate as a putative proton source. The gold wedges highlight clades where a glutamate is present but not a methionine in the finger‐loop motif, characteristics associated with only using thioredoxin as the reductant. The purple wedge highlights the subclade that contains neither a glutamate nor a methionine; these sequences have been previously subclassified as NrdD3 and thought to employ a novel reduction mechanism. Sequences with known structures are labeled as T4, Enterobacteria phage T4 (PDB: 1h7a) and Tm, Thermotoga maritima (PDB: 4u3e). (b and c) Slice‐through views of predicted models of the class III α from Coprothermobacter proteolyticus (UniProtID B5Y6I2 is mapped in panel [a]). The finger‐loop is colored yellow, and the cysteine sulfur atoms are shown as spheres. The glycyl radical domain is shown in purple. In panel (b), the finger‐loop is predicted to have both cysteines on the tip, whereas in panel (c), the finger loop is in the same conformation as in the T. maritima structure with the cysteines at the back of the barrel.
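The color-strip rules in panel (a) of the Figure 5 caption can be written as a small classifier. This is a sketch under stated assumptions: the finger-loop motif is taken to be already extracted as a short one-letter amino-acid string, the presence of the active-site glutamate is supplied as a boolean, and the motif strings in the usage note are invented. How the motif is located in a full sequence is not described in the caption and is not attempted here.

```python
def classify_class_iii(finger_loop_motif, has_active_site_glutamate):
    """Apply the Figure 5a annotation rules to one class III sequence.

    finger_loop_motif: pre-extracted motif, one-letter codes (assumed input).
    has_active_site_glutamate: whether a Glu is present in the active site.
    """
    motif = finger_loop_motif.upper()
    cysteine_count = motif.count("C")
    if cysteine_count == 1:
        cysteine_class = "single-cysteine"   # black inner strip
    elif cysteine_count == 2:
        cysteine_class = "double-cysteine"   # orange inner strip
    else:
        cysteine_class = "unclassified"      # not covered by the caption

    starts_with_met = motif.startswith("M")  # green middle strip

    # Wedge highlights described in the caption.
    if has_active_site_glutamate and not starts_with_met:
        highlight = "gold"    # associated with thioredoxin-only reduction
    elif not has_active_site_glutamate and not starts_with_met:
        highlight = "purple"  # previously subclassified as NrdD3
    else:
        highlight = None

    return {
        "cysteine_class": cysteine_class,
        "starts_with_met": starts_with_met,
        "has_glutamate": has_active_site_glutamate,
        "highlight": highlight,
    }
```

As a usage example with invented motifs, `classify_class_iii("ACGS", True)` is reported as single-cysteine with a gold highlight, and `classify_class_iii("ACGC", False)` as double-cysteine with a purple highlight. A motif starting with methionine receives no wedge highlight under these rules.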
FIGURE 6
Proposed schematic of ribonucleotide reductase (RNR) functional evolution. The ancestral RNR likely resembled a glycyl radical enzyme with an N‐terminal ATP‐cone (orange bundle), a finger loop (yellow), and a C‐terminal glycyl radical domain (purple ellipse). In class III RNRs, the barrels dimerize such that the active sites point in opposite directions. The last common ancestor of the aerobic RNRs likely had an N‐terminal ATP‐cone and a cysteine‐rich C‐terminus, possibly recruiting a ferritin subunit for radical generation. The aerobic RNRs further specialized into ferritin or AdoCbl (magenta diamond) usage. ATP‐cones were lost in the class Ø RNRs and largely lost in the class II RNRs.
