Review

Chem Rev. 2024 Oct 9;124(19):11008-11062.

doi: 10.1021/acs.chemrev.4c00243. Epub 2024 Sep 5.

Engineering Pyrrolysine Systems for Genetic Code Expansion and Reprogramming

Daniel L Dunkelmann¹ ², Jason W Chin¹

Affiliations

¹ Medical Research Council Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, England, United Kingdom.
² Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany.

PMID: 39235427
PMCID: PMC11467909
DOI: 10.1021/acs.chemrev.4c00243

Daniel L Dunkelmann et al. Chem Rev. 2024.

Abstract

Over the past 16 years, genetic code expansion and reprogramming in living organisms has been transformed by advances that leverage the unique properties of pyrrolysyl-tRNA synthetase (PylRS)/tRNA^Pyl pairs. Here we summarize the discovery of the pyrrolysine system and describe the unique properties of PylRS/tRNA^Pyl pairs that provide a foundation for their transformational role in genetic code expansion and reprogramming. We describe the development of genetic code expansion, from E. coli to all domains of life, using PylRS/tRNA^Pyl pairs, and the development of systems that biosynthesize and incorporate ncAAs using pyl systems. We review applications that have been uniquely enabled by the development of PylRS/tRNA^Pyl pairs for incorporating new noncanonical amino acids (ncAAs), and strategies for engineering PylRS/tRNA^Pyl pairs to add noncanonical monomers, beyond α-L-amino acids, to the genetic code of living organisms. We review rapid progress in the discovery and scalable generation of mutually orthogonal PylRS/tRNA^Pyl pairs that can be directed to incorporate diverse ncAAs in response to diverse codons, and we review strategies for incorporating multiple distinct ncAAs into proteins using mutually orthogonal PylRS/tRNA^Pyl pairs. Finally, we review recent advances in the encoded cellular synthesis of noncanonical polymers and macrocycles and discuss future developments for PylRS/tRNA^Pyl pairs.

Conflict of interest statement

The authors declare the following competing financial interest(s): J.W.C. is a founder of Constructive Bio. D.L.D. is a consultant for Constructive Bio.

Figures

**Figure 1**
**Encoded cellular incorporation of Pyl at amber codons, via natural genetic code expansion. a**, The chemical structure of **Pyl**. b, The amber suppressor tRNA, tRNA^Pyl_CUA, is selectively charged by PylRS with **Pyl**. EF-Tu transports the aminoacylated pyl-tRNA^Pyl_CUA to the ribosome, where **Pyl** is site-specifically incorporated into a protein in response to an amber stop (UAG) codon in the mRNA. Adapted from Dunkelmann et al. – copyright © The Author(s) 2024 CCBY http://creativecommons.org/licenses/by/4.0/.

**Figure 2**
**Pyl biosynthesis is mediated by *pylBCD*. a**, Operon structure of the gene cluster *pylTSBCD* in the archaeon *Methanosarcina acetivorans* and the bacterium *Desulfitobacterium hafniense*. b, Biosynthetic pathway of **Pyl** from two molecules of lysine mediated by pylBCD. First, the radical S-adenosyl methionine (SAM) enzyme PylB converts lysine into (3R)-3-methyl-D-ornithine (**3MO**). Subsequently PylC ligates a second lysine to **3MO** and PylD oxidizes the terminal amine of the conjugate to an aldehyde. The pyrroline ring is then spontaneously formed by a condensation reaction.

**Figure 3**
**PylRS/tRNA^Pyl pair nomenclature. a**, Division of PylRS/tRNA^Pyl pairs into three groups and five classes. The groups are defined by the architecture of the PylRS enzyme. The +N group contains PylRS enzymes where PylRSn and PylRSc are covalently connected by a flexible linker, the ΔN group comprises PylRS enzymes lacking PylRSn in their host genome, and the sN group is composed of PylRS enzymes where PylRSn and PylRSc are produced *in trans* from distinct genes. The classes (N, A, B, C, and S) represent a finer subdivision of the pyl system based on sequence identity clustering of PylRS and tRNA^Pyl sequences and the aminoacylation specificity of the PylRS/tRNA^Pyl pairs with respect to each other. b, Nomenclature used for PylRS enzymes and pyl tRNAs in this review. The tRNA nomenclature is in line with International Union of Pure and Applied Chemistry (IUPAC) rules and extended to include pyl class information, as well as tRNA^Pyl variant information. We note that when referring to tRNA^Pyl in plural, we write pyl tRNAs, in accordance with IUPAC rules. c, Numbering of residues in tRNA^Pyl. The numbering is in line with the general convention for tRNAs according to Sprinzl et al. However, some common nucleotides are missing in pyl tRNAs (9, 16, 17, 18), and some unusual nucleotides are present (25a, 25b, 42a). Nucleotides in dark gray are present in all described pyl tRNAs; nucleotides in light gray are present in some pyl tRNAs.

**Figure 4**
**Pyl, PNP-AMP and Pyl-AMP binding in the active site of N⁺-MmPylRS. a**, Binding of **Pyl** and PNP-AMP in the deep hydrophobic pocket of the active site of N-MmPylRSc (PDB 2ZCE). Direct hydrogen bonds are formed between the primary (backbone) carbonyl of **Pyl** and R330 as well as the secondary carbonyl (side chain) of **Pyl** and N346. The α-amine forms a hydrogen bond with a coordinated water molecule. **Pyl** and the interaction partners are shown in stick representation, N-MmPylRSc is shown as a transparent electrostatic surface (red negatively charged, white noncharged, blue positively charged). b, Recognition of **Pyl**-AMP by N-MmPylRSc (PDB 2Q7H). **Pyl**-AMP forms the same direct hydrogen bonding network with N-MmPylRSc as observed for **Pyl** in the structure shown in panel a, with an additional hydrogen bond being formed between the α-amine of **Pyl** and Y384. Y384 is part of a flexible loop which closes the active site and was not visible in the crystal structure depicted in panel a. **Pyl**-AMP and the interacting amino acids within PylRS are shown in stick representation, PylRS is shown as a transparent electrostatic surface (red negatively charged, white noncharged, blue positively charged).

**Figure 5**
**The PylRS:tRNA^Pyl binding interface. a**, Crystal structure of N-MmPylRSn bound to N-MmtRNA^Pyl (PDB 5UD5). N-MmPylRSn interacts with the variable loop (dark blue), D-stem (cyan), T-loop (purple), and T-stem (light blue) of N-MmtRNA^Pyl. b, Crystal structure of S^Δ-DhPylRS in complex with S-DhtRNA^Pyl (PDB 2ZNI). S^Δ-DhPylRS forms a dimer in the crystal structure and *in vivo* where each protomer (colored in two shades of green) predominantly interacts with one tRNA^Pyl, while forming some interactions with the second tRNA^Pyl.

**Figure 6**
**Double-sieve selection strategy for directed evolution of ncAA specificity in orthogonal aaRS enzymes**. Aminoacyl-tRNA synthetase libraries are first submitted to a round of positive selection in the presence of the target ncAA. In this step, the acylation of a suppressor tRNA and ribosomal translation through an amber codon in a positive selection marker mRNA (frequently chloramphenicol acetyltransferase) is linked to cell survival. Next, surviving library members are submitted to a negative selection step in the absence of the ncAA, where acylation of a suppressor tRNA and ribosomal translation through an amber codon in a negative selection marker (frequently barnase) is linked to cell death. Aminoacyl-tRNA synthetase variants that selectively charge the target ncAA, and no canonical amino acids, onto their cognate tRNA survive both steps of selection. For this selection approach to work, ncAAs must function with the ribosome and other translation factors. Additional rounds of selection can be performed. Adapted with permission from Chin et al. – copyright © 2014 Annual Reviews.
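The selection logic in this caption can be sketched as a pair of filters. The following is a minimal illustration, not the authors' protocol: the `Variant` class, its boolean flags, and the four example library members are hypothetical, and real selections depend on graded activities rather than yes/no behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical synthetase library member (illustrative only)."""
    name: str
    charges_ncaa: bool       # acylates the suppressor tRNA with the target ncAA
    charges_canonical: bool  # acylates the suppressor tRNA with a canonical amino acid


def suppresses_amber(variant: Variant, ncaa_present: bool) -> bool:
    """Amber suppression needs an acylated suppressor tRNA."""
    return variant.charges_canonical or (ncaa_present and variant.charges_ncaa)


def positive_selection(library):
    # ncAA present: read-through of the amber codon in the survival marker
    # (for example chloramphenicol acetyltransferase) keeps the cell alive.
    return [v for v in library if suppresses_amber(v, ncaa_present=True)]


def negative_selection(library):
    # ncAA absent: read-through of the amber codon in the toxic marker
    # (for example barnase) kills the cell.
    return [v for v in library if not suppresses_amber(v, ncaa_present=False)]


def double_sieve(library, rounds: int = 1):
    survivors = list(library)
    for _ in range(rounds):
        survivors = negative_selection(positive_selection(survivors))
    return survivors


library = [
    Variant("selective", charges_ncaa=True, charges_canonical=False),
    Variant("promiscuous", charges_ncaa=True, charges_canonical=True),
    Variant("canonical_only", charges_ncaa=False, charges_canonical=True),
    Variant("inactive", charges_ncaa=False, charges_canonical=False),
]
selected = [v.name for v in double_sieve(library)]
print(selected)
```

In this toy model only the variant that charges the ncAA and no canonical amino acid passes both sieves; the inactive variant is lost in the positive step, and the two variants that charge canonical amino acids are lost in the negative step.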

**Figure 7**
**Generation of chimeric aminoacyl-tRNA synthetase/tRNA (chRS/chtRNA) pairs for genetic code expansion**. The tRNA binding function of PylRS enzymes can be coupled to the catalytic domain of certain canonical aaRS enzymes forming orthogonal chRS/chtRNA pairs. Like N⁺-MmPylRS, *E. coli* histidyl-tRNA synthetase (EcHisRS) has two distinct domains connected by a flexible linker. One domain is responsible for tRNA binding (C-terminal domain - HisRSc), and one for the catalytic activity (N-terminal domain - HisRSn). The catalytic domain (CD) predominantly interacts with the acceptor stem of EctRNA^His and the tRNA binding domain (TBD) predominantly interacts with the anticodon stem and loop of EctRNA^His. A chRS is generated through the fusion of the TBD of N⁺-MmPylRS (which predominantly interacts with the T-, and D-stem, as well as the T- and variable loop of N-MmtRNA^Pyl) with the CD of EcHisRS. The combination of the chRS with the engineered chtRNA, where the acceptor stem in N-MmtRNA^Pyl was replaced with the acceptor stem of EctRNA^His, resulted in an orthogonal chRS/chtRNA pair. This pair combined the aminoacylation specificity of EcHisRS with the tRNA recognition and orthogonality of N⁺-MmPylRS and could be used in prokaryotic and mammalian cells. A chimeric PheRS/tRNA^Phe pair was also engineered. Adapted from Ding et al. – copyright © The Author(s) 2020 CCBY http://creativecommons.org/licenses/by/4.0/.

**Figure 8**
**PylRS/tRNA^Pyl pairs are orthogonal and have been used for ncAA incorporation across all domains of life.** PylRS/tRNA^Pyl pairs can be engineered for ncAA specificity in *E. coli* cells, using directed evolution approaches, and then used for genetic code expansion in diverse host organisms including bacteria, archaea, eukaryotic cells, a plant species, and several animals.

**Figure 9**
**Virus-assisted directed evolution of tRNAs (VADER) in mammalian cells:** tRNA^Pyl libraries are encoded in the DNA of adeno-associated viruses 2 (AAV2), such that each virus (hexagon) only carries one tRNA variant. The replication of the virus is coupled to amber suppression. N⁺-MmPylRS dependent amber suppression leads to the incorporation of an azide functionality on the surface of the virus. Viruses harboring a selective and active N-MmtRNA^Pyl can be isolated on streptavidin beads by bio-orthogonal labeling and either submitted to additional rounds of evolution, or further characterization by single colony sequencing or NGS. Adapted from Jewel et al. – copyright © The Author(s) 2020 CCBY-NC-ND https://creativecommons.org/licenses/by-nc-nd/4.0/.

**Figure 10**
**Exploiting substrate promiscuity and engineering the pyl pathway for ncAA incorporation. a**, PylC and PylD can accept D-ornithine (DO), or S-ethynyl-D-ornithine (EO) forming desmethylpyrrolysine (**dPyl**) with or without an alkyne substituent on the pyrroline ring. The imine of **dPyl** can be functionalized with 2-amino-benzaldehyde (**2-ABA**) or 2-amino-acetophenone (**2-AAP**) derived compounds. Furthermore, the alkyne can be labeled in bio-orthogonal reactions resulting in the double labeling of recombinant proteins. b, PylC can be engineered to accept D-cysteine forming D-cysteinyl-ε-lysine (**CεK**), which can be incorporated into recombinant proteins by an engineered N⁺-MbPylRS/N-MbtRNA^Pyl pair and used to cyclize proteins which contain intein-derived C-terminal thioesters.

**Figure 11**
**PylRS-mediated genetic encoding of ncAAs corresponding to post-translationally modified forms of canonical amino acids. a**, Acylated derivatives of lysine for which PylRS variants have been evolved. The structures include non-natural PTM mimetics; trifluoroacetyl-lysine (not shown) can also be incorporated. b, Strategy to genetically direct the succinyl-lysine (**SucK**) and glutaryl-lysine (**GluK**) in proteins. c, Protected versions of N^ϵ-methyl-L-lysine (**MeK**), which have been genetically encoded into proteins. Following incorporation, the ncAAs have been deprotected to produce proteins with monomethylated lysine residues at specific sites in proteins. **d, e**, Strategies to generate proteins bearing dimethylated lysine residues. **f to i**, Strategies to generate site-specifically ubiquitinated proteins containing one to two non-natural linkages between lysine and ubiquitin. j, Strategy to genetically direct a natural lysine-ubiquitin linkage.

**Figure 12**
Non-canonical amino acids, corresponding to caged canonical amino acids, that can be genetically encoded into proteins and deprotected to the corresponding canonical amino acids. a, Schematic representation of the deprotection of ncAAs, corresponding to caged canonical amino acids, to canonical amino acids. The deprotection can be done with light (orange structures in panels b-h), by the addition of a small molecule (blue structures in panels b-h, with deprotecting agent in gray), or enzymatically (purple structures). b to h Genetically encoded ncAAs that can be deprotected to lysine, tyrosine, (homo)-cysteine, selenocysteine, aspartic acid, histidine and tryptophan derivatives, respectively.

**Figure 13**
**Bio-orthogonal handles that have been genetically encoded using (engineered) PylRS/tRNA^Pyl pairs for protein labeling. a to e**, Schematic representations of some commonly used bio-orthogonal reactions for genetic code expansion mediated site-specific labeling of proteins. (a) CuAAC, (b) SPAAC, (c) KHC, (d) ACC, and (e) iEDDA. f, Structures of some of the most important bio-orthogonal handles used to date. Noncanonical amino acids bearing **Nor**, **TCO**, **Cyp**, as well as **BCN** groups have all been genetically encoded using engineered PylRS/tRNA^Pyl pairs and extensively used for iEDDA-based labeling, and SPAAC-based labeling. g, List of lysine derivatives bearing a variety of bio-orthogonal handles including azides, ketones, tetrazine, strained alkene, and (strained) alkynes. h, As in (g) but for alanine derivatives. i, As in (g) but for phenylalanine derivatives. Parts of this figure are reprinted with permission from Lang, K.; Chin, J. W. Bioorthogonal Reactions for Labeling Proteins. ACS Chem. Biol. 2014, 9, 16–20 - copyright © 2014 American Chemical Society.

**Figure 14**
**Site-specific double labeling of proteins. a**, Schematic representation of a protein cyclization mediated by encoded ncAAs containing bio-orthogonal groups, together with the chemistry used in the first genetically programmed, ncAA-mediated, protein cyclization. b, Schematic representation of protein double labeling, distinct encoded ncAAs (stars) are labeled with complementary functional groups in sequential or one pot, concerted labeling reactions. Examples of protein double labeling are shown. The bio-orthogonal handles are colored with respect to the bio-orthogonal reaction for which they were used in the initial examples: ACC (gray), CuAAC (light blue), KHC (purple), iEDDA (red), SPAAC (blue), and CRACR (orange). Labeling was sequential (S) or concerted (C). The exact reaction partners and conditions for each reaction are provided in the indicated references and we note that the mutual orthogonality of many reactions will rely on the exact molecules used, and the reaction conditions. For sequential labeling the order of labeling is indicated. Citations labeled in describe reactions performed in mammalian cells. c, Schematic representation of the encoding of a PTM together with a bio-orthogonal handle; the ncAA corresponding to a post translationally modified canonical amino acid (green star) can bind its specific readers (green blob) and the PTM containing protein can be enriched using bio-orthogonal reactions. A specific example, which was established for potential applications in mammalian cells is shown. d, Genetically encoding single ncAAs bearing two bio-orthogonal handles for protein double, and triple labeling. The bio-orthogonal handles are colored with respect to the bio-orthogonal reaction for which they were used: KHC (purple), iEDDA (red) and SPAAC (blue). Labeling order of functional groups is specified when not concerted (C).

**Figure 15**
**Protein cross-linking using genetically encoded ncAAs. a**, Schematic representation of a method to identify protein–protein interactions by genetic code expansion mediated site-specific installation of cross-linking ncAAs. Proteins bearing either a photo- (orange) or proximity- (purple) inducible cross-linking ncAA are used in cells to capture interaction partners. The cross-linked proteins are analyzed (by gel electrophoresis or mass spectrometry-based methods) to define interaction partners of the target proteins and the sites of interaction. In some cases cross-links between specific proteins have been used to stabilize complexes for structural studies. b, Phenylalanine derived cross-linking ncAAs incorporated by engineered PylRS variants. c, Lysine derived cross-linking ncAAs incorporated by engineered PylRS variants. d, Schematic representation of the bait and prey concept for a trifunctional ncAA; the bait protein, yellow, contains a ncAA with a cleavable linker, a photo cross-linking functionality and a bio-orthogonal handle. Illumination activates the cross-linker to cross-link to a prey protein, the bait protein is released to facilitate analysis of the resulting stump on the prey protein, and the bio-orthogonal handle is used to pull down the covalent bait-prey complex for analysis by MS. Examples of ncAAs used for this approach are shown; in some cases the site of cleavage at Se is also used for capture and enrichment. e, Schematic representation of a bifunctional ncAA containing both a photo-cross-linking functionality and a PTM; the PTM interacts with a binding protein which is covalently captured, upon illumination, by cross-linking. f, Distinct ncAAs bearing photo-cross-linkers and PTMs of canonical amino acids can be encoded in a single protein; this provides an alternate route to capturing PTM specific interactions.

**Figure 16**
**Trapping acyl-enzyme intermediates by PylRS mediated genetic encoding of photocaged 2,3-diaminopropionic acid (pcDap). a**, Active site serines or cysteines react with the carbonyl groups of their target forming an acyl-enzyme intermediate. The active enzyme is regenerated by the nucleophilic substitution of the intermediate with a hydroxyl, amine, or thiol functionality (R3). b, By introducing **Dap**, instead of the catalytic cysteine or serine, a cleavage resistant acyl-intermediate may be formed. c, Light activation of a genetically encoded **pcDap** to **Dap**. d, Genetically encoding **pcDap** and conversion to **Dap** in target proteins enables the N-terminal fragment of protein and peptide substrates (or the analogous portion of other classes of substrate molecules) to be covalently captured. *In vitro* experiments with defined substrates enable structural studies of acyl-enzyme intermediates. Experiments in cell lysates or live mammalian cells, using tagged hydrolases, enable substrate identification by MS. Panels a and b adapted with permission from Huguenin-Dezot et al. - copyright © 2018 Springer Nature Limited.

**Figure 17**
Genetic code expansion mediated translation control for the production of attenuated viruses and the ncAA induced restoration of circadian rhythms. a, Viral genomes were engineered to contain UAG stop codons in essential viral protein coding genes. Viruses were produced in host cells encoding PylRS/tRNA^Pyl pairs. The production of the viruses was dependent on the presence of the PylRS/tRNA^Pyl pair as well as its ncAA substrate. The viruses were then harvested and used to vaccinate animals. Because the animals do not contain amber suppressor tRNAs, the viruses cannot reproduce in the animal; this approach provides a strategy for generating attenuated viruses for immunization. Adapted with permission from Chin et al. - copyright © 2017 Springer Nature Limited. b, By making the production of a regulatory protein of circadian rhythms (Cry1) dependent on the ncAA induced N⁺-MmPylRS/N-MmtRNA^Pyl pair mediated translation, the circadian rhythm of otherwise arrhythmic (Cry1/2 null) mice, as measured by their wheel running behavior, can be induced by providing the mice with the ncAA in their drinking water. When the ncAA is withdrawn, the circadian rhythm is switched off again. The data show a circadian reporter (per2-luciferase) as measured by luciferase levels in suprachiasmatic nucleus slices derived from Cry1/2 null mice. When the ncAA was added to the culture medium the Cry1-dependent molecular clockwork was initiated. Adapted from Maywood et al. - copyright © The Author(s) 2018 CCBY http://creativecommons.org/licenses/by/4.0/.

**Figure 18**
**Hydroxy acids incorporated by PylRS/tRNA^Pyl pairs *in vivo*:** Chemical structures of hydroxy acids that have been genetically encoded into proteins with PylRS/tRNA^Pyl pairs. Only examples for which the incorporation has been confirmed by MS are listed. The hydroxy acids **BocK–OH**, **AllocK–OH**, **AlkynK–OH**, **ButK–OH**, **PenK–OH**, **NorK–OH**, **CbzK–OH** and **AcK–OH** are all substrates for wt N⁺-MmPylRS. The wt A^Δ*-1r26*PylRS has a similar substrate scope, but does not recognize **NorK–OH** and **AcK–OH**. The engineered N⁺-MbPylRS(L274A, C313V) charges **THFK–OH**, **CbzK–OH**, **AllocK–OH**, and **BocK–OH**. The promiscuous PylRS enzymes N⁺-MbPylRS(L274A, C313V), and A^Δ*-alv*PylRS(N166A, V168A), charge the aromatic residues m-BrF–OH and **Alkyn-F–OH**, or m-CF₃**F–OH**, respectively. PylRS enzymes were specifically evolved to charge aromatic hydroxy acids; remarkably, those enzymes do not show a measurable background incorporation with aromatic canonical amino acids. The two aromatic hydroxy acid selective mutants N⁺-MmPylRS(M300S, A302H, M344L, N346A) and N⁺-MmPylRS(M300S, A302H, M344L, N346A, C348S, V401L, W417T) both charge **F–OH**, p-IF–OH and **NapA–OH**, with the latter being more active with the bulkier substrates, and the former more active with **F–OH**.

**Figure 19**
**Transfer RNA extension protocols for the analytical, and physical separation of acylated tRNAs from free tRNAs. a**, Isolated tRNAs are oxidized with sodium periodate, during which the diol functionality of the 3′-ribose of acylated tRNAs is protected from oxidation to the aldehyde. A DNA probe bearing a 3′Cy3 label is annealed to the tRNA and the tRNA is extended by Klenow (exo-) DNA polymerase fragment. For fluoro- and bio-tREX a DNA probe that has a 5′poly G stretch, and is otherwise devoid of G, is used together with an extension mix that contains dNTPs without dCTPs and is supplemented with either biotinylated- or Cy5 labeled dCTPs. b, In tREX, the difference in mass between the extended (previously acylated) and nonextended (previously nonacylated) tRNAs is visualized by gel electrophoresis to identify acylated tRNAs. c, In fluoro-tREX the previously acylated tRNAs are labeled with Cy5 and can be visualized by gel electrophoresis. d, In bio-tREX, the previously acylated tRNAs are biotinylated, bound to streptavidin beads and can then be eluted and analyzed by gel electrophoresis. Adapted from Dunkelmann et al. – copyright © The Author(s) 2024 CCBY http://creativecommons.org/licenses/by/4.0/.

**Figure 20**
**Methodological basis of the tRNA display platform. a**, Assembly, maturation and acylation of split tRNAs *in vivo*. Pyrrolysyl tRNA can be split, at the anticodon, into two halves and expressed as circularly permutated tRNA from one construct in cells. The split tRNA is recognized and acylated by PylRS *in vivo*. b, Split tRNA-mRNA fusions (stmRNAs) can be produced from one transcript by circular permutation of the split tRNA and attaching the PylRS mRNA to the 5′ end of the 3′ half of the tRNA. Split tRNA-mRNA fusions serve as the mRNA for the production of PylRS enzymes, as well as tRNA substrates of PylRS, thereby connecting genotype (PylRS mRNA) to phenotype (acylated tRNA^Pyl). c, Biotin mRNA extension (bio-mREX) leads to the selective isolation of the DNA sequences of active PylRS enzymes. Split tRNA-mRNA fusions are oxidized with sodium periodate. During the oxidation the 3′ end of acylated stmRNAs is protected, while the 3′ end of free stmRNAs is inactivated. A DNA probe is annealed to the stmRNA, extended and the stmRNAs with intact 3′ ends are biotinylated. Therefore, only formerly acylated stmRNAs get biotinylated. The biotinylated stmRNAs are isolated on streptavidin beads, reverse transcribed, and either submitted to quantitative PCR (qPCR) or cloned into a new backbone for multiple rounds of selection. Adapted from Dunkelmann et al. – copyright © The Author(s) 2024 CCBY http://creativecommons.org/licenses/by/4.0/.

**Figure 21**
**Evolution of β-aminoacyl-, β-hydroxyacyl-, and α,α-disubstituted aminoacyl-tRNA synthetases by tRNA display. a**, Schematic representation of the tRNA display protocol. Split tRNA-mRNA fusions encoded PylRS active site libraries are submitted to two parallel bio-mREX experiments, in presence and absence of the target ncM. Isolated cDNA is submitted to NGS. Two parameters are determined: (1) Enrichment–a proxy for acylation activity–which is calculated as the ratio of the relative abundance of a sequence after the selection, divided by the relative abundance of the same sequence in the input library. (2) Selectivity–a proxy for acylation specificity–which is calculated as the relative abundance of a sequence after selection in the presence of the ncM, divided by the relative abundance of the same sequence after selection in absence of the ncM. Enrichment and selectivity are plotted in spindle plots, and highly enriched and selective sequences further characterized. b, Schematic representation of substrates for evolved N⁺-MmPylRS variants. Substrates (S)-β³-m-BrF, (S)-β³-m-CF₃F, (S)-β³-p-BrF, and (S)-α-Me-p-IF–OH were site-specifically genetically encoded in a protein. c, Crystal structure (PDB 8OVY) of β-amino acid **(S)-β³-m-BrF** at position 150 in green fluorescent protein incorporated with an evolved N⁺-MmPylRS/N-MmtRNA^Pyl pair. Adapted from Dunkelmann et al. – copyright © The Author(s) 2024 CCBY http://creativecommons.org/licenses/by/4.0/.
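The two ratios defined in panel a can be written out as a short calculation. The sketch below is only an illustration of those definitions: the function names, the optional `pseudocount` parameter (added here as an assumption to avoid division by zero for unobserved sequences), and the read counts are all hypothetical and not taken from the paper.

```python
def frequencies(counts, sequences, pseudocount=0.0):
    """Relative abundance of each sequence within one NGS sample."""
    total = sum(counts.get(s, 0) for s in sequences) + pseudocount * len(sequences)
    if total <= 0:
        raise ValueError("sample contains no reads")
    return {s: (counts.get(s, 0) + pseudocount) / total for s in sequences}


def enrichment_and_selectivity(input_counts, plus_ncm_counts, minus_ncm_counts,
                               pseudocount=0.0):
    """Return {sequence: (enrichment, selectivity)}.

    enrichment  = abundance after selection with ncM / abundance in input library
    selectivity = abundance after selection with ncM / abundance after selection without ncM
    With pseudocount=0 every sequence must have reads in the input and
    minus-ncM samples, otherwise the division fails.
    """
    sequences = sorted(set(input_counts) | set(plus_ncm_counts) | set(minus_ncm_counts))
    f_in = frequencies(input_counts, sequences, pseudocount)
    f_plus = frequencies(plus_ncm_counts, sequences, pseudocount)
    f_minus = frequencies(minus_ncm_counts, sequences, pseudocount)
    return {s: (f_plus[s] / f_in[s], f_plus[s] / f_minus[s]) for s in sequences}


# Hypothetical read counts for three library members.
input_library = {"variant_A": 100, "variant_B": 100, "variant_C": 100}
selected_plus_ncm = {"variant_A": 80, "variant_B": 15, "variant_C": 5}
selected_minus_ncm = {"variant_A": 5, "variant_B": 15, "variant_C": 5}

scores = enrichment_and_selectivity(input_library, selected_plus_ncm, selected_minus_ncm)
for name, (enrichment, selectivity) in scores.items():
    print(f"{name}: enrichment={enrichment:.2f}, selectivity={selectivity:.2f}")
```

In this made-up data set, `variant_A` is both enriched and selective, whereas `variant_B` is recovered more readily without the ncM than with it, which would be consistent with activity toward a canonical substrate.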

**Figure 22**
**Naturally mutually orthogonal PylRS/tRNA^Pyl pairs in archaea. a**, The archaeon *Candidatus Methanohalarchaeum thermophilum* (*therm*) harbors two distinct PylRS/tRNA^Pyl pairs in its genome. The two pyl tRNAs carry distinct discriminator bases (DBs). b, The intraorganism mutual orthogonality between the *therm*(1)PylRS/*therm*tRNA^Pyl(1) pair and the *therm*(2)PylRS/*therm*tRNA^Pyl(2) pair is derived from the combination of an unusual DB in the *therm*tRNA^Pyl(1) which is only recognized by *therm*(1)PylRS that has a shortened motif 2, and is orthogonal to *therm*tRNA^Pyl(2). Experiments defining mutual orthogonality were carried out in the model archaeon *Haloferax volcanii*.

**Figure 23**
**Engineering mutually orthogonal PylRS/tRNA^Pyl pairs in *E. coli*. a**, Structure of N-MmPylRSn (pink) bound to MmtRNA^Pyl (gray) (PDB 5UD5). The interaction interface of N-MmPylRSn with the variable loop (blue) is dependent on the presence of the N-terminal domain. The absence of N-PylRSn, and therefore of variable loop recognition, in group ΔN PylRS enzymes provides a direct means to engineering PylRS specificity by variable loop extension, impeding the interaction interface with N⁺-MmPylRS. b, Depiction of A-*alv*tRNA^Pyl libraries with extended anticodon loops. Nucleotides marked in blue were randomized, and changes to the nucleotides marked in yellow emerged as unprogrammed mutations during the selection. c, Schematic representation of the strategy employed to generate mutually orthogonal PylRS/tRNA^Pyl pairs formed of N⁺-MmPylRS/N-MmtRNA^Pyl and A^Δ-*alv*PylRS/A-*alv*tRNA^Pyl by (1) identifying A-*alv*tRNA^Pyl variants with an expanded variable loop that are active with A^Δ-*alv*PylRS, and (2) screening against cross-reactivity with N⁺-MmPylRS.

**Figure 24**
**Engineering triply orthogonal PylRS/tRNA^Pyl pairs in *E. coli*. a**, The screen of natural variance in N-pyl tRNAs led to the discovery that N-*spe*tRNA^Pyl can serve as the class N tRNA^Pyl in triply orthogonal sets. b, Depiction of previously engineered A-*alv*tRNA^Pyl variable loop extension mutants. Three mutants fulfilled the requirements to form the A-pyl tRNAs in triply orthogonal sets. c, Directed evolution strategy for identifying B-*int*tRNA^Pyl variants completing the triply orthogonal set. Acceptor stem and variable loop libraries were run independently and the most orthogonal B-*int*tRNA^Pyl variants combined into hybrid B-pyl *int*tRNAs. Multiple B-pyl *int*tRNAs fulfilled the orthogonality requirement to form triply orthogonal sets when paired with a B^Δ-PylRS, a select class A N^Δ PylRS/tRNA^Pyl pair, and N⁺-MmPylRS/N-*spe*tRNA^Pyl. d, Summary of the experimental strategy to generate triply orthogonal PylRS/tRNA^Pyl pairs including a depiction of all interactions that were controlled in the process (red arrows indicate undesired activity, gray dashed arrows orthogonality). Adapted with permission from Dunkelmann et al. - copyright © 2020 Springer Nature Limited.
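The orthogonality requirement summarized in panel d amounts to a condition on a synthetase-by-tRNA activity matrix: each synthetase should be active with its cognate tRNA and inactive with every other tRNA in the set. The sketch below is one possible way to express that check; the thresholds (`min_cognate`, `max_cross`), the normalized activity values, and the labeling of pairs by class letter are assumptions made for illustration and are not data from the paper.

```python
def cross_reactions(activity, max_cross=0.05):
    """Non-cognate (synthetase, tRNA) combinations with undesired activity."""
    return sorted(
        (rs, trna)
        for rs, row in activity.items()
        for trna, value in row.items()
        if rs != trna and value > max_cross
    )


def weak_cognates(activity, min_cognate=0.5):
    """Synthetases that are not sufficiently active with their own tRNA."""
    return sorted(rs for rs, row in activity.items() if row[rs] < min_cognate)


def is_mutually_orthogonal(activity, min_cognate=0.5, max_cross=0.05):
    return (not cross_reactions(activity, max_cross)
            and not weak_cognates(activity, min_cognate))


# activity[synthetase_class][tRNA_class]: hypothetical normalized activities.
starting = {
    "N": {"N": 1.00, "A": 0.02, "B": 0.01},
    "A": {"N": 0.01, "A": 0.90, "B": 0.30},
    "B": {"N": 0.02, "A": 0.40, "B": 0.80},
}

# After (hypothetical) tRNA engineering the A/B cross-reactivity is removed.
evolved = {rs: dict(row) for rs, row in starting.items()}
evolved["A"]["B"] = 0.02
evolved["B"]["A"] = 0.03

n = len(starting)
pairwise_checks = n * (n - 1)  # non-cognate interactions that must be controlled

print(cross_reactions(starting))
print(is_mutually_orthogonal(starting), is_mutually_orthogonal(evolved))
print(pairwise_checks)
```

The count of non-cognate interactions grows as n(n − 1), so a set of three pairs has 6 interactions to control and a set of five pairs has 20.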

**Figure 25**
**The archetypical tRNA structures for pyl tRNAs of classes N, A, B, C, and S.** Important nucleotides and features are highlighted in the class-specific colors and labeled. Adapted with permission from Beattie et al. - copyright © 2023 Springer Nature Limited.

**Figure 26**
**Engineering quintuply orthogonal PylRS/tRNA^Pyl pairs in *E. coli*.** Summary of the experimental strategy to generate quintuply orthogonal PylRS/tRNA^Pyl pairs including a depiction of all interactions that were controlled in the process (red arrows indicate undesired activity, gray dashed arrows orthogonality). Starting from a rationally chosen set of five PylRS/tRNA^Pyl pairs composed of one pair of each pyl class N, A, B, C and S, a series of screens and directed evolution experiments resulted in the generation of quintuply orthogonal sets. Quintuply orthogonal pyl tRNAs for B^Δ-PylRS and S⁺-PylRS, respectively, were identified by a round of positive selection of the depicted tRNA^Pyl library, which is based on the S-i2tRNA^Pyl scaffold, in the presence of either B^Δ-PylRS or S⁺-PylRS followed by negative screens against all other classes of PylRS enzymes (N, A, C and S or N, A, B and C, respectively).

**Figure 27**
Transcription and translation with synthetic bases. a, Structure of synthetic base pair (d)NaM(dX):(d)TPT3(dY). b, A heterologous nucleotide transporter enables the uptake of the triphosphate of dX, dY, X and Y in *E. coli*. A plasmid harboring a tRNA gene with an anticodon containing a synthetic codon, as well as a gene using a synthetic codon in its sequence is transformed and maintained in *E. coli*. The gene and tRNA harboring synthetic codons are transcribed and used by the ribosome in translation. Adapted with permission from de la Torre et al. - copyright © 2021 Nature Springer Limited.

**Figure 28**
**Genetic encoding of four distinct ncAAs using a 68-codon genetic code.** The genetic incorporation of four distinct ncAAs requires the control of the orthogonality of the engineered translational machinery on several levels. First, active sites need to be engineered for each aaRS which are selective for the target ncAA, and exclude all other ncAAs as well as the canonical amino acids. Second, multiply orthogonal aaRS/tRNA pairs need to be engineered, which are compatible with each other. Third, each tRNA needs to be addressed to a distinct codon in the genetic code. Finally, the ribosome may need to be engineered to read alternative codons, or polymerize novel ncMs. By combining engineered triply orthogonal PylRS/tRNA^Pyl pairs with an orthogonal AfTyrRS/AftRNA^Tyr pair, addressing each active site to a specific ncAA, and addressing each tRNA to a specific quadruplet codon, four distinct ncAAs were successfully encoded in *E. coli* using an evolved quadruplet decoding ribosome (RiboQ1). Adapted with permission from Dunkelmann et al. - copyright © 2021 Nature Springer Limited.

**Figure 29**
**Genetic encoding of two ncAAs using two mutually orthogonal synthetic codons.** Three synthetic codons are orthogonal to each other: GYU:AXC, AYC:GXU, and XCU:AGX. The combination of the engineered MjTyrRS/MjtRNA^Tyr_GYU and N⁺-MmPylRS/N-MmtRNA^Pyl_XCU pairs together with an EcSerRS enzyme and an EctRNA^Ser_AYC variant permits the site-specific incorporation of two ncAAs and serine at synthetic-codon-defined positions in a protein.

**Figure 30**
**Genetic encoding of three distinct ncAAs at sense codons in syn61Δ3. a**, Schematic showing the codon compression scheme used to generate syn61 and the steps taken to reassign all three free codons to new ncAAs. b, Genetic encoding of three distinct ncAAs at three distinct sense codons in syn61Δ3. Two mutually orthogonal PylRS/tRNA^Pyl pairs from classes N and A were used together with the AfTyrRS/AftRNA^Tyr pair. Parts of this figure are adapted with permission from Robertson et al. copyright © 2021, some rights reserved; exclusive licensee American Association for the Advancement of Science.

**Figure 31**
**Synthesis of noncanonical polymers in cells. a**, Synthesis of noncanonical polymers as GFP fusions. Synthetic genes for noncanonical polymers were expressed as N-terminal fusions to GFP; the order of codons in the sequence defined the pattern of monomer building blocks in the resulting polymer (one example of a synthetic gene is shown), and reassignment schemes (r.s. 1–3) defined the identity of the monomers. Two mutually orthogonal PylRS/tRNA^Pyl pairs from classes N and A were used for r.s. 1, and one of the pyl pairs (from either class N or A, respectively) was used in combination with the AfTyrRS/AftRNA^Tyr pair for r.s. 2 and 3. b, Synthetic genes for the encoded synthesis of free noncanonical polymers. An example of a noncanonical hexamer synthesized in cells using recoding scheme 1 is shown. c, Synthetic genes for the encoded synthesis of a noncanonical macrocycle. An example of a cell-based noncanonical macrocycle synthesis using recoding scheme 1 is shown. Figure adapted with permission from Robertson et al. copyright © 2021, some rights reserved; exclusive licensee American Association for the Advancement of Science.

**Figure 32**
**Cell-based synthesis of macrocyclic (depsi)-peptides. a**, Strategy for the encoded cell-based synthesis of artificial macrocycles. The indicated synthetic genes were used with the indicated reassignment schemes (r.s.). Two mutually orthogonal PylRS/tRNA^Pyl pairs from classes N and A as well as the AfTyrRS/AftRNA^Tyr (Y) pair were used in different combinations; r.s. 1, 2, 4, 6 used a class N pyl pair and a Y pair; r.s. 2, 5, 7, 8 used a class N pyl pair and a class A pyl pair. b, The ten core structures of 37 macrocyclic products from the encoded cell-based synthesis that were isolated and characterized by MS. For each core structure, the different recoding schemes according to which the macrocyclic products were synthesized are indicated. Adapted with permission from Spinck et al. - copyright © The Author(s) 2023 CC BY http://creativecommons.org/licenses/by/4.0/.


References

1. Liu C. C.; Schultz P. G. Adding New Chemistries to the Genetic Code. Annu. Rev. Biochem. 2010, 79, 413–444. 10.1146/annurev.biochem.052308.105824.
2. de la Torre D.; Chin J. W. Reprogramming the Genetic Code. Nat. Rev. Genet. 2021, 22, 169–184. 10.1038/s41576-020-00307-7.
3. Chin J. W. Expanding and Reprogramming the Genetic Code of Cells and Animals. Annu. Rev. Biochem. 2014, 83, 379–408. 10.1146/annurev-biochem-060713-035737.
4. Chin J. W. Expanding and Reprogramming the Genetic Code. Nature 2017, 550, 53–60. 10.1038/nature24031.
5. Young D. D.; Schultz P. G. Playing with the Molecules of Life. ACS Chem. Biol. 2018, 13, 854–870. 10.1021/acschembio.7b00974.
