Nature. 2024 Jan;625(7995):603-610.
doi: 10.1038/s41586-023-06897-6. Epub 2024 Jan 10.

Adding α,α-disubstituted and β-linked monomers to the genetic code of an organism


Daniel L Dunkelmann et al. Nature. 2024 Jan.

Abstract

The genetic code of living cells has been reprogrammed to enable the site-specific incorporation of hundreds of non-canonical amino acids into proteins, and the encoded synthesis of non-canonical polymers and macrocyclic peptides and depsipeptides [1-3]. Current methods for engineering orthogonal aminoacyl-tRNA synthetases to acylate new monomers, as required for the expansion and reprogramming of the genetic code, rely on translational readouts and therefore require the monomers to be ribosomal substrates [4-6]. Orthogonal synthetases cannot be evolved to acylate orthogonal tRNAs with non-canonical monomers (ncMs) that are poor ribosomal substrates, and ribosomes cannot be evolved to polymerize ncMs that cannot be acylated onto orthogonal tRNAs. This co-dependence creates an evolutionary deadlock that has essentially restricted the scope of translation in living cells to α-L-amino acids and closely related hydroxy acids. Here we break this deadlock by developing tRNA display, which enables direct, rapid and scalable selection for orthogonal synthetases that selectively acylate their cognate orthogonal tRNAs with ncMs in Escherichia coli, independent of whether the ncMs are ribosomal substrates. Using tRNA display, we directly select orthogonal synthetases that specifically acylate their cognate orthogonal tRNA with eight non-canonical amino acids and eight ncMs, including several β-amino acids, α,α-disubstituted-amino acids and β-hydroxy acids. We build on these advances to demonstrate the genetically encoded, site-specific cellular incorporation of β-amino acids and α,α-disubstituted amino acids into a protein, and thereby expand the chemical scope of the genetic code to new classes of monomers.

Conflict of interest statement

J.W.C. is a founder of Constructive Bio. The Medical Research Council have filed a patent application based on this work. The other authors declare no competing interests.

Figures

Fig. 1. Acylation-dependent tRNA extension enables the sensitive detection and isolation of acylated tRNAs.
a, Schematic of fluoro-tREX and bio–tREX protocols. tRNAs are isolated from cells, and the diol functionality of the 3′ ribose on non-acylated tRNAs is oxidized to the dialdehyde. The acyl group of charged tRNAs protects the diol functionality of the 3′ ribose and prevents oxidation to a dialdehyde. A Cy3-labelled DNA probe complementary to the 3′ end of a target tRNA is annealed, and target tRNAs that were acylated are extended upon addition of Klenow fragment exo− and modified nucleotides. For fluoro-tREX, Cy5-labelled nucleotides are incorporated. Acylated tRNAs lead to a Cy5 and Cy3 signal, whereas non-acylated tRNAs only give a Cy3 signal. For bio–tREX, biotinylated nucleotides are incorporated and the tRNAs that were acylated are purified using streptavidin beads and can be visualized by SYBR Gold staining following gel electrophoresis. b, Fluoro-tREX detected the acylation of tRNACUAPyl in the presence of PylRS and BocK (1). Cells expressing tRNACUAPyl were grown in the presence and absence of PylRS and BocK (1). c, Bio–tREX enables the selective isolation of previously acylated tRNAs. Cells harbouring tRNACUAPyl were grown in the presence and absence of PylRS and BocK (1). Isolation of the tRNA and associated probe was visualized by SYBR Gold staining for RNA and Cy3 fluorescence for the probe. Experiments in b,c were repeated three times with similar results. For full, uncropped gels for all figures see Supplementary Fig. 1.
Fig. 2. Production and acylation of split tRNAs expressed from split and circularly permuted genes.
a, Schematic for producing split tRNAs in trans. The tRNA gene is split at the anticodon loop and the anticodon stem sequence is extended for optimal assembly of the transcribed RNA in vivo; this creates two genes: one for the 5′ half and one for the 3′ half of the split tRNA. These genes are transcribed and the split tRNA is assembled, matured and acylated in cells. b, Schematic for producing split tRNAs in cis from a single gene. The tRNA sequence is circularly permuted by connecting the 3′ half, via an intervening loop sequence, to the 5′ half, splitting the sequence at the anticodon and extending the anticodon stem. Transcription, assembly in cis and maturation lead to a functional split tRNA. c, In vivo transcription, assembly, maturation and acylation of split tRNAPyl produced from genes for the 5′ half and 3′ half. Cells were grown in the presence of PylRS. Only the expression of both tRNAPyl halves led to a BocK (1)-dependent acylation signal, as judged by fluoro-tREX. Note that under the purification conditions used to isolate these stRNAs, we do not observe the Cy3 probe. d, Circularly permuted split tRNAPyl with different loop sequences were assayed by fluoro-tREX. For the argYargZ and leuPleuV loops derived from the intergenic regions of pairs of tRNA genes in E. coli, the fluoro-tREX signal for split tRNA production (Cy3) and acylation (Cy5) was comparable to the corresponding signal for intact tRNAPyl (Supplementary Fig. 6). Experiments in c,d were repeated three times with similar results.
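The cis design in panel b amounts to a rearrangement of the tRNA sequence, which can be sketched as a string operation. This is a minimal illustration only: the sequence, split index, loop and stem-extension strings are hypothetical placeholders, not sequences from the paper, and the function name is an assumption introduced here.

```python
def circularly_permute(trna: str, split_at: int, loop: str,
                       ext_start: str = "", ext_end: str = "") -> str:
    """Build a cis split-tRNA gene as: 3' half + loop + 5' half.

    trna      : intact tRNA sequence written 5'->3' (placeholder here)
    split_at  : index within the anticodon loop where the tRNA is split
                (hypothetical; the caption gives no coordinates)
    loop      : intervening sequence joining the native 3' end to the
                native 5' end
    ext_start : optional anticodon-stem extension at the new 5' terminus
    ext_end   : optional anticodon-stem extension at the new 3' terminus
    """
    if not 0 < split_at < len(trna):
        raise ValueError("split_at must fall inside the sequence")
    five_half = trna[:split_at]    # native 5' half
    three_half = trna[split_at:]   # native 3' half
    # The 3' half now comes first, so the original ends are joined by the loop
    # and the new termini sit at the (extended) anticodon stem.
    return ext_start + three_half + loop + five_half + ext_end


# Toy input, chosen only to make the rearrangement easy to see.
toy_gene = circularly_permute("AAAACCCCGGGG", split_at=6, loop="TT")
```

With the toy input, the second half of the sequence is placed first, followed by the loop and then the first half. Choosing real split points, loops and stem extensions requires the sequence-level details given in the paper's methods.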
Fig. 3. stmRNAs enable selective isolation of active PylRS variants.
a, Schematic of the cis split tRNA–mRNA fusion (stmRNA) gene, and the production and acylation of stmRNA. b, stmRNA acylation visualized by fluorescent mRNA extension (fluoro-mREX). Cells harboured an stmRNA gene, wild-type (WT) or attenuated (at) PylRS. Positions of 16S (1.5 kb) and 23S rRNA (2.9 kb) are indicated. The fusion between the 3′ half of the tRNA and the mRNA is 1.5 kb. The fluoro-mREX signal was visualized on denaturing gels; representative of three independent replicates. c, Schematic of biotin mRNA extension (bio–mREX). Biotinylated stmRNAs are enriched on streptavidin beads. The mRNA of PylRS within is reverse transcribed on the beads and quantified by qPCR. d, Efficient and selective isolation of active PylRS variant cDNA via bio–mREX. Cells harbouring the indicated stmRNA were grown in the presence or absence of BocK (1) and bio–mREX was performed. Following pulldown and reverse transcription, we determined the number of cDNA molecules. Dashed line indicates 2.5% of input (Supplementary Fig. 8). The bars represent the mean of five biological replicates, dots represent individual data points, and error bars show the s.d. e, Tailoring the RBS of PylRS mRNAs within stmRNAs leads to a stronger correlation between the acylation of stmRNAs and readthrough of an amber stop codon (original RBS: R² = 0.4694, P = 0.3148; RBS2: R² = 0.9742, P = 0.013). Bio–mREX was performed from cells harbouring stmRNA genes encoding PylRS(CbzK1–4) with either the original RBS or RBS2 grown in the presence of CbzK (2). The measured cDNA molecules were plotted against the fluorescence intensity of GFP(150CbzK)–His6, resulting from readthrough of the amber codon in GFP(150TAG)His6 by each PylRS variant paired with tRNACUAPyl in cells provided with CbzK (2). Bio–mREX was performed in duplicate and GFP fluorescence was measured in triplicate. Error bars show the s.d. a.u., arbitrary units.
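Panel e summarizes the relationship between cDNA counts and GFP fluorescence with a coefficient of determination. For a simple linear fit this equals the squared Pearson correlation, which the sketch below computes in plain Python. The caption does not state how the authors computed their values, so this is an assumption about the standard definition; the input numbers are made-up placeholders, not data from the paper, and the P value is not calculated.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    return sxy ** 2 / (sxx * syy)


# Placeholder values standing in for four variants (NOT measured data).
toy_cdna = [1.0, 2.0, 3.0, 4.0]
toy_fluorescence = [1.0, 3.0, 2.0, 4.0]
toy_r2 = r_squared(toy_cdna, toy_fluorescence)
```

With only four variants per condition, as in panel e, a single point can shift the value substantially, which is consistent with the caption reporting a P value alongside each R².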
Fig. 4. tRNA display enables the direct selection of orthogonal aminoacyl-tRNA synthetases that aminoacylate their cognate orthogonal tRNAs with ncAAs.
a, Schematic of tRNA display. A library of aaRSs encoded within stmRNA genes is grown in the presence and absence of non-canonical monomers of interest (yellow star). Bio–mREX is performed and the cDNA is sequenced by NGS. The data are used to generate spindle plots. b, The numbered structures of non-canonical α-amino acids used in this study. c, tRNA display with stmRNAvol2-lib1. Spindle plot from one step tRNA display selection with stmRNAvol2-lib1 and CbzK (2). Samples were run in triplicate and data were processed as described in Methods. Red dots indicate 65 clones that were further characterized. d, Plot of ln(enrichment + 2) of PylRS mutants derived from tRNA display (red dots in c) against GFP fluorescence from cells containing the corresponding PylRS mutant–tRNACUAPyl pair, GFP(150TAG)His6 and CbzK (2). The dotted line represents a linear regression for the data points. R² = 0.6611, P < 0.0001. e–j, Left, GFP fluorescence from cells containing GFP(150TAG)His6, the indicated PylRS variant–tRNACUAPyl pair, and ncAA (white bar), or the wild-type PylRS–tRNACUAPyl pair with the same ncAA (grey bar). Fluorescence is plotted as a fraction of the signal generated by the wild-type PylRS–tRNACUAPyl pair with 2 mM BocK (1) and GFP(150TAG)His6. Right, ESI–MS analysis of GFP(150X)–His6, where X is the indicated ncAA. e, Found mass: 27,922.0 Da, expected mass 27,923.3 Da. f, Found mass: 27,944.8 Da, expected mass 27,945.5 Da. g, Found mass: 27,867.6 Da, expected mass 27,866.4 Da. h, Found mass: 27,862.0 Da, expected mass 27,861.4 Da. i, Found mass: 27,986.4 Da, expected mass 27,986.2 Da. j, Found mass: 27,945.6 Da, expected mass 27,944.3 Da. Bars represent the mean of three biological replicates, data points are shown as dots, and error bars represent s.d. Mass spectrometry data are from single replicates.
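The found and expected masses listed for panels e–j can be compared directly. The sketch below tabulates the absolute deviation in Da and the relative deviation in ppm, using only the values quoted in the caption; the variable and function names are introduced here for illustration.

```python
# (found, expected) intact-protein masses in Da, copied from the caption.
masses = {
    "e": (27922.0, 27923.3),
    "f": (27944.8, 27945.5),
    "g": (27867.6, 27866.4),
    "h": (27862.0, 27861.4),
    "i": (27986.4, 27986.2),
    "j": (27945.6, 27944.3),
}


def mass_deviation(found: float, expected: float):
    """Return (difference in Da, difference in parts per million)."""
    delta = found - expected
    return delta, delta / expected * 1e6


deviations = {panel: mass_deviation(*pair) for panel, pair in masses.items()}

for panel, (delta_da, delta_ppm) in deviations.items():
    print(f"panel {panel}: {delta_da:+.1f} Da ({delta_ppm:+.0f} ppm)")
```

Every deviation falls within about 1.3 Da of the expected mass, on proteins of roughly 28 kDa.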
Fig. 5. tRNA display selection of orthogonal synthetases that charge ncMs.
a, ncMs for which selective PylRS mutants were discovered. b, Fluoro-tREX. A representative gel is shown for each PylRS variant. Experiments were performed in independent triplicates with similar results. c, Selected PylRS variants acylate tRNACUAPyl with 13. LC–MS traces (scanning ion mode on 6-aminoquinolyl-N-hydroxysuccinimidyl carbamate (AQC) adduct of 13) of AQC-derivative eluted from tRNA pulldowns. Cells harbouring the corresponding PylRS variant and tRNACUAPyl (or only tRNACUAPyl (−)) were grown with 13. Pulldowns used a biotinylated probe against tRNACUAPyl. Representative traces are shown. d,f,h,j,l,n,p, As in b, but with indicated PylRS variants and 14 (d), 15 (f), 16 (h), 17 (j), 18 (l), 19 (n) or 20 (p). e,g,i,k,m,o,q, As in c, but with indicated PylRS variants and 14 (e), 15 (g), 16 (i), 17 (k), 18 (m), 19 (o) or 20 (q). e,g,i, Performed in duplicate, all replicates yielded similar results. c,k,m,o,q, Performed in triplicate, all replicates yielded similar results. r, Fluorescence from cells containing GFP(150TAG)His6, indicated PylRS variant and tRNACUAPyl, and grown in the presence or absence of ncM (13–20, 4 mM). Fluorescence is shown as a fraction of the fluorescence generated by the wild-type PylRS–tRNACUAPyl pair with 4 mM BocK (1) and GFP(150TAG)His6. Bar graphs represent mean of three independent measurements, individual data points are shown as dots and error bars indicate s.d. s, ESI–MS of GFP(150(S)β3mBrF)–His6 purified from cells harbouring PylRS(13_1evol1), tRNACUAPyl and GFP(150TAG)His6 grown with 13. Found mass: 27,939.0 Da, predicted mass: 27,938.2 Da. Spectra were acquired once. t, ESI–MS of GFP(150(S)α-Me-pIF)–His6 purified from cells harbouring PylRS(19_1), tRNACUAPyl and GFP(150TAG)His6 grown with 19. Found mass: 28,000.5 Da, predicted mass: 28,000.2 Da. Spectra were acquired once. u, Close up on residue 150 of GFP(150(S)β3mBrF)–His6, from a crystal structure determined at 1.5 Å. The 2Fo − Fc map is shown at contour level of σ = 2 (Protein Data Bank (PDB) ID 8OVY), electron density (blue).
Extended Data Fig. 1. Encoded cellular incorporation of non-canonical monomers into proteins and into non-canonical polymers requires both tRNA acylation and ribosomal polymerization.
The encoded, site-specific incorporation of a non-canonical monomer (ncM, yellow star) via cellular translation requires both the acylation of an orthogonal tRNA with the ncM by an orthogonal synthetase, and ribosomal polymerization of the ncM into a polymer chain; arrow indicates peptide bond formation between A-site monomer and P-site nascent chain. Current methods for engineering aminoacyl-tRNA synthetases that acylate new monomers rely on translational readouts and therefore require the monomers to be ribosomal substrates. For ncMs that are poor ribosomal substrates, this co-dependence creates an evolutionary deadlock in cells: an orthogonal synthetase cannot be evolved to acylate an orthogonal tRNA with ncMs that are poor ribosomal substrates, and ribosomes cannot be evolved to polymerize ncMs that cannot be acylated onto orthogonal tRNAs. To break this deadlock, we develop direct selections for orthogonal synthetases to aminoacylate their cognate orthogonal tRNAs with ncMs, independent of whether the ncMs are ribosomal substrates.
Extended Data Fig. 2. Chemical structures of all compounds used in this study.
(S)-3-amino-6-(((benzyloxy)carbonyl)amino)hexanoic acid ((S)-β3CbzK) (S1), (S)-6-acetamido-3-aminohexanoic acid ((S)-β3AcK) (S2), 6-(((benzyloxy)carbonyl)amino)hexanoic acid (CbzAhx) (S3), 3-amino-2-((1-ethyl-1H-imidazol-5-yl)methyl)propanoic acid (β2NeH) (S4), 3-amino-4-(4-bromophenyl)butanoic acid (β3pBrhF) (S5), 2-benzyl-3-hydroxypropanoic acid (β2OH-F) (S6), 3-amino-3-phenylpropanoic acid (β3F) (S7), (S)-6-((tert-butoxycarbonyl)amino)-2-hydroxyhexanoic acid (OH-BocK) (S8). N6-(tert-butoxycarbonyl)-L-lysine (BocK) (1), N6-((benzyloxy)carbonyl)-L-lysine (CbzK) (2), N6-((prop-2-yn-1-yloxy)carbonyl)-L-lysine (AlkyneK) (3), N6-benzoyl-L-lysine (BenzK) (4), 3-([2,2’-bipyridin]-5-yl)-2-aminopropanoic acid (BiPyA) (5), Nτ-methyl-L-histidine (NτmH) (6), (S)-2-amino-3-(thiophen-3-yl)propanoic acid (3-ThiA) (7), (S)-2-amino-3-(pyridin-3-yl)propanoic acid (PyA) (8), (S)-2-amino-3-(4-iodophenyl)propanoic acid (pIF) (9), (S)-2-amino-3-(4-bromothiophen-2-yl)propanoic acid (BrThiA) (10), (2S)-2-amino-3-(((2-((1-(6-nitrobenzo[d][1,3]dioxol-5-yl)ethyl)thio)ethoxy)carbonyl)amino)propanoic acid (pcDAP) (11), N6-(tert-butoxycarbonyl)-aminohexanoic acid (BocAhx) (12), (S)-3-amino-3-(3-bromophenyl)propanoic acid ((S)β3mBrF) (13), (S)-3-amino-3-(benzo[d][1,3]dioxol-5-yl)propanoic acid ((S)β3MDF) (14), (S)-3-amino-3-(4-bromophenyl)propanoic acid ((S)β3pBrF) (15), (S)-3-amino-3-(3,4-difluorophenyl)propanoic acid ((S)β3pmFF) (16), (S)-3-amino-3-(2-bromophenyl)propanoic acid ((S)β3oBrF) (17), (S)-3-amino-3-(3-(trifluoromethyl)phenyl)propanoic acid ((S)β3mCF3F) (18), (S)-2-amino-3-(4-iodophenyl)-2-methylpropanoic acid ((S)α-Me-pIF) (19), (S)-3-(3-chlorophenyl)-3-hydroxypropanoic acid (OH-(S)β3mClF) (20).
Extended Data Fig. 3. Relationship between cDNA retrieved from bio-mREX and fluorescence obtained from in vivo genetic code expansion.
Relationship between the acylation signal measured, by bio-mREX, for stmRNAs and the GFP fluorescence signal measured for intact, translation-competent tRNAs. For stmRNAs, active aminoacyl-tRNA synthetases (aaRS) lead to the acylation of their encoding stmRNAs, which by bio-mREX get extended, separated and ultimately reverse transcribed. This results in the cDNA of the active synthetase, which can be quantified by qPCR. In the case of an inactive aaRS, the stmRNA is not acylated and no cDNA is produced in bio-mREX experiments. Therefore, the activity of a synthetase in bio-mREX correlates with the number of cDNA molecules measured by qPCR. In canonical translation, an active aaRS enzyme leads to an acylated, intact, cognate tRNACUA, which is used in protein translation. Inactive aaRS enzymes lead to non-acylated tRNAs, which are not used in protein translation. The production of GFP protein from GFP(150TAG)His6, as measured by GFP fluorescence, reports on the acylation of tRNACUA, as well as the other steps in the production of protein.
Extended Data Fig. 4. Library design for this study.
PylRS libraries used in this work. a, Overview of the seven libraries designed and created. These libraries target a total of 11 amino acid residues in the PylRS active site and employed several types of degenerate codons. NNK codons are depicted as dark red, DBK codons (+lysine codon) as blue, NDT codons as dark green, NRT codons as yellow. For certain sites, custom residue mixes encompassing the most commonly observed mutations were used (1-7 mixes, depicted as grey spheres). All libraries were created with at least 10⁹ independent transformants. N = A, T, G, C; K = G, T; D = G, A, T; B = G, T, C; R = G, A. The custom mixes are described in the methods. b, The eleven amino acid residues targeted for mutagenesis in the PylRS active site are shown in red. Image was rendered using Pymol, based on the PDB structure 2ZIN.
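The degenerate codons named in panel a can be expanded to see which amino acids each one encodes. The sketch below uses the nucleotide definitions given in the caption and the standard genetic code; the function names are introduced here for illustration, and the custom residue mixes and the extra lysine codon added to DBK are not modelled.

```python
from itertools import product

# Ambiguity codes as defined in the caption, plus the four plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "ATGC", "K": "GT", "D": "GAT", "B": "GTC", "R": "GA",
}

# Standard genetic code in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand(degenerate: str) -> list:
    """List every concrete codon covered by a degenerate codon."""
    return ["".join(bases) for bases in product(*(IUPAC[s] for s in degenerate))]


def encoded(degenerate: str) -> list:
    """Sorted one-letter amino acids (and '*' for stop) that are encoded."""
    return sorted({CODON_TABLE[codon] for codon in expand(degenerate)})


for scheme in ("NNK", "DBK", "NDT", "NRT"):
    residues = encoded(scheme)
    print(scheme, len(expand(scheme)), "codons:", "".join(residues))
```

NNK covers all 20 amino acids with a single stop codon (TAG), whereas NDT encodes 12 amino acids with one codon each and no stop. That difference is one reason to mix schemes when balancing library size against residue coverage.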
Extended Data Fig. 5. Assay for determination of acylating monomer identity.
Schematic of tRNA pulldown followed by LC-MS analysis to determine the identity of the monomer on the target tRNA. tRNAs are extracted from cells expressing the tRNA of interest and the cognate orthogonal aaRS, grown in the presence of the ncM. A biotinylated probe is annealed, and the targeted tRNA is pulled down. After washing, the ncM is eluted by alkaline deacylation, derivatised with 6-aminoquinolyl-N-hydroxysuccinimidyl carbamate (AQC), and detected using LC-MS.
Extended Data Fig. 6. Detail of the structure of GFP containing a β-amino acid.
Details of the structure of GFP(150(S)β3mBrF)–His6 (PDB code 8OVY). a, Detail of protein chain at position 150. The acquired structure for GFP(150(S)β3mBrF)–His6 (in yellow) is superimposed on the structure of WT GFP used for refinement (PDB: 2B3P), showing the kink in the backbone induced by the β3-amino acid. b, Detail of hydrogen bond network in the beta-barrel at position 150. The GFP(150(S)β3mBrF)–His6 structure (yellow) is superimposed on the structure of WT GFP (PDB: 2B3P). The kink induced by incorporation of (S)β3mBrF (13) affects the local hydrogen bond at that position; however, the remaining contacts of the corresponding beta-strand remain intact.

References

    1. Dumas A, Lercher L, Spicer CD, Davis BG. Designing logical codon reassignment—expanding the chemistry in biology. Chem. Sci. 2015;6:50–69. doi: 10.1039/C4SC01534G.
    2. Robertson WE, et al. Sense codon reassignment enables viral resistance and encoded polymer synthesis. Science. 2021;372:1057–1062. doi: 10.1126/science.abg3029.
    3. Spinck M, et al. Genetically programmed cell-based synthesis of non-natural peptide and depsipeptide macrocycles. Nat. Chem. 2023;15:61–69. doi: 10.1038/s41557-022-01082-0.
    4. Santoro SW, Wang L, Herberich B, King DS, Schultz PG. An efficient system for the evolution of aminoacyl–tRNA synthetase specificity. Nat. Biotechnol. 2002;20:1044–1048. doi: 10.1038/nbt742.
    5. Chin JW, et al. An expanded eukaryotic genetic code. Science. 2003;301:964–967. doi: 10.1126/science.1084772.