Programmable gene insertion in human cells with a laboratory-evolved CRISPR-associated transposase

Isaac P Witte^#¹ ² ³, George D Lampe^#⁴ ⁵, Simon Eitzinger^#¹ ² ³, Shannon M Miller¹ ² ³, Kiara N Berríos¹ ² ³, Amber N McElroy⁶, Rebeca T King⁴, Olivia G Stringham¹ ² ³, Diego R Gelsinger⁴, Phuc Leo H Vo⁴, Albert T Chen¹ ² ³, Jakub Tolar⁶, Mark J Osborn⁶, Samuel H Sternberg⁴ ⁵, David R Liu¹ ² ³

Affiliations

¹ Merkin Institute of Transformative Technologies in Healthcare, Broad Institute of Harvard and MIT, Cambridge, MA, USA.
² Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA.
³ Howard Hughes Medical Institute, Harvard University, Cambridge, MA, USA.
⁴ Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY, USA.
⁵ Howard Hughes Medical Institute, Columbia University, New York, NY, USA.
⁶ Department of Pediatrics, University of Minnesota Medical School, Minneapolis, MN, USA.

^# Contributed equally.

PMID: 40373119
PMCID: PMC12326709
DOI: 10.1126/science.adt5199

Science. 2025 May 15;388(6748):eadt5199. doi: 10.1126/science.adt5199. Epub 2025 May 15.

Abstract

Programmable gene integration in human cells has the potential to enable mutation-agnostic treatments for loss-of-function genetic diseases and facilitate many applications in the life sciences. CRISPR-associated transposases (CASTs) catalyze RNA-guided DNA integration but thus far demonstrate minimal activity in human cells. Using phage-assisted continuous evolution (PACE), we generated CAST variants with >200-fold average improved integration activity. The evolved CAST system (evoCAST) achieves ~10 to 30% integration efficiencies of kilobase-size DNA cargoes in human cells across 14 tested genomic target sites, including safe harbor loci, sites used for immunotherapy, and genes implicated in loss-of-function diseases, with undetected indels and low levels of off-target integration. Collectively, our findings establish a platform for the laboratory evolution of CASTs and advance a versatile system for programmable gene integration in living systems.

Conflict of interest statement

Competing interests:

The authors have filed patent applications related to this work. D.R.L. is a co-founder, consultant, and/or equity holder of Beam Therapeutics, Prime Medicine, Pairwise Plants, Chroma Medicine, Resonance Medicine, Exo Therapeutics, and Nvelop Therapeutics. S.H.S. is a co-founder and scientific advisor to Dahlia Biosciences, a scientific advisor to CrisprBits and Prime Medicine, and an equity holder in Dahlia Biosciences and CrisprBits.

Figures

**Fig. 1. Phage-assisted continuous evolution (PACE) of CRISPR-associated transposases (CASTs).**
**(A)** Overview of RNA-guided DNA integration by Type I-F CAST. DNA targeting is mediated by the CRISPR effector complex Cascade, comprising Cas6, Cas7, Cas8, and a CRISPR RNA (crRNA) complexed with the transposition protein TniQ (together referred to as QCascade). Target DNA-bound QCascade recruits the AAA+ ATPase TnsC, which subsequently recruits the heteromeric TnsA–TnsB transposase to catalyze excision of the transposon DNA and integration of the transposon at the target locus. **(B)** Overview of PACE for CAST evolution. Selection phage (SP) encodes evolving CAST proteins. Host *E. coli* encode a selection circuit that links CAST integration to *gIII* expression, which produces the essential phage protein pIII. Production of pIII enables SPs encoding active CAST proteins to replicate. PACE occurs in a fixed volume vessel (the ‘lagoon’) under constant dilution with fresh host *E. coli*, such that only SPs propagating faster than the rate of dilution can persist and evolve. **(C)** Anatomy of the initial CAST PACE selection circuit. SP encodes evolving transposase proteins TnsA–TnsB (an artificial fusion generated in (44)) and TnsC, while non-evolving CAST components are encoded on a complementary plasmid (CP1). Integration of a transposon provided on a second complementary plasmid (CP2) into a crRNA-specified target site on the accessory plasmid (AP) installs a promoter upstream of *gIII*, resulting in *gIII* expression and SP propagation. Replicating SPs accumulate mutations induced by a mutagenesis plasmid (MP) (48) such that progeny SPs encode new CAST protein variants for selection in subsequent generations.
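
The washout condition described in (B), where only SPs that propagate faster than the lagoon is diluted can persist, can be illustrated with a toy calculation. This is a minimal sketch rather than the authors' model: it assumes simple exponential dynamics, and the function name `lagoon_titer` and all rates and titers are hypothetical values chosen for illustration.

```python
import math

def lagoon_titer(initial_titer, replication_rate, dilution_rate, hours):
    """Phage titer after `hours` under toy exponential dynamics.

    Assumes dN/dt = (replication_rate - dilution_rate) * N, with both
    rates in units of 1/hour. This is an illustrative simplification,
    not a model taken from the paper.
    """
    net_rate = replication_rate - dilution_rate
    return initial_titer * math.exp(net_rate * hours)

# Hypothetical numbers: a lagoon diluted at 1.0 volume per hour.
active = lagoon_titer(1e6, replication_rate=1.5, dilution_rate=1.0, hours=10)
inactive = lagoon_titer(1e6, replication_rate=0.2, dilution_rate=1.0, hours=10)

print(active > 1e6)    # the faster-replicating SP outpaces dilution and persists
print(inactive < 1e6)  # the slower-replicating SP is washed out of the lagoon
```

Under these assumptions the sign of `replication_rate - dilution_rate` alone decides whether a phage population grows or is washed out, which is why raising the dilution rate raises selection stringency.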

**Fig. 2. Continuous evolution of TnsABC.**
**(A)** Summary of TnsABC evolution campaign. Whether evolution segments were conducted using PANCE or PACE is specified, with PANCE passages or PACE hours indicated. Circuit architectures are shown in fig. S2, A to C. **(B)** Overnight phage propagation assays with wild-type (WT) TnsABC SP, pooled evolved SPs from each evolution segment, and *gIII*-expressing phage (positive control for propagation). X-axes indicate host *E. coli* variants encoding circuit 1.0. Host A was used for PANCE N1. Hosts B and C are of increased selection stringency, manipulated by reducing the promoter strength in the transposon on CP2 (Hosts B and C) and reducing the ribosome binding site upstream of *gIII* on the AP (Host C). Host NT A is host A with a non-targeting crRNA. The left graph shows phage propagation levels (output phage titer divided by input titer). The right graph shows transposon integration efficiencies at the AP target site in *E. coli* following overnight propagation, as measured by qPCR. **(C)** Genotypes of a subset of evolved TnsABC variants. Variants N1–1, P1–3, and N2–1 showed the highest integration activity among the variants emerging from their respective PANCE or PACE experiments at two tested genomic sites in HEK293T cells (fig. S6). Variants P2–2, P2–7, and P2–11 are representative of the genotypes that emerged from P2. **(D)** 1-kb transposon integration efficiencies at two genomic loci in HEK293T cells for wild-type (WT) and evolved TnsABC variants specified in (C). **(E and F)** Assessing the contributions of evolved TnsAB and TnsC subunits to overnight phage propagation levels on P2 host *E. coli* (E) and 1-kb transposon integration efficiency in HEK293T cells (F) for representative P2 CAST variants. Data in (B) and (D–F) are shown as mean±s.d. for n=3 independent biological replicates.
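
The propagation level defined in (B), output phage titer divided by input titer, and the mean±s.d. summary over n=3 replicates can be sketched as follows. The titers are hypothetical and are not data from the paper. The use of the sample standard deviation is an assumption, since the caption does not say which estimator was used, and the helper names are illustrative.

```python
from statistics import mean, stdev

def propagation_level(output_titer, input_titer):
    """Fold propagation: output phage titer divided by input titer."""
    if input_titer <= 0:
        raise ValueError("input titer must be positive")
    return output_titer / input_titer

def summarize(values):
    """Return (mean, sample standard deviation) across replicates."""
    return mean(values), stdev(values)

# Hypothetical titers (pfu/mL) for n=3 replicates; not data from the paper.
input_titer = 1e5
output_titers = [2e7, 4e7, 3e7]

levels = [propagation_level(t, input_titer) for t in output_titers]
avg, sd = summarize(levels)
print(f"{avg:.0f} ± {sd:.0f}")
```

A propagation level below 1 means fewer phage came out than went in, which is what would be expected for an inactive SP on a stringent host.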

**Fig. 3. TnsAB- and TnsB-focused evolution generate transposase variants that support robust integration in human cells.**
**(A)** PACE selection circuit 2.0 for TnsAB evolution, which encodes wild-type TnsC on CP1 to limit evolution to TnsAB. The AP size is increased to 10 kb to prevent *gIII* acquisition via AP co-integration or recombination into the SP genome. The transposon left end on CP2 contains a mutated binding site (denoted by an asterisk) for integration host factor (76) to mitigate evolution of potential integration host factor-dependent fitness. **(B)** PACE selection circuit 2.1 for TnsAB evolution, designed to more efficiently select for TnsAB variants that are highly active in human cells. Circuit 2.1 splits the artificial TnsA–TnsB fusion (44) into its native monomeric forms to prevent evolution of the bpNLS linker sequence. CP1 encodes an evolved TnsC variant (N1–5) identified in fig. S16A as enabling the highest integration efficiencies in human cells among all tested TnsC variants. Circuit 2.1 also contains an AP with increased plasmid size (15 kb) to further guard against *gIII* acquisition, an increased transposon size in CP2 (5 kb) to introduce a new selection stringency, and a crRNA cassette on CP2 instead of CP1 to prevent self-targeting at the crRNA spacer (42). **(C)** PACE selection circuit 3.0 for TnsB evolution, which encodes wild-type TnsA on CP1 to limit evolution to TnsB. **(D)** Summary of TnsAB and TnsB evolution campaigns. Whether evolution segments were conducted in PANCE or PACE is specified, with PANCE passages or PACE hours indicated. **(E)** Genotypes of top-performing evolved TnsB variants. **(F)** 1-kb transposon integration in HEK293T cells at two genomic sites by top-performing TnsB variants. **(G)** Fold-change in integration efficiencies upon co-transfection with a plasmid expressing *E. coli* ClpX. The dotted line represents no change upon ClpX expression.
**(H)** Mutated residues in the P4–15 TnsB variant mapped onto an AlphaFold3-predicted structure of a *Pse*TnsB tetramer complexed with a DNA substrate that mimics the product of TnsB transesterification. Each transposon end (green) contains one full TnsB binding site that is joined to the 5′ end of target DNA (blue). Low-confidence unstructured C-termini of TnsB monomers (containing residues with pLDDT < 70) are not shown. The left image shows all mutated P4–15 residues in red, with the catalytic metal-coordinating DDE residues in TnsB₁ and TnsB₃ shown in orange. The upper right image shows the mutated Y349 residue predicted to contact transposon DNA. The bottom right image shows multiple predicted TnsB•TnsB interfaces that contain mutated residues. **(I)** The mutated Q594 residue (red) in the P4–15 TnsB variant mapped onto an AlphaFold3-predicted structure of the *Pse*TnsB C-terminal ‘hook’ domain in complex with a *Pse*TnsC heptamer. Data in (F) and (G) are shown as mean±s.d. for n=3 independent biological replicates.

**Fig. 4. Development and characterization of evoCAST.**
**(A)** Schematic and genotypes of P4–15 TnsB and evoCAST components. EvoCAST also contains optimized NLS architectures for Cas6, Cas8, and TniQ. **(B)** 1-kb transposon integration efficiencies by evoCAST compared to P4–15 TnsB and wild-type (WT) *Pse*CAST at four genomic sites in HEK293T cells. **(C)** Integration of varying DNA payload sizes (measured as the distance between the 3′ end of the transposon right end and 5′ end of the transposon left end) by WT *Pse*CAST and evoCAST in HEK293T cells. Donor DNA transfected was normalized by mass. **(D)** HTS analysis of the distance between the 3′ end of the target site and 5′ end of the transposon integration site for wild-type (WT) *Pse*CAST and evoCAST across four genomic sites in HEK293T cells. **(E)** Comparison of indel formation across untreated cells, wild-type (WT) *Pse*CAST, and evoCAST at four genomic sites in HEK293T cells. Indels were quantified across a 40-bp window centered at the predicted insertion site for all unintegrated reads (see materials and methods). An unpaired, two-sided t-test was performed to determine statistical significance, with “ns” indicating a p-value > 0.05. **(F)** Relative frequencies of integration in the T-RL or T-LR orientation for evoCAST across four genomic sites in HEK293T cells, determined by ddPCR using probes specific to either T-RL or T-LR integration events. **(G)** Genome-wide integration events for evoCAST (top) and a negative control (bottom) in which only pDonor was transfected, detected via a modified UDiTaS workflow (80). Integration events are measured by the number of unique molecular identifiers (UMIs) identified at a single integration site (see materials and methods). The on-target genomic site (*AAVS1*) is indicated with a red triangle. The dotted line corresponds to a single detected integration event. One of two replicates is shown here; both replicates are shown in table S2. Data in (B–F) are shown as mean±s.d. for n=3 independent biological replicates.
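
The windowed indel count in (E) can be sketched as below. The authors' actual pipeline is described in their materials and methods and is not reproduced here. The coordinate convention (0-based, half-open intervals), the overlap rule, and the function names are all assumptions made for illustration, and the reads are hypothetical.

```python
def indel_in_window(indel_start, indel_end, insertion_site, window=40):
    """True if an indel overlaps the window centered on the insertion site.

    Coordinates are 0-based and half-open, [indel_start, indel_end).
    A pure insertion (start == end) counts if its position lies inside
    the window. These conventions are assumptions for illustration.
    """
    half = window // 2
    win_start, win_end = insertion_site - half, insertion_site + half
    if indel_start == indel_end:  # insertion between two bases
        return win_start <= indel_start <= win_end
    return indel_start < win_end and indel_end > win_start

def indel_frequency(reads, insertion_site, window=40):
    """Fraction of unintegrated reads with at least one indel in the window.

    `reads` is a list of reads, each a list of (start, end) indel intervals.
    """
    if not reads:
        return 0.0
    hits = sum(
        any(indel_in_window(s, e, insertion_site, window) for s, e in indels)
        for indels in reads
    )
    return hits / len(reads)

# Hypothetical reads: each entry lists the indels called in one read.
reads = [[], [(95, 98)], [(300, 305)], []]
print(indel_frequency(reads, insertion_site=100))
```

Restricting the count to a fixed window around the predicted insertion site keeps sequencing errors elsewhere in the amplicon from being counted as editing byproducts.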

**Fig. 5. evoCAST mediates efficient DNA integration at therapeutically relevant endogenous genomic loci in multiple human cell types.**
**(A)** 1-kb transposon integration by wild-type (WT) *Pse*CAST and evoCAST at 14 genomic loci in HEK293T cells. Each locus was targeted using a top-performing crRNA identified in fig. S22, except *HEK3*, which did not undergo crRNA spacer optimization. **(B)** Integration at *AAVS1* in HEK293T cells using 1-kb transposons encoding end sequences engineered to be compatible with in-frame insertion into protein-coding genes. Engineered ends maintain open reading frames (ORFs), compared to the wild-type transposon end which contains stop codons in all three possible translation frames (fig. S23). X-axis denotes the wild-type (WT) transposon end and the transposon end variants. Stop codons in TnsB binding sites were mutated based on previous studies of Tn6677 transposon ends (76) and transposon end sequence conservation for Type I-F CASTs (43). Stop codons outside of TnsB binding sites were mutated to serine, which required a single point mutation and thus was thought to be a less perturbative sequence change. The sequences of the transposon ORF end variants are in table S6. **(C–E)** Schematics depicting evoCAST applications for integrating a F9 cDNA at *ALB* intron 1 (C), a CD19-targeted chimeric antigen receptor (CAR) at the 5′ UTR of *TRAC* (D), and a cDNA encoding a healthy gene copy (Δexon 1) into intron 1 of a gene associated with pathogenic loss-of-function (E). **(F–H)** Integration by wild-type (WT) *Pse*CAST and evoCAST of F9 cDNA into *ALB* intron 1 in HuH7 cells (F), CD19-CAR into the 5′ UTR of *TRAC* in HEK293T cells (G), and wild-type cDNAs (Δexon 1) into intron 1 of their corresponding endogenous locus in HEK293T cells (H). **(I)** 1-kb transposon integration by wild-type (WT) *Pse*CAST and evoCAST at two genomic loci in HeLa and K562 cells. Data in (A), (B), and (F–I) are shown as mean±s.d. for n=3 independent biological replicates.

References

1. Jinek M, Chylinski K, Fonfara I, Hauer M, Doudna JA, Charpentier E, A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816–821 (2012).
2. Cong L, Ran FA, Cox D, Lin S, Barretto R, Habib N, Hsu PD, Wu X, Jiang W, Marraffini LA, Zhang F, Multiplex genome engineering using CRISPR/Cas systems. Science 339, 819–823 (2013).
3. Mali P, Yang L, Esvelt KM, Aach J, Guell M, DiCarlo JE, Norville JE, Church GM, RNA-guided human genome engineering via Cas9. Science 339, 823–826 (2013).
4. Anzalone AV, Koblan LW, Liu DR, Genome editing with CRISPR-Cas nucleases, base editors, transposases and prime editors. Nat Biotechnol 38, 824–844 (2020).
5. Komor AC, Kim YB, Packer MS, Zuris JA, Liu DR, Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature 533, 420–424 (2016).
