Cell Genom. 2025 Apr 9;5(4):100814.

doi: 10.1016/j.xgen.2025.100814. Epub 2025 Mar 21.

High-throughput screening of human genetic variants by pooled prime editing

Michael Herger¹, Christina M Kajba¹, Megan Buckley¹, Ana Cunha², Molly Strom², Gregory M Findlay³

Affiliations

¹ The Genome Function Laboratory, The Francis Crick Institute, London NW1 1AT, UK.
² Viral Vector Core, Human Biology Facility, The Francis Crick Institute, London NW1 1AT, UK.
³ The Genome Function Laboratory, The Francis Crick Institute, London NW1 1AT, UK. Electronic address: greg.findlay@crick.ac.uk.

PMID: 40120586
PMCID: PMC12008803
DOI: 10.1016/j.xgen.2025.100814

Michael Herger et al. Cell Genom. 2025.

Abstract

Multiplexed assays of variant effect (MAVEs) enable scalable functional assessment of human genetic variants. However, established MAVEs are limited by exogenous expression of variants or constraints of genome editing. Here, we introduce a pooled prime editing (PE) platform to scalably assay variants in their endogenous context. We first improve efficiency of PE in HAP1 cells, defining optimal prime editing guide RNA (pegRNA) designs and establishing enrichment of edited cells via co-selection. We next demonstrate negative selection screening by testing over 7,500 pegRNAs targeting SMARCB1 and observing depletion of efficiently installed loss-of-function (LoF) variants. We then screen for LoF variants in MLH1 via 6-thioguanine selection, testing 65.3% of all possible SNVs in a 200-bp region including exon 10 and 362 non-coding variants from ClinVar spanning a 60-kb region. The platform's overall accuracy for discriminating pathogenic variants indicates that it will be highly valuable for identifying new variants underlying diverse human phenotypes across large genomic regions.

Keywords: MAVE; MLH1; PE; SMARCB1; functional genomics; genome-editing technology; multiplexed assay of variant effect; precision medicine; prime editing; saturation mutagenesis.

Conflict of interest statement

Declaration of interests: The authors declare no competing interests.

Figures

**Figure 1**
Development of a pooled PE screening platform in HAP1.
(A) Schematic of the screening workflow, comprising lentiviral delivery of pegRNAs, puromycin selection, optional enrichment for edited cells via co-selection, functional selection (e.g., essentiality and drug treatment), deep sequencing of lentiviral cassettes, and quantification of pegRNA activities from surrogate targets (STs) and pegRNA frequency changes reflecting selective effects. Function scores for variants are determined from pegRNA frequency changes, using ST editing percentages to filter out inefficient pegRNAs. (B) Schematic of the construct stably integrated into the HAP1 genome for 2A-coupled expression of PEmax, MLH1dn, and the blasticidin S resistance cassette, expressed from a minimal EF-1α promoter. (C) Schematic of the vector for lentiviral delivery of pegRNA libraries. Each vector encodes a pegRNA to install a resistance mutation for co-selection and a pegRNA from the library to engineer a variant to be assayed. Each vector also includes a surrogate target (ST) matching the genomic sequence where the variant is to be introduced and a pegRNA-specific barcode (BC). (D) Enrichment of cells with the PE-mediated T804N edit in exon 17 of *ATP1A1* in response to ouabain (obn) treatment, as determined by amplicon sequencing. Percentages of reads with correct PE (T804N) or partial PE (silent mutation only) are shown. (E) Boxplot of ST editing percentages for 973 pegRNAs using different scaffolds. (F) Boxplot of log₂ fold changes in ST editing percentages per variant using pegRNAs with the F+E scaffold compared to matched pegRNAs with the original scaffold (data from E). (G) Schematic of SNV and MNV types showing target edit (orange) and additional silent mutation (blue). Boxplot of correct ST editing percentages for the top pegRNA per variant grouped by edit type and PAM disruption (PAMd). (H) Boxplot of log₂ fold changes in ST editing percentages comparing MNV- and SNV-pegRNA pairs (data from G). Boxplots: bold line, median; boxes, interquartile range (IQR); whiskers extend to points within 1.5× IQR; outliers not shown in (E). See also Figures S1 and S2; Tables S1 and S2.
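The boxplot convention stated in the caption (median, IQR box, whiskers reaching the most extreme points within 1.5× IQR of the box) can be illustrated with a minimal sketch. The function name `boxplot_summary` and the quartile method are assumptions made for illustration; plotting libraries differ in how they interpolate quartiles, and the record does not say which was used.

```python
from statistics import quantiles


def boxplot_summary(values):
    """Return the box and whisker positions for one group of values.

    Assumption: quartiles use the 'inclusive' interpolation method;
    the source does not specify a quartile convention.
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values")
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme data points that lie inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }
```

Points outside the fences are returned separately, which matches a plot where outliers can be drawn or hidden, as in panel (E).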

**Figure 2**
Near-saturation mutagenesis of *SMARCB1* regions by pooled PE.
(A) Schematic of the experimental workflow for *SMARCB1* essentiality screening using pooled PE, comprising lentiviral delivery of pegRNAs to HAP1:PEmax+MLH1dn cells (D0), and culture until day 34 with sequential harvesting for deep sequencing of pegRNA-ST cassettes and *SMARCB1* target regions. (B) Illustration of target regions (light purple) for saturation mutagenesis, centered on *SMARCB1* exons 8 and 9. (C) For each genomic position, the average number of pegRNAs per SNV included in the library is plotted. pegRNA coverage per variant was capped at 9. (D) The maximum ST editing rate for each SNV 10 days post transduction is plotted by position. (E) Spearman correlation of correct ST editing percentages with different pegRNA features for a pool of 8,612 pegRNAs programming SNVs, MNVs, and deletions (DELs) across *SMARCB1*. ST editing percentages were averaged across day 10 duplicates without ouabain before correlation analysis. (F) Violin plots of ST editing percentages observed across the pegRNA pool at multiple time points, with and without ouabain (obn) co-selection. Bold line indicates the median and dashed lines the interquartile ranges. (G) Heatmap of Spearman correlation coefficients determined between surrogate target (ST) and endogenous target (ET) variant frequencies across samples. ST variant frequencies were computed as the fraction of all STs containing a given variant. ET variant frequencies were adjusted with a variant-level background correction based on negative control sample sequencing. (H) Mean log₂ fold change in pegRNA frequency (day 34 over day 10) in ouabain-treated samples by variant consequence. Mean values are plotted as a function of ST editing thresholds used for pegRNA filtering. Bands correspond to 95% confidence intervals, and the gray curve shows the number of pegRNAs passing frequency and ST editing thresholds. The dashed vertical line corresponds to 75% correct ST edits, chosen as the threshold for analysis in (I) and (J). pegRNAs with frequencies lower than 6 × 10⁻⁵ on day 10 were excluded from analysis. (I) pegRNA log₂ fold change between day 34-ouabain and day 10-ouabain samples, grouped by consequence. (J) pegRNA scores from (I), averaged per variant (function scores). Significantly scored variants (q < 0.05) are indicated with stars. Boxplots: bold line, median; boxes, IQR; whiskers extend to points within 1.5× IQR. See also Figures S3–S6; Tables S3 and S4.
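The scoring steps described in panels (H) to (J) can be sketched roughly as follows: drop pegRNAs below the day-10 frequency cutoff, drop pegRNAs below the ST editing threshold, take the log₂ fold change in frequency (day 34 over day 10), and average per variant. This is an illustrative sketch, not the authors' pipeline. The function name, the record field names, the pseudocount, and the choice to treat the 75% threshold as inclusive are all assumptions; only the two cutoff values come from the caption.

```python
import math
from collections import defaultdict
from statistics import mean

MIN_DAY10_FREQ = 6e-5    # from the caption: lower frequencies are excluded
MIN_ST_EDIT_PCT = 75.0   # from the caption: threshold used for (I) and (J)


def function_scores(pegrnas, pseudocount=1e-6):
    """Average pegRNA log2 fold changes per variant after filtering.

    `pegrnas` is a list of dicts with the hypothetical keys
    'variant', 'freq_d10', 'freq_d34' and 'st_edit_pct'.
    The pseudocount is an assumption that guards against log2(0).
    """
    per_variant = defaultdict(list)
    for p in pegrnas:
        if p["freq_d10"] < MIN_DAY10_FREQ:
            continue  # too rare on day 10 to measure a reliable change
        if p["st_edit_pct"] < MIN_ST_EDIT_PCT:
            continue  # inefficient pegRNA, judged by its surrogate target
        ratio = (p["freq_d34"] + pseudocount) / (p["freq_d10"] + pseudocount)
        per_variant[p["variant"]].append(math.log2(ratio))
    return {variant: mean(lfcs) for variant, lfcs in per_variant.items()}
```

In an essentiality screen like this one, a strongly negative score indicates that cells carrying the variant were depleted over time. Significance testing (the q-values in panel J) is not covered by this sketch.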

**Figure 3**
Pooled prime editing of *MLH1* exon 10 identifies disease-associated variants.
(A) Experimental workflow of *MLH1* 6TG selection screens, comprising lentiviral delivery of pegRNAs to HAP1:PEmax and HAP1:PEmax+MLH1dn cells, ouabain co-selection (day 13–day 20), cell harvesting before (day 20) and after (day 34) 6TG challenge, and deep sequencing of pegRNA-ST cassettes and the *MLH1* target region. (B) For each condition, AUC measurements for pLoF variant identification are plotted as a function of ST editing threshold. For this analysis, synonymous variants were defined as pNeut and nonsense and canonical splice variants as pLoF. (C) Function scores for 401 variants are plotted by genomic position and colored by variant consequence. Significantly scored variants (q < 0.01) are indicated with stars. (D) The highest function score of all missense variants assayed at each amino acid (AA) position is shown on the MLH1 structure (PDB: 4P7A). (E) Correlation between function scores and CADD scores for 98 missense variants (Pearson's r = 0.50). (F) Function scores for 79 variants present in ClinVar, grouped by pathogenicity annotation. Boxplot: bold line, median; boxes, IQR; whiskers extend to points within 1.5× IQR. (G) Correlation between function scores (pegRNA-derived) and endogenous function scores for 205 variants (Pearson's r = 0.69). See also Figures S7–S12 and S14; Tables S5 and S6.
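The analysis in panel (B), AUC for separating pLoF from pNeut variants as a function of the ST editing threshold, can be sketched as below. This is a simplified illustration under stated assumptions: the function names and the `(score, st_edit_pct, label)` tuple layout are hypothetical, each variant is given a single ST editing value, and a higher function score is assumed to indicate loss of function (enrichment under 6TG selection). AUC is computed as the probability that a randomly chosen pLoF variant outscores a randomly chosen pNeut variant, counting ties as one half.

```python
def auc(plof_scores, pneut_scores):
    """Rank-based AUC: P(pLoF score > pNeut score), ties counted as 0.5."""
    if not plof_scores or not pneut_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in plof_scores:
        for n in pneut_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(plof_scores) * len(pneut_scores))


def auc_by_threshold(variants, thresholds):
    """Compute AUC at each ST editing threshold.

    `variants` is a list of (score, st_edit_pct, label) tuples, where
    label is 'pLoF' or 'pNeut' (hypothetical layout). Thresholds that
    leave either class empty map to None.
    """
    results = {}
    for t in thresholds:
        plof = [s for s, edit, lab in variants if edit >= t and lab == "pLoF"]
        pneut = [s for s, edit, lab in variants if edit >= t and lab == "pNeut"]
        results[t] = auc(plof, pneut) if plof and pneut else None
    return results
```

Raising the threshold removes poorly edited variants, which typically sharpens class separation at the cost of fewer scored variants; this is the trade-off the panel plots.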

**Figure 4**
Screening all non-coding *MLH1* variants in ClinVar for LoF effects.
(A) A library of 3,748 pegRNAs was designed to install 874 non-coding variants from ClinVar including 771 SNVs, 69 deletions (DELs), 25 insertions (INSs), and 9 MNVs. Variants, represented by blue lines, span the entire *MLH1* locus. (B) Function scores are plotted for 357 variants, grouped by pathogenicity annotation in ClinVar. Boxplot: bold line, median; boxes, IQR; whiskers extend to points within 1.5× IQR. (C) Function scores are plotted for 362 variants by genomic position and color-coded by variant consequence. Stars indicate variants for which q < 0.01. (D) The correlation between function scores of SNVs and SpliceAI scores is plotted (n = 296). (E) Detailed results for select regions are shown: exon 1 and upstream (left), exon 11 (middle), and intron 15 (right). Exonic regions are in gray, and variants are colored by consequence (upper panels) or ClinVar annotation (lower panels). See also Figures S8–S10 and S12–S14; Tables S7 and S8.
