Cell Genom. 2025 Apr 9;5(4):100814.
doi: 10.1016/j.xgen.2025.100814. Epub 2025 Mar 21.

High-throughput screening of human genetic variants by pooled prime editing

Michael Herger et al. Cell Genom.

Abstract

Multiplexed assays of variant effect (MAVEs) enable scalable functional assessment of human genetic variants. However, established MAVEs are limited by exogenous expression of variants or constraints of genome editing. Here, we introduce a pooled prime editing (PE) platform to scalably assay variants in their endogenous context. We first improve efficiency of PE in HAP1 cells, defining optimal prime editing guide RNA (pegRNA) designs and establishing enrichment of edited cells via co-selection. We next demonstrate negative selection screening by testing over 7,500 pegRNAs targeting SMARCB1 and observing depletion of efficiently installed loss-of-function (LoF) variants. We then screen for LoF variants in MLH1 via 6-thioguanine selection, testing 65.3% of all possible SNVs in a 200-bp region including exon 10 and 362 non-coding variants from ClinVar spanning a 60-kb region. The platform's overall accuracy for discriminating pathogenic variants indicates that it will be highly valuable for identifying new variants underlying diverse human phenotypes across large genomic regions.

Keywords: MAVE; MLH1; PE; SMARCB1; functional genomics; genome-editing technology; multiplexed assay of variant effect; precision medicine; prime editing; saturation mutagenesis.

Conflict of interest statement

Declaration of interests: The authors declare no competing interests.

Figures

Figure 1
Development of a pooled PE screening platform in HAP1
(A) Schematic of the screening workflow, comprising lentiviral delivery of pegRNAs, puromycin selection, optional enrichment for edited cells via co-selection, functional selection (e.g., essentiality and drug treatment), deep sequencing of lentiviral cassettes, and quantification of pegRNA activities from surrogate targets (STs) and pegRNA frequency changes reflecting selective effects. Function scores for variants are determined from pegRNA frequency changes, using ST editing percentages to filter out inefficient pegRNAs. (B) Schematic of the construct stably integrated into the HAP1 genome for 2A-coupled expression of PEmax, MLH1dn, and the blasticidin S resistance cassette, expressed from a minimal EF-1α promoter. (C) Schematic of the vector for lentiviral delivery of pegRNA libraries. Each vector encodes a pegRNA to install a resistance mutation for co-selection and a pegRNA from the library to engineer a variant to be assayed. Each vector also includes a surrogate target (ST) matching the genomic sequence where the variant is to be introduced and a pegRNA-specific barcode (BC). (D) Enrichment of cells with the PE-mediated T804N edit in exon 17 of ATP1A1 in response to ouabain (obn) treatment, as determined by amplicon sequencing. Percentages of reads with correct PE (T804N) or partial PE (silent mutation only) are shown. (E) Boxplot of ST editing percentages for 973 pegRNAs using different scaffolds. (F) Boxplot of log2 fold changes in ST editing percentages per variant using pegRNAs with the F+E scaffold compared to matched pegRNAs with the original scaffold (data from E). (G) Schematic of SNV and MNV types showing target edit (orange) and additional silent mutation (blue). Boxplot of correct ST editing percentages for the top pegRNA per variant grouped by edit type and PAM disruption (PAMd). (H) Boxplot of log2 fold changes in ST editing percentages comparing MNV- and SNV-pegRNA pairs (data from G).
Boxplots: bold line, median; boxes, interquartile range (IQR); whiskers extend to points within 1.5× IQR; outliers not shown in (E). See also Figures S1 and S2; Tables S1 and S2.
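The two quantities summarized in panels (F) and (H), and the boxplot convention stated above, can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the pseudocount guarding against zero editing percentages, and the "inclusive" quartile method are all assumptions made for illustration, and the example values are invented.

```python
import math
import statistics


def log2_fold_change(test_pct, ref_pct, pseudo=0.5):
    """log2 ratio of two ST editing percentages for a matched pegRNA pair.

    The pseudocount is an assumption of this sketch (it avoids log of zero);
    the legend does not state how zero editing was handled.
    """
    return math.log2((test_pct + pseudo) / (ref_pct + pseudo))


def boxplot_summary(values):
    """Median, quartiles, and whiskers reaching the most extreme points
    that still lie within 1.5 x IQR of the box."""
    # Quartile method is an assumption; plotting libraries differ slightly.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "n_outliers": len(values) - len(inside),
    }


if __name__ == "__main__":
    # Invented ST editing percentages for matched pegRNA pairs
    # (e.g., an alternative scaffold vs. the original scaffold).
    alternative = [40.0, 22.0, 8.0, 65.0]
    original = [10.0, 20.0, 9.0, 30.0]
    lfcs = [log2_fold_change(a, o) for a, o in zip(alternative, original)]
    print(boxplot_summary(lfcs))
```

A positive log2 fold change in this sketch means the first pegRNA of the pair edited its surrogate target more efficiently than its matched partner.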
Figure 2
Near-saturation mutagenesis of SMARCB1 regions by pooled PE
(A) Schematic of the experimental workflow for SMARCB1 essentiality screening using pooled PE, comprising lentiviral delivery of pegRNAs to HAP1:PEmax+MLH1dn cells (D0), and culture until day 34 with sequential harvesting for deep sequencing of pegRNA-ST cassettes and SMARCB1 target regions. (B) Illustration of target regions (light purple) for saturation mutagenesis, centered on SMARCB1 exons 8 and 9. (C) For each genomic position, the average number of pegRNAs per SNV included in the library is plotted. pegRNA coverage per variant was capped at 9. (D) The maximum ST editing rate for each SNV 10 days post transduction is plotted by position. (E) Spearman correlation of correct ST editing percentages with different pegRNA features for a pool of 8,612 pegRNAs programming SNVs, MNVs, and deletions (DELs) across SMARCB1. ST editing percentages were averaged across day 10 duplicates without ouabain before correlation analysis. (F) Violin plots of ST editing percentages observed across the pegRNA pool at multiple time points, with and without ouabain (obn) co-selection. Bold line indicates the median and dashed lines the interquartile ranges. (G) Heatmap of Spearman correlation coefficients determined between surrogate target (ST) and endogenous target (ET) variant frequencies across samples. ST variant frequencies were computed as the fraction of all STs containing a given variant. ET variant frequencies were adjusted with a variant-level background correction based on negative control sample sequencing. (H) Mean log2 fold change in pegRNA frequency (day 34 over day 10) in ouabain-treated samples by variant consequence. Mean values are plotted as a function of ST editing thresholds used for pegRNA filtering. Bands correspond to 95% confidence intervals, and the gray curve shows the number of pegRNAs passing frequency and ST editing thresholds. The dashed vertical line corresponds to 75% correct ST edits, chosen as the threshold for analysis in (I) and (J). pegRNAs with frequencies lower than 6 × 10⁻⁵ on day 10 were excluded from analysis. (I) pegRNA log2 fold change between day 34-ouabain and day 10-ouabain samples, grouped by consequence. (J) pegRNA scores from (I), averaged per variant (function scores). Significantly scored variants (q < 0.05) are indicated with stars. Boxplots: bold line, median; boxes, IQR; whiskers extend to points within 1.5× IQR. See also Figures S3–S6; Tables S3 and S4.
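The filtering and scoring steps described for panels (H) to (J) can be sketched as follows. This is a minimal illustration under stated assumptions, not the published pipeline: the record layout, the function name, the pseudocount, and the choice to treat both thresholds as inclusive are hypothetical, and the sketch omits any normalization or significance testing (q-values) the study applied.

```python
import math
from collections import defaultdict

MIN_DAY10_FREQ = 6e-5    # pegRNAs rarer than this on day 10 are excluded
MIN_ST_EDIT_PCT = 75.0   # minimum percentage of correct ST edits


def function_scores(pegrnas, pseudo=1e-6):
    """Average pegRNA log2 fold changes (day 34 over day 10) per variant.

    `pegrnas` is a list of dicts with the hypothetical keys
    'variant', 'freq_d10', 'freq_d34', and 'st_edit_pct'.
    The pseudocount is an assumption that keeps fully depleted
    pegRNAs (day 34 frequency of zero) finite.
    """
    per_variant = defaultdict(list)
    for p in pegrnas:
        # Drop pegRNAs that are too rare or too inefficient to be informative.
        if p["freq_d10"] < MIN_DAY10_FREQ or p["st_edit_pct"] < MIN_ST_EDIT_PCT:
            continue
        lfc = math.log2((p["freq_d34"] + pseudo) / (p["freq_d10"] + pseudo))
        per_variant[p["variant"]].append(lfc)
    # Function score = mean log2 fold change over the pegRNAs kept per variant.
    return {v: sum(x) / len(x) for v, x in per_variant.items()}


if __name__ == "__main__":
    # Invented example records.
    pool = [
        {"variant": "v1", "freq_d10": 4e-4, "freq_d34": 1e-4, "st_edit_pct": 90},
        {"variant": "v1", "freq_d10": 2e-4, "freq_d34": 2e-4, "st_edit_pct": 80},
        {"variant": "v1", "freq_d10": 3e-4, "freq_d34": 3e-3, "st_edit_pct": 10},
        {"variant": "v2", "freq_d10": 1e-5, "freq_d34": 1e-5, "st_edit_pct": 95},
    ]
    print(function_scores(pool))
```

In an essentiality screen of this kind, a strongly negative score indicates that cells carrying the variant dropped out between day 10 and day 34. Variants with no pegRNA passing both filters simply receive no score.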
Figure 3
Pooled prime editing of MLH1 exon 10 identifies disease-associated variants
(A) Experimental workflow of MLH1 6TG selection screens, comprising lentiviral delivery of pegRNAs to HAP1:PEmax and HAP1:PEmax+MLH1dn cells, ouabain co-selection (day 13–day 20), cell harvesting before (day 20) and after (day 34) 6TG challenge, and deep sequencing of pegRNA-ST cassettes and the MLH1 target region. (B) For each condition, AUC measurements for pLoF variant identification are plotted as a function of ST editing threshold. For this analysis, synonymous variants were defined as pNeut and nonsense and canonical splice variants as pLoF. (C) Function scores for 401 variants are plotted by genomic position and colored by variant consequence. Significantly scored variants (q < 0.01) are indicated with stars. (D) The highest function score of all missense variants assayed at each amino acid (AA) position is shown on the MLH1 structure (PDB: 4P7A). (E) Correlation between function scores and CADD scores for 98 missense variants (Pearson’s r = 0.50). (F) Function scores for 79 variants present in ClinVar, grouped by pathogenicity annotation. Boxplot: bold line, median; boxes, IQR; whiskers extend to points within 1.5× IQR. (G) Correlation between function scores (pegRNA-derived) and endogenous function scores for 205 variants (Pearson's r = 0.69). See also Figures S7–S12 and S14; Tables S5 and S6.
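The AUC-versus-threshold analysis of panel (B) can be sketched in a few lines of Python. The sketch makes several assumptions that the legend does not confirm: that higher function scores indicate loss of function (as expected if LoF variants are enriched under 6TG), that AUC is computed in its rank-based (Mann-Whitney) form, and that ST editing thresholds are inclusive. The tuple layout and function names are hypothetical.

```python
def auc(plof_scores, pneut_scores):
    """Probability that a random pLoF variant outscores a random pNeut
    variant, counting ties as one half (rank-based AUC)."""
    if not plof_scores or not pneut_scores:
        raise ValueError("both classes need at least one variant")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in plof_scores
        for n in pneut_scores
    )
    return wins / (len(plof_scores) * len(pneut_scores))


def auc_by_threshold(variants, thresholds):
    """AUC after keeping only variants whose ST editing percentage
    meets each threshold.

    `variants` is a list of (function_score, st_edit_pct, label) tuples,
    where label is 'pLoF' or 'pNeut' (hypothetical layout).
    Returns None for thresholds that leave either class empty.
    """
    result = {}
    for t in thresholds:
        kept = [v for v in variants if v[1] >= t]
        plof = [score for score, _, label in kept if label == "pLoF"]
        pneut = [score for score, _, label in kept if label == "pNeut"]
        result[t] = auc(plof, pneut) if plof and pneut else None
    return result


if __name__ == "__main__":
    # Invented example: inefficiently edited variants blur the separation.
    variants = [
        (2.0, 90, "pLoF"),
        (0.1, 80, "pNeut"),
        (-0.5, 30, "pLoF"),
        (0.0, 20, "pNeut"),
    ]
    print(auc_by_threshold(variants, [0, 75, 95]))
```

The invented data show the trade-off the panel explores: raising the ST editing threshold removes poorly edited variants and improves class separation, but it also reduces the number of variants that can be scored.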
Figure 4
Screening all non-coding MLH1 variants in ClinVar for LoF effects
(A) A library of 3,748 pegRNAs was designed to install 874 non-coding variants from ClinVar including 771 SNVs, 69 deletions (DELs), 25 insertions (INSs), and 9 MNVs. Variants, represented by blue lines, span the entire MLH1 locus. (B) Function scores are plotted for 357 variants, grouped by pathogenicity annotation in ClinVar. Boxplot: bold line, median; boxes, IQR; whiskers extend to points within 1.5× IQR. (C) Function scores are plotted for 362 variants by genomic position and color-coded by variant consequence. Stars indicate variants for which q < 0.01. (D) The correlation between function scores of SNVs and SpliceAI scores is plotted (n = 296). (E) Detailed results for select regions are shown: exon 1 and upstream (left), exon 11 (middle), and intron 15 (right). Exonic regions are in gray, and variants are colored by consequence (upper panels) or ClinVar annotation (lower panels). See also Figures S8–S10 and S12–S14; Tables S7 and S8.
