Nat Cell Biol. 2025 Mar;27(3):505-517.
doi: 10.1038/s41556-025-01622-z. Epub 2025 Feb 26.

Systematic reconstruction of molecular pathway signatures using scalable single-cell perturbation screens

Longda Jiang et al. Nat Cell Biol. 2025 Mar.

Abstract

Recent advancements in functional genomics have provided an unprecedented ability to measure diverse molecular modalities, but predicting causal regulatory relationships from observational data remains challenging. Here, we leverage pooled genetic screens and single-cell sequencing (Perturb-seq) to systematically identify the targets of signalling regulators in diverse biological contexts. We demonstrate how Perturb-seq is compatible with recent and commercially available advances in combinatorial indexing and next-generation sequencing, and perform more than 1,500 perturbations split across six cell lines and five biological signalling contexts. We introduce an improved computational framework (Mixscale) to address cellular variation in perturbation efficiency, alongside optimized statistical methods to learn differentially expressed gene lists and conserved molecular signatures. Finally, we demonstrate how our Perturb-seq derived gene lists can be used to precisely infer changes in signalling pathway activation for in vivo and in situ samples. Our work enhances our understanding of signalling regulators and their targets, and lays a computational framework towards the data-driven inference of an 'atlas' of perturbation signatures.

Conflict of interest statement

Competing interests: In the past 3 years, R.S. has received compensation from Bristol Myers Squibb, ImmunAI, Resolve Biosciences, Nanostring, 10x Genomics, Parse Biosciences and Neptune Bio. R.S. and H.-H.W. are co-founders and equity holders of Neptune Bio. H.-H.W. has been an employee at Neptune Bio since August 2023. N.I., G.L.-Y. and D.L. are employees and shareholders of Ultima Genomics. E.P. has been an employee at Parse Biosciences since December 2021 and owns stock in the company. The other authors declare no competing interests.

Figures

Figure 1.
(a). (Top) Experimental workflow for perturbing and stimulating each pathway. (Bottom, left) Schematic showing how Perturb-seq can be used to identify pathway-specific target gene sets. (Bottom, right) Totals for biological samples, perturbations, and cells profiled in this study. (b). Diagram of transcriptome and sgRNA barcoding using Parse Biosciences combinatorial indexing. RNA transcripts and sgRNAs are captured, reverse transcribed, and barcoded by poly(dT) and random hexamer primers so that both modalities share a set of cell barcodes. (c). Eight example single-cell heatmaps of one or two perturbations per signaling pathway, showing the top up- and down-regulated differentially expressed genes.
Figure 2.
(a). Overview of the Mixscale scoring calculation procedure and its application in the weighted differential expression (DE) test. (b). Relationship between the expression levels of the perturbation targets (y-axis) and the inferred perturbation scores (x-axis) across individual cells. Expression is calculated as the pseudobulk expression of 20 bins, into which cells are placed after ordering by Mixscale score. (c). Single-cell heatmap for JAK1 perturbation in A549 cells. Cells are ordered by Mixscale score. Even in cells where the JAK1 transcript is not detected, the Mixscale score correlates with the effective perturbation strength. (d). Comparison of false positive rates for the Mixscale weighted DE method (wmvReg) and the unweighted DE test computed under null simulations (Supplementary Methods). (e). Replication rates of identified DE genes across scRNA-seq replicates, when applying different DE methods.
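The binning described for panel (b) can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `pseudobulk_by_score_bins` is hypothetical, the bins are assumed to be near-equal in size, and "pseudobulk" is assumed here to mean the per-bin mean of one gene's expression. The caption does not specify any of these details.

```python
from typing import List, Sequence


def pseudobulk_by_score_bins(
    scores: Sequence[float],
    expression: Sequence[float],
    n_bins: int = 20,
) -> List[float]:
    """Order cells by perturbation score, split them into n_bins
    near-equal bins, and return the mean expression per bin.

    Assumptions (not from the source): equal-sized bins and a
    per-bin mean as the pseudobulk summary.
    """
    if len(scores) != len(expression):
        raise ValueError("scores and expression must have the same length")
    if n_bins < 1 or len(scores) < n_bins:
        raise ValueError("need at least one cell per bin")

    # Indices of cells sorted from lowest to highest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n_cells = len(order)

    means = []
    for b in range(n_bins):
        # Integer arithmetic keeps bin sizes within one cell of each other.
        start = b * n_cells // n_bins
        stop = (b + 1) * n_cells // n_bins
        members = order[start:stop]
        means.append(sum(expression[i] for i in members) / len(members))
    return means
```

For a knockdown that works as intended, the returned per-bin means would be expected to fall as the bin index (and hence the score) rises.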
Figure 3.
(a-d). Comparison of z-scores for top differentially expressed genes (DEGs) (y-axis) across different perturbations (x-axis) and across six cell lines. Each dot represents a unique combination of a perturbation, a cell line, and a DEG. The size of the dot represents the magnitude of its z-score produced by the Mixscale weighted DE test. (e). Overview of the MultiCCA decomposition method that extracts correlated perturbations within and across cell lines (Supplementary Methods). (f). Overview of the main regulators in the IFNγ pathway. (g, h). The first two perturbation programs for the IFNγ pathway, returned by MultiCCA decomposition. Each column indicates a combination of either a positive or negative regulator (upper labels) and a cell line (bottom labels), and each row indicates a top DEG from the program signature gene list. Additional perturbation programs are shown in Supplementary Figure 6.
Figure 4.
(a). Overlap between IFNβ program 1 genes identified by our Perturb-seq experiment and the IFNβ signatures curated by the MSigDB Hallmark collection (Supplementary Methods). (b). Gene set enrichments for a set of DEGs from IFNβ-stimulated monocytes (Supplementary Methods), using either Perturb-seq or MSigDB signature lists. Dashed line represents the Bonferroni-corrected threshold for statistical significance. (c). IFNβ module score comparing unstimulated and stimulated monocytes from an external dataset (Kang et al. 2018 Nat. Biotech) using the Perturb-seq unique gene set (left panel) or the shared Perturb-seq and MSigDB gene set (right panel). (d). Overlap of the IFNβ, IFNγ, and TNFβ pathway genes identified by our Perturb-seq experiment. (e). Same as (b), but for pathway-exclusive gene sets. Only the Perturb-seq gene lists correctly identify significance for IFNβ pathway lists. (f-h). Gene set module scores for IFNγ pathway genes, IRF1-associated genes, and IRF1-independent genes calculated in an external dataset that includes IRF1-deficient patients (Supplementary Methods). The IRF1-associated genes and IRF1-independent genes are identified using IFNγ programs 1 and 2 in our Perturb-seq data (Supplementary Methods).
Figure 5.
(a). Enrichment test for DEGs across COVID-19 severity groups from an external dataset (COvid-19 Multi-omics Blood Atlas, COMBAT). Rows represent gene sets from our Perturb-seq data, columns show cell types yielding DEGs between disease and healthy cells. Dot size denotes the odds ratio, color intensity indicates the adjusted P-value (Benjamini-Hochberg; * indicates p < 0.01). (b, c). Expression heatmap for the top 30 IFNγ and IFNβ pathway genes (including shared genes). Each column represents pseudobulk expression of CD14 monocytes within each individual (Supplementary Methods). Columns are ordered by increasing expression of a combined gene list of IFNγ and IFNβ genes. (d, e). Same as (b, c), but for pathway-exclusive gene lists. Only IFNβ-exclusive gene sets are coordinately up-regulated, consistent with the enrichment analysis in (a).
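The quantities plotted in panel (a), an odds ratio per gene set and a Benjamini-Hochberg adjusted P-value, can be sketched as follows. This is an illustration only: the caption does not name the test used, so a one-sided Fisher's exact (hypergeometric) test for over-representation is assumed, and the function names `enrichment` and `benjamini_hochberg` are hypothetical.

```python
from math import comb
from typing import Iterable, List, Sequence, Tuple


def enrichment(
    gene_set: Iterable[str],
    degs: Iterable[str],
    universe: Iterable[str],
) -> Tuple[float, float]:
    """Return (odds ratio, one-sided P-value) for over-representation
    of gene_set among degs, within a background universe of genes.

    Assumption: a one-sided Fisher's exact test; the source does not
    state which enrichment test was used.
    """
    background = set(universe)
    in_set = set(gene_set) & background
    in_deg = set(degs) & background

    a = len(in_set & in_deg)              # in gene set and DEG
    b = len(in_set - in_deg)              # in gene set, not DEG
    c = len(in_deg - in_set)              # DEG, not in gene set
    d = len(background) - a - b - c       # neither

    if b * c > 0:
        odds_ratio = (a * d) / (b * c)
    else:
        odds_ratio = float("inf") if a * d > 0 else float("nan")

    # Upper tail of the hypergeometric distribution: P(X >= a).
    total, set_size, deg_size = len(background), len(in_set), len(in_deg)
    tail = sum(
        comb(set_size, k) * comb(total - set_size, deg_size - k)
        for k in range(a, min(set_size, deg_size) + 1)
    )
    return odds_ratio, tail / comb(total, deg_size)


def benjamini_hochberg(pvals: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P-values, in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In a layout like the one in panel (a), `enrichment` would be called once per gene set and cell type, and the resulting P-values passed together to `benjamini_hochberg` before applying the 0.01 threshold.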
Figure 6.
(a). Unsupervised transcriptomic clustering of the mouse healing intestine Visium dataset (Parigi et al. 2022 Nat. Comm.). (b). Overlap between the clusters with elevated TGFβ activation and the anatomical regions annotated as exhibiting signs of ‘inflammation and hyperplasia’ based on the pathologist’s analysis in the original study. (c). Enrichment analysis for DEG identified for different clusters in the mouse healing intestine (Supplementary Methods). Rows represent gene sets from our Perturb-seq data, columns show cell types yielding DEGs between disease and healthy cells. Dot size denotes the odds ratio, color intensity indicates the adjusted P-value (Benjamini-Hochberg; * indicates p < 0.01). (d, e). The TGFβ activation scores in the mouse intestine before unrolling using our Perturb-seq TGFβ gene set (d) and the PROGENy TGFβ gene set (e). (f). The TGFβ activation scores in the digitally unrolled mouse intestine using our Perturb-seq TGFβ gene set. The Visium spots shown in (a) are digitally flattened into a proximal to distal direction from left to right on the x-axis (Supplementary Methods).
