Genome Biol. 2024 May 3;25(1):114.

doi: 10.1186/s13059-024-03255-1.

Kernel-based testing for single-cell differential analysis

A Ozier-Lafontaine¹, C Fourneaux², G Durif², P Arsenteva³, C Vallot⁴,⁵, O Gandrillon², S Gonin-Giraud², B Michel^#⁶, F Picard^#⁷

Affiliations

¹ Nantes Université, Centrale Nantes, Laboratoire de Mathématiques Jean Leray, CNRS UMR 6629, F-44000, Nantes, France. anthony.ozier-lafontaine@ec-nantes.fr.
² Laboratory of Biology and Modelling of the Cell, Université de Lyon, Ecole Normale Supérieure de Lyon, CNRS, UMR5239, Université Claude Bernard Lyon 1, Lyon, France.
³ Nantes Université, Centrale Nantes, Laboratoire de Mathématiques Jean Leray, CNRS UMR 6629, F-44000, Nantes, France.
⁴ CNRS UMR3244, Institut Curie, PSL University, Paris, France.
⁵ Translational Research Department, Institut Curie, PSL University, Paris, France.
⁶ Nantes Université, Centrale Nantes, Laboratoire de Mathématiques Jean Leray, CNRS UMR 6629, F-44000, Nantes, France. Bertrand.Michel@ec-nantes.fr.
⁷ Laboratory of Biology and Modelling of the Cell, Université de Lyon, Ecole Normale Supérieure de Lyon, CNRS, UMR5239, Université Claude Bernard Lyon 1, Lyon, France. franck.picard@ens-lyon.fr.

^# Contributed equally.

PMID: 38702740
PMCID: PMC11069218
DOI: 10.1186/s13059-024-03255-1

A Ozier-Lafontaine et al. Genome Biol. 2024.

Abstract

Single-cell technologies offer insights into molecular feature distributions, but comparing them poses challenges. We propose a kernel-testing framework for non-linear cell-wise distribution comparison, analyzing gene expression and epigenomic modifications. Our method allows feature-wise and global transcriptome/epigenome comparisons, revealing cell population heterogeneities. Using a classifier based on embedding variability, we identify transitions in cell states, overcoming limitations of traditional single-cell analysis. Applied to single-cell ChIP-Seq data, our approach identifies untreated breast cancer cells with an epigenomic profile resembling persister cells. This demonstrates the effectiveness of kernel testing in uncovering subtle population variations that might be missed by other methods.

Keywords: Differential analysis; Kernel methods; Single cell epigenomics; Single cell transcriptomics.
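To make the idea of a kernel two-sample test concrete, here is a minimal sketch in Python. It is not the authors' method or their ktest package: it swaps in a simpler, well-known relative, the maximum mean discrepancy (MMD) with a Gaussian kernel and a permutation p-value. The function names, the median-heuristic bandwidth, and the simulated data (a unimodal and a bimodal sample with equal means, loosely in the spirit of the "DB" alternative) are assumptions made for illustration only.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth):
    """Gaussian kernel matrix between the rows of x and the rows of y."""
    sq = (np.sum(x**2, axis=1)[:, None]
          + np.sum(y**2, axis=1)[None, :]
          - 2.0 * x @ y.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def mmd2(k, n_x):
    """Biased squared MMD from a pooled kernel matrix (first n_x rows are sample X)."""
    kxx = k[:n_x, :n_x].mean()
    kyy = k[n_x:, n_x:].mean()
    kxy = k[:n_x, n_x:].mean()
    return kxx + kyy - 2.0 * kxy

def mmd_permutation_test(x, y, n_perm=500, seed=0):
    """Return the observed squared MMD and a permutation p-value."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    n_x = len(x)

    # Median heuristic for the bandwidth (an assumption, not from the paper).
    diff = pooled[:, None, :] - pooled[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=-1))
    bandwidth = np.median(dist[np.triu_indices_from(dist, k=1)])

    k = gaussian_kernel(pooled, pooled, bandwidth)
    observed = mmd2(k, n_x)

    # Shuffle group labels by permuting rows and columns of the kernel matrix.
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(pooled))
        if mmd2(k[np.ix_(perm, perm)], n_x) >= observed:
            count += 1
    p_value = (count + 1) / (n_perm + 1)
    return observed, p_value

# Hypothetical data: same mean (0), different shape (unimodal vs. bimodal).
rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, size=(150, 1))
y = rng.choice([-2.0, 2.0], size=(150, 1)) + rng.normal(0.0, 0.5, size=(150, 1))

stat, p = mmd_permutation_test(x, y)
print(f"MMD^2 = {stat:.4f}, permutation p-value = {p:.4f}")
```

A test on means alone would see little difference between these two samples, whereas a kernel statistic compares the whole distributions, which is the kind of difference the abstract refers to.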

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

**Fig. 1**
Top: Examples of distributions of the simulated data. DE, classical difference in expression; DM, difference in modalities; DP, difference in proportions; DB, difference in both modalities and proportions with equal means. Bottom: Projection of cells on the discriminant axis ($T = 4$) for each alternative. The non-linear transform allows the separation of distributions on the discriminant axis.

**Fig. 2**
Comparison of DEA methods with respect to type I errors and power. Top: Type I errors are computed on raw p-values under $H_{0}$. The false discovery rate is computed on Benjamini-Hochberg adjusted p-values. Power is computed on raw p-values under $H_{1}$. The true discovery rate is computed on Benjamini-Hochberg adjusted p-values. Simulated data consist of 100 cells and 10000 genes (1000 DE, 9000 non-DE). Alternatives are simulated using DE, classical difference in expression (250 genes); DM, difference in modalities (250 genes); DP, difference in proportions (250 genes); DB, difference in both modalities and proportions with equal means (250 genes). Error rates are computed over 500 replicates. The truncation parameter is set to $T = 4$ for the Gauss-kernel.

**Fig. 3**
Top: Hierarchical clustering based on average AUCC scores computed between pairs of methods (over 18 datasets [51]). Bottom: Boxplot of the average expression (left) and proportion of zeros (right) of the top 500 DE genes for different DE methods (over 18 datasets [51]). Red: bulk methods, orange: pseudo-bulk methods, blue: single-cell methods. The truncation parameter is set to $T = 4$ for ktest (only univariate tests were performed)

**Fig. 4**
a Summarized distance graphs between conditions before (left) and after (right) splitting condition 48HREV into populations 48HREV-1 and 48HREV-2. b Cell densities of all compared conditions, before (left) and after (right) splitting condition 48HREV. c Cell densities of compared conditions projected on the discriminant axis between conditions 48HREV and 48HDIFF (left), 48HREV and 0H (middle), and 48HREV and 24H (right) with highlighted population 48HREV-1. d Boxplots of the variation of the gene expression along the five populations 0H, 24H, 48HDIFF, 48HREV-1, and 48HREV-2 for the three gene clusters. a, b, c, and d are obtained from scRT-qPCR data. The multivariate differential expression analysis was performed with $T = 10$.

**Fig. 5**
Differential analysis of scChIP-Seq data on breast cancer cells. a Cell densities of persister cells vs. untreated cells. Sub-populations of untreated cells were identified using a 3-component mixture model, which revealed persister-like, intermediate, and naive cells. b–d Violin plots of the top-10 differentially enriched H3K27me3 loci between the 3 sub-populations. Features are designated by the genomic coordinates of the ChIP-Seq peaks. Corresponding overlapping genes are provided in Table S3. Multivariate (a) and univariate (b–d) analyses were performed with $T = 5$.
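
The Fig. 2 caption reports false and true discovery rates computed on Benjamini-Hochberg adjusted p-values. As a reminder of what that adjustment does, here is a minimal sketch in Python of the standard step-up procedure. The function name and the example p-values are illustrative assumptions and are not taken from the paper or its software.

```python
import numpy as np

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale the i-th smallest p-value by m / i.
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.minimum(scaled, 1.0)
    return adjusted

# Hypothetical raw p-values for four features.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
print(adjusted)
```

Features whose adjusted p-value falls below a chosen level (for example 0.05) are declared significant, which controls the expected proportion of false discoveries among them at that level under the assumptions of the procedure.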

References

1. Angelidis I, Simon LM, Fernandez IE, Strunz M, Mayr CH, Greiffo FR, Tsitsiridis G, Ansari M, Graf E, Strom T-M, Nagendran M, Desai T, Eickelberg O, Mann M, Theis FJ, Schiller HB. An atlas of the aging lung mapped by single cell transcriptomics and deep tissue proteomics. Nat Commun. 2019;10(1):963.
2. Bach FR, Lanckriet GRG, Jordan MI. Multiple kernel learning, conic duality, and the SMO algorithm. In: Proceedings of the twenty-first international conference on machine learning, ICML ’04. New York: Association for Computing Machinery; 2004. p. 6.
3. Banerjee T, Bhattacharya BB, Mukherjee G. A nearest-neighbor based nonparametric test for viral remodeling in heterogeneous single-cell proteomic data. Ann Appl Stat. 2020;14(4):1777–1805. doi: 10.1214/20-AOAS1362.
4. Bartosovic M, Kabbe M, Castelo-Branco G. Single-cell CUT&Tag profiles histone modifications and transcription factors in complex tissues. Nat Biotechnol. 2021;39(7):825–835. doi: 10.1038/s41587-021-00869-9.
5. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. 1995.