This is a preprint.
simPIC: flexible simulation of paired-insertion counts for single-cell ATAC sequencing data
- PMID: 41040405
- PMCID: PMC12485805
- DOI: 10.1101/2025.09.21.676689
Abstract
The single-cell Assay for Transposase-Accessible Chromatin with sequencing (scATAC-seq) is increasingly used at population scale to study how genetic variation shapes chromatin accessibility across diverse cell types. This widespread adoption has created a need for computational methods that can handle complex biological and technical variation, yet method development is limited by the lack of flexible simulation tools that provide a known ground truth. Here, we present simPIC, a simulation framework for generating realistic scATAC-seq data across individuals and cell types. simPIC supports both population-scale and single-individual simulations and can model cell groups, batch effects, and genotype-dependent variation in accessibility. These features enable realistic benchmarking for tasks such as chromatin accessibility quantitative trait locus (caQTL) mapping. simPIC generates data that closely match real datasets and captures inter-individual and experimental variation better than existing tools.