Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
Review
. 2020 Jun 12:18:1429-1439.
doi: 10.1016/j.csbj.2020.06.012. eCollection 2020.

Single-cell ATAC sequencing analysis: From data preprocessing to hypothesis generation

Affiliations
Review

Single-cell ATAC sequencing analysis: From data preprocessing to hypothesis generation

Seungbyn Baek et al. Comput Struct Biotechnol J. .

Abstract

Most genetic variations associated with human complex traits are located in non-coding genomic regions. Therefore, understanding the genotype-to-phenotype axis requires a comprehensive catalog of functional non-coding genomic elements, most of which are involved in epigenetic regulation of gene expression. Genome-wide maps of open chromatin regions can facilitate functional analysis of cis- and trans-regulatory elements via their connections with trait-associated sequence variants. Currently, Assay for Transposase Accessible Chromatin with high-throughput sequencing (ATAC-seq) is considered the most accessible and cost-effective strategy for genome-wide profiling of chromatin accessibility. Single-cell ATAC-seq (scATAC-seq) technology has also been developed to study cell type-specific chromatin accessibility in tissue samples containing a heterogeneous cellular population. However, due to the intrinsic nature of scATAC-seq data, which are highly noisy and sparse, accurate extraction of biological signals and devising effective biological hypothesis are difficult. To overcome such limitations in scATAC-seq data analysis, new methods and software tools have been developed over the past few years. Nevertheless, there is no consensus for the best practice of scATAC-seq data analysis yet. In this review, we discuss scATAC-seq technology and data analysis methods, ranging from preprocessing to downstream analysis, along with an up-to-date list of published studies that involved the application of this method. We expect this review will provide a guideline for successful data generation and analysis methods using appropriate software tools and databases for the study of chromatin accessibility at single-cell resolution.

Keywords: ATAC sequencing; Chromatin accessibility; Single-cell ATAC sequencing; Single-cell RNA sequencing; Single-cell biology.

PubMed Disclaimer

Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

None
Graphical abstract
Fig. 1
Fig. 1
Schematic overview of a typical single-cell ATAC sequencing analysis workflow.
Fig. 2
Fig. 2
Schematic summary of two major strategies for single-cell adaptation of ATAC sequencing library generation: (a) split-pool cellular indexing and (b) microfluidics-based, and (c) their modified methods.
Fig. 3
Fig. 3
Integration of single-cell ATAC sequencing data with single-cell RNA sequencing data via experimental approaches and computational approaches. Integrative analysis of gene expression and chromatin accessibility for the same cell types can be used for confirming cell identity annotation and for facilitating generating new hypotheses for regulatory elements. For example, identification of peak-to-gene interactions can infer enhancer-promoter interactions; comparison between expression of a gene and accessibility of its TF-enriched regions across pseudotime can reveal kinetic relationship between transcription and regulatory regions; comparison between expression of a gene and accessibility of its TF-enriched regions across cell types or sample groups can reveal expression and accessibility signature associated with a cell type or subpopulation.

References

    1. Buenrostro J.D., Giresi P.G., Zaba L.C., Chang H.Y., Greenleaf W.J. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat Methods. 2013;10:1213–1218. - PMC - PubMed
    1. Valouev A., Johnson D.S., Sundquist A., Medina C., Anton E., Batzoglou S. Genome-wide analysis of transcription factor binding sites based on ChIP-Seq data. Nat Methods. 2008;5:829–834. - PMC - PubMed
    1. Langmead B., Salzberg S.L. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9:357–359. - PMC - PubMed
    1. Zhang Y., Liu T., Meyer C.A., Eeckhoute J., Johnson D.S., Bernstein B.E. Model-based analysis of ChIP-Seq (MACS) Genome Biol. 2008;9:R137. - PMC - PubMed
    1. Chen H., Lareau C., Andreani T., Vinyard M.E., Garcia S.P., Clement K. Assessment of computational methods for the analysis of single-cell ATAC-seq data. Genome Biol. 2019;20:241. - PMC - PubMed