Proc Natl Acad Sci U S A. 2018 Jul 24;115(30):7723-7728.
doi: 10.1073/pnas.1805681115. Epub 2018 Jul 9.

Integrative analysis of single-cell genomics data by coupled nonnegative matrix factorizations

Zhana Duren et al. Proc Natl Acad Sci U S A.

Abstract

When different types of functional genomics data are generated on single cells from different samples of cells from the same heterogeneous population, the clustering of cells in the different samples should be coupled. We formulate this "coupled clustering" problem as an optimization problem and propose the method of coupled nonnegative matrix factorizations (coupled NMF) for its solution. The method is illustrated by the integrative analysis of single-cell RNA-sequencing (RNA-seq) and single-cell ATAC-sequencing (ATAC-seq) data.
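To make the idea of coupling two factorizations concrete, here is a minimal sketch in Python. It is not the authors' formulation: the abstract does not state the objective function, so everything below is an assumption made for illustration. The sketch assumes an accessibility matrix `O` (peaks x cells), an expression matrix `E` (genes x cells, measured on different cells), and a nonnegative coupling matrix `A` (genes x peaks) that links peaks to genes. It factorizes `O ≈ W1 H1` and `E ≈ W2 H2` with multiplicative updates while rewarding agreement between `W2` and `A W1` through a term weighted by the hypothetical parameter `lam`. Columns of `W1` and `W2` are renormalized after each update to keep the scale fixed, which is a heuristic choice of this sketch.

```python
import numpy as np


def coupled_nmf(O, E, A, k, lam=1.0, n_iter=200, seed=0):
    """Toy coupled NMF (illustrative assumption, not the published method).

    O: nonnegative (peaks x cells_atac) accessibility matrix
    E: nonnegative (genes x cells_rna) expression matrix
    A: nonnegative (genes x peaks) coupling matrix
    k: number of clusters shared by both factorizations
    """
    rng = np.random.default_rng(seed)
    eps = 1e-10  # guards against division by zero
    W1 = rng.random((O.shape[0], k))
    H1 = rng.random((k, O.shape[1]))
    W2 = rng.random((E.shape[0], k))
    H2 = rng.random((k, E.shape[1]))

    def normalize(W, H):
        # Rescale columns of W to unit length; W @ H is left unchanged.
        norms = np.linalg.norm(W, axis=0) + eps
        return W / norms, H * norms[:, None]

    for _ in range(n_iter):
        # Update peak profiles; the lam term pulls W1 toward A.T @ W2.
        W1 *= (O @ H1.T + lam * (A.T @ W2)) / (W1 @ (H1 @ H1.T) + eps)
        W1, H1 = normalize(W1, H1)
        H1 *= (W1.T @ O) / (W1.T @ W1 @ H1 + eps)

        # Update gene profiles; the lam term pulls W2 toward A @ W1.
        W2 *= (E @ H2.T + lam * (A @ W1)) / (W2 @ (H2 @ H2.T) + eps)
        W2, H2 = normalize(W2, H2)
        H2 *= (W2.T @ E) / (W2.T @ W2 @ H2 + eps)

    return W1, H1, W2, H2
```

Under these assumptions, cluster labels for the two samples could be read off as `H1.argmax(axis=0)` and `H2.argmax(axis=0)`. Because both factorizations use the same `k` columns and the coupling term ties column `j` of `W1` to column `j` of `W2`, cluster `j` is intended to refer to the same subpopulation in both data types.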

Keywords: NMF; coupled clustering; single-cell genomic data.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Overview of the coupled-clustering method. (A) Single-cell gene expression and single-cell chromatin accessibility data. (B) Learning coupling matrix from public data. (C) Coupled clustering model. (D) Cluster-specific gene expression and chromatin accessibility.
Fig. 2.
(A) Clustering results of k-means, NMF, and our coupled clustering on simulated scRNA-seq data of CMP and MEP. (B) Clustering results of k-means, NMF, and our coupled clustering on simulated scATAC-seq data of CMP and MEP. (C) Comparison of k-means, NMF, and coupled clustering on simulated data of CMP and MEP.
Fig. 3.
(A) t-SNE plot of scRNA-seq data (Right) and scATAC-seq data (Left) from RA day 4. Colors represent cluster assignments from the coupled-clustering method. (B) Same t-SNE plots as in A. Colors represent the gene expression Z score and motif activity Z score of cluster-specific TFs (Ebf1, Gata4, and Rfx4). (C) Comparison of cluster-specific TFs’ expression Z score with motif activity Z score at the cluster level. (D) Overlap of genes near cluster-specific peaks with cluster-specific genes. The values represent Fisher’s exact test P value and fold change.
Fig. 4.
(A–C) Similarity of cluster-specific peaks with enhancers of 12 tissues at seven developmental stages. The numbers represent 10,000× the Jaccard index, and NA indicates that enhancer data for that tissue at that stage are not available. (D) Percentage of VISTA enhancers that overlapped with cluster-specific peaks. (E) GO enrichment of cluster-specific genes.
