Biomed Res Int. 2021 May 22:2021:9919080.
doi: 10.1155/2021/9919080. eCollection 2021.

Cell Heterogeneity Analysis in Single-Cell RNA-seq Data Using Mixture Exponential Graph and Markov Random Field Model

Yishu Wang et al. Biomed Res Int.

Abstract

Advanced single-cell profiling technologies promote the exploration of cell heterogeneity, and clustering of single-cell RNA sequencing (scRNA-seq) data enables the discovery of coexpressed genes and of network relationships between genes. In particular, single-cell profiling of circulating tumor cells (CTCs) can provide unique insights into tumor heterogeneity, including in triple-negative breast cancer (TNBC), while scRNA-seq leads to a better understanding of subclonal architecture and biological function. Despite numerous reports suggesting a direct correlation between CTCs and poor clinical outcomes, few studies have thoroughly characterized the heterogeneity of CTCs. In addition, TNBC is a disease with both intertumor and intratumor heterogeneity; it comprises various biologically distinct subgroups whose relationships with immune functions are not yet clearly established. In this article, we introduce a new scheme for the genotypic characterization of single-cell heterogeneity and apply it to CTC and TNBC single-cell RNA-seq data. First, we use an existing mixture exponential-family graph model to partition the cell-cell network; then, with a Markov random field model, we obtain more flexible network rewiring. Finally, we identify cell heterogeneity and network relationships from the highly coexpressed gene modules in the different cell subsets. Our results demonstrate that this scheme provides a reasonable and effective way to model different cell clusters and different biologically enriched gene clusters. Thus, using the genes that are coexpressed within each cell cluster, we can infer differences in tumor composition and diversity.
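The partition-then-module idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it substitutes a simple spectral modularity bipartition for the mixture exponential-family graph model, omits the Markov random field step entirely, and runs on a synthetic toy matrix. All function names, thresholds, and data below are hypothetical and chosen only for illustration.

```python
import numpy as np

def cell_network(expr, threshold=0.5):
    """Binary cell-cell adjacency from a genes x cells expression matrix."""
    corr = np.corrcoef(expr.T)            # cells x cells Pearson correlation
    adj = (corr > threshold).astype(float)
    np.fill_diagonal(adj, 0.0)            # no self-loops
    return adj

def modularity_bipartition(adj):
    """Stand-in partition step (NOT MixtureERGM): split the nodes into two
    groups by the sign of the leading eigenvector of the modularity matrix.
    Assumes the network has at least one edge."""
    degree = adj.sum(axis=1)
    modularity = adj - np.outer(degree, degree) / degree.sum()
    _, vecs = np.linalg.eigh(modularity)  # eigenvalues in ascending order
    return (vecs[:, -1] > 0).astype(int)

def coexpressed_pairs(expr, cells, min_corr=0.8):
    """Gene pairs strongly correlated across the cells of one subset."""
    corr = np.corrcoef(expr[:, cells])    # genes x genes within the subset
    i, j = np.triu_indices_from(corr, k=1)
    keep = corr[i, j] > min_corr
    return list(zip(i[keep].tolist(), j[keep].tolist()))

# Synthetic toy data (assumption): 20 genes x 12 cells, two cell groups.
rng = np.random.default_rng(0)
expr = np.zeros((20, 12))
expr[:10, :6] = 5.0                       # genes 0-9 are high in cells 0-5
expr[10:, 6:] = 5.0                       # genes 10-19 are high in cells 6-11
expr[:3, :6] += 3.0 * np.linspace(0.0, 2.0, 6)  # genes 0-2 co-vary in group 1
expr += rng.normal(0.0, 0.2, size=expr.shape)   # measurement noise

labels = modularity_bipartition(cell_network(expr))
group = np.flatnonzero(labels == labels[0])     # the subset containing cell 0
pairs = coexpressed_pairs(expr, group)
```

Here `labels` separates the two planted cell groups, and `pairs` lists the gene pairs that co-vary within the first group. A real analysis would replace the bipartition with the model-based clustering described in the abstract and would choose the number of clusters by a model selection criterion.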

Conflict of interest statement

The authors confirm that there are no conflicts of interest.

Figures

Figure 1
Clustering results on a synthetic dataset with three groups by (a) MixtureERGM and (b) role analysis. The original groups are denoted by circles of different colors, and the groups assigned by each algorithm are denoted by nodes of different colors.
Figure 2
Plot of the ICL of the MixtureERGM algorithm against the number of clusters for the CTC dataset.
Figure 3
(a) Gene expression profiles of circulating tumor cells clustered using the MixtureERGM algorithm with 5 underlying clusters. Each column represents one cell. (b) GO enrichment results for the highly coexpressed genes in these three cell clusters generated by the MixtureERGM algorithm. P values are denoted by the color bars.
Figure 4
Coexpression gene modules in cell cluster 1 (A), cell cluster 3 (B), and cell cluster 5 (C). Yellow bars indicate the negative log of P values.
Figure 5
Hub nodes generated by the MRF algorithm. Green nodes denote cell cluster 1, red nodes denote cell cluster 3, and light blue nodes denote cell cluster 5. (a) Yellow bars indicate the negative log of the P values for the significance of GO enrichment functions of the highly expressed genes in these hub cells.
Figure 6
(a) Gene network of differentially expressed genes between cell clusters 3 and 5 and hub node genes. (b) Gene network of differentially expressed genes between cell clusters 1 and 3 and hub node genes. (c) Gene network of differentially expressed genes between cluster 1 and cluster 5.
Figure 7
(a) Fully connected protein interaction network. Different colors denote different protein modules. (b) Enriched protein clusters in the protein-protein network translated from these differentially expressed genes.
Figure 8
Single-cell trajectory results. Different color nodes represent different cells.
Figure 9
Expression patterns of the genes “KRT19” and “KRT8” in different cells.
Figure 10
Enriched function items of the genes “JUNB,” “DUSP1,” “FOS,” “EGR1,” “KRT19,” “KRT8,” and “SPARC.” (a) Cell categories in which the genes are highly expressed, (b) comparison with information on human disease-associated genes, and (c) comparison with a COVID gene functional dataset.
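Several captions above report the negative log of P values for GO enrichment. The page does not state which test produced those P values; a common choice for over-representation analysis is the hypergeometric tail probability, sketched below under that assumption. The function name, its parameters, and the example counts are hypothetical.

```python
from math import comb, log10

def hypergeom_enrichment_p(population, annotated, module, overlap):
    """P(X >= overlap) when `module` genes are drawn without replacement from
    `population` genes, of which `annotated` carry the GO term of interest."""
    upper = min(annotated, module)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, module - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(population, module)

# Made-up counts: 20 genes in total, 5 annotated with the term,
# a 4-gene coexpression module, 3 of whose genes carry the term.
p_value = hypergeom_enrichment_p(population=20, annotated=5, module=4, overlap=3)
neg_log_p = -log10(p_value)   # the quantity plotted as a bar height
```

A smaller P value gives a taller bar on the negative-log scale, which is why the most strongly enriched terms stand out in such plots. In practice the P values would also be corrected for testing many GO terms at once.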
