Bioinform Adv. 2025 Aug 5;5(1):vbaf185.
doi: 10.1093/bioadv/vbaf185. eCollection 2025.

Marker genes reveal dynamic features of cell evolving processes

Wenjie Cao et al. Bioinform Adv.

Abstract

Motivation: Embryonic cells eventually evolve into various types of mature cells, a process in which cell fate determination plays a pivotal role, but the dynamic features of this process remain elusive.

Results: We analyze four single-cell RNA sequencing datasets: mouse embryo cells, mouse embryonic fibroblasts, human bone marrow, and intestine organoid. We show that the key (highly expressed) genes of each organism exhibit different statistical features and expression patterns before and after branching. For example, in mouse embryo cells, the mRNA distribution of the gene Gata3 is bimodal before branching, unimodal at the branching point, and trimodal for one branch but bimodal for the other. Moreover, one distribution mode is the same before and after branching, which would account for the maintenance of genetic information in a complex cell evolving process. Machine learning reveals that, along the cell pseudo-time trajectory, the strength with which one key gene regulates another is fundamentally increasing before branching but is always monotonically increasing after branching; the burst size and frequency of key genes are always monotonically decreasing before branching, but monotonically increasing for one branch and monotonically decreasing for the other. Our results unveil essential features of dynamic cell processes and can supplement existing methods for accurately screening marker genes of cell fate determination.

Availability and implementation: The implementation of CFD is available at https://github.com/cellwj/CFD and the preprocessed data is available at https://zenodo.org/records/14367638.

Keywords: cell fate determination; single-cell RNA sequencing data; marker gene; cell process; developmental branch.
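The abstract reports trends in burst size and burst frequency but does not say how they are estimated. As a minimal sketch, not the authors' method, the snippet below uses a common moment-based estimator. It assumes the bursty limit of the two-state (telegraph) model of gene expression, in which mRNA counts follow a negative binomial distribution with mean f·b and Fano factor 1 + b, where b is the mean burst size and f is the burst frequency in units of the mRNA degradation rate. The function name `burst_moments` and the toy counts are hypothetical.

```python
from statistics import mean, variance


def burst_moments(counts):
    """Moment-based burst estimates for one gene in one group of cells.

    Assumption (not taken from the paper): counts are negative-binomially
    distributed, so that
        mean        = burst_frequency * burst_size
        var / mean  = 1 + burst_size
    Returns (burst_size, burst_frequency), or None when the counts are not
    over-dispersed (variance <= mean), where the estimator is undefined.
    """
    m = mean(counts)
    v = variance(counts)  # unbiased sample variance; needs >= 2 cells
    if m <= 0 or v <= m:
        return None
    burst_size = v / m - 1.0          # Fano factor minus one
    burst_frequency = m / burst_size  # in units of the degradation rate
    return burst_size, burst_frequency


if __name__ == "__main__":
    # Made-up toy counts for a single gene, for illustration only.
    toy_counts = [0, 0, 0, 4, 8, 12]
    print(burst_moments(toy_counts))
```

Applying such an estimator separately to the cells in each pseudo-time window would give one (burst size, burst frequency) pair per window, which is one way to inspect whether the trend is increasing or decreasing along a branch.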

Conflict of interest statement

None declared.

Figures

Figure 1.
Framework of the three-level model and a brief description of its major steps. The framework, which balances and coordinates data-driven and model-driven approaches, is proposed to reveal the global characteristics and mechanisms of cell-type dynamics and transcriptional burst dynamics in four different types of datasets.
Figure 2.
Visualization of clustering and branching trajectories for mouse embryo cells: (a) the clustering result obtained by the UMAP method; and (b) the clustering result for cell evolving stages.
Figure 3.
Characteristics of the mRNA distributions for key gene Gata3 before and after branching as well as at the branching point: (a) before branching; (b) at the branching point; (c) for one branch; (d) for the other branch. The blue dashed curves represent the distributions of data points, while the red curves represent the fitting results obtained using mechanistic models of gene expression.
Figure 4.
Different modes of the mRNA distributions for key gene Sox2 before and after branch as well as at branching point: (a) before branch; (b) at branching point; (c) for one branch; and (d) for the other branch.
Figure 5.
Changes in the joint probability distribution of the two key genes Gata3 and Sox2 in the ME dataset: (a) before branching; (b) at the branching point; (c) after branch 1; and (d) after branch 2.
Figure 6.
(a, b) Changes in the regulation strength along the pseudo-time trajectory of the ME dataset, where (a) key gene Gata3 regulates key gene Sox2; (b) Sox2 regulates Gata3. (c, d) Wasserstein distances between genes Gata3 and Sox2 at different cell stages, where (c) the Wasserstein distance matrix between Gata3 and Sox2; (d) the Wasserstein distance between Gata3 and Sox2 at different cell stages.
Figure 7.
Changes in burst size and frequency of key genes Gata3 and Sox2 at different cell stages in the ME dataset: burst size (a) and burst frequency (b) of Gata3; burst size (c), and burst frequency (d) of Sox2.
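The caption of Figure 6(c, d) refers to Wasserstein distances between the Gata3 and Sox2 expression distributions at different cell stages. The sketch below illustrates one way such a comparison could be computed for one-dimensional samples, using the identity that the 1-Wasserstein distance equals the integral of the absolute difference between the two empirical CDFs. It is an illustration under assumptions, not the authors' implementation: the function names, the stage labels, and the toy expression values are all hypothetical.

```python
from bisect import bisect_right


def wasserstein_1d(u, v):
    """1-Wasserstein distance between the empirical distributions of u and v.

    Computed as the integral of |F_u(x) - F_v(x)| over x, where F_u and F_v
    are the empirical CDFs. Samples may have different sizes.
    """
    if len(u) == 0 or len(v) == 0:
        raise ValueError("both samples must be non-empty")
    u_sorted, v_sorted = sorted(u), sorted(v)
    points = sorted(set(u_sorted) | set(v_sorted))
    total = 0.0
    # Both CDFs are constant on each interval between consecutive points.
    for left, right in zip(points, points[1:]):
        cdf_u = bisect_right(u_sorted, left) / len(u_sorted)
        cdf_v = bisect_right(v_sorted, left) / len(v_sorted)
        total += abs(cdf_u - cdf_v) * (right - left)
    return total


def distance_per_stage(expression_by_stage):
    """Map each stage to the distance between its two genes' samples.

    expression_by_stage: dict of stage -> (gene_a_values, gene_b_values).
    """
    return {
        stage: wasserstein_1d(gene_a, gene_b)
        for stage, (gene_a, gene_b) in expression_by_stage.items()
    }


if __name__ == "__main__":
    # Made-up toy values, for illustration only (not data from the paper).
    toy = {
        "before_branch": ([0, 1, 3], [5, 6, 8]),
        "branch_point": ([2, 2, 4], [2, 4, 2]),
    }
    for stage, dist in distance_per_stage(toy).items():
        print(stage, dist)
```

A small distance at a given stage indicates that the two genes have similar expression distributions among the cells of that stage, and a large one indicates that the distributions have diverged.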
