Brief Bioinform. 2024 Mar 27;25(3):bbae216.
doi: 10.1093/bib/bbae216.

Data-driven selection of analysis decisions in single-cell RNA-seq trajectory inference

Xiaoru Dong et al. Brief Bioinform. 2024.

Abstract

Single-cell RNA sequencing (scRNA-seq) experiments have become instrumental in developmental and differentiation studies, enabling the profiling of cells at a single or multiple time-points to uncover subtle variations in expression profiles reflecting underlying biological processes. Benchmarking studies have compared many of the computational methods used to reconstruct cellular dynamics; however, researchers still encounter challenges in their analysis due to uncertainty with respect to selecting the most appropriate methods and parameters. Even among universal data processing steps used by trajectory inference methods such as feature selection and dimension reduction, trajectory methods' performances are highly dataset-specific. To address these challenges, we developed Escort, a novel framework for evaluating a dataset's suitability for trajectory inference and quantifying trajectory properties influenced by analysis decisions. Escort evaluates the suitability of trajectory analysis and the combined effects of processing choices using trajectory-specific metrics. Escort navigates single-cell trajectory analysis through these data-driven assessments, reducing uncertainty and much of the decision burden inherent to trajectory inference analyses. Escort is implemented in an accessible R package and R/Shiny application, providing researchers with the necessary tools to make informed decisions during trajectory analysis and enabling new insights into dynamic biological processes at single-cell resolution.

Keywords: RNA-seq; pseudotime inference; single cell; trajectory inference.

Figures

Figure 1
Analysis choices significantly impact trajectory estimation in scRNA-seq data. For three different choices of selected genes and dimension reduction methods, trajectory inference and pseudotime estimation were performed on an scRNA-seq dataset of hematopoietic stem cells [53]. (A) Dimension-reduced spaces and estimated trajectories with cells colored by cell type. The plot title in each column indicates the dimension reduction used (MDS or UMAP) and the number of highly variable genes selected (300 or 1000). (B) Pseudotime distributions for each set of analysis choices. (C) Normalized gene expression as a function of pseudotime for Cbx1. Cells are colored by pseudotime, i.e., their location along the trajectory. Abbreviations: MDS = multidimensional scaling, UMAP = uniform manifold approximation and projection, HVG = highly variable gene.
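One of the analysis choices varied in Figure 1 is the number of highly variable genes retained. A minimal sketch of that step follows, assuming a deliberately simplified criterion (plain per-gene variance of normalized expression) rather than the mean-variance modeling that dedicated tools use. The function name `top_variable_genes` and the gene labels are hypothetical and are not part of Escort.

```python
from statistics import pvariance


def top_variable_genes(expr, n_genes):
    """Return the n_genes genes with the largest expression variance.

    expr maps a gene name to its normalized expression values across cells.
    Ranking by raw variance is a simplifying assumption for illustration;
    ties are broken alphabetically so the result is deterministic.
    """
    variances = {gene: pvariance(values) for gene, values in expr.items()}
    ranked = sorted(variances, key=lambda gene: (-variances[gene], gene))
    return ranked[:n_genes]


# Hypothetical toy matrix: three genes measured in four cells.
toy_expr = {
    "GeneA": [0.0, 0.0, 0.0, 0.0],  # constant, carries no signal
    "GeneB": [0.0, 5.0, 0.0, 5.0],  # strongly variable
    "GeneC": [1.0, 2.0, 1.0, 2.0],  # mildly variable
}
selected = top_variable_genes(toy_expr, 2)
```

Changing `n_genes` (for example from 300 to 1000 on a real dataset) changes the input to dimension reduction, which is one way the downstream embedding and trajectory can differ.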
Figure 2
Trajectory accuracy is affected by the choice of dimension reduction algorithm and the number of highly variable genes included. (A) Performance of different embeddings across all eight simulated scenarios. Embeddings were ranked within each dataset separately for the two metrics. The ranks were scaled so that a lower rank indicated better within-dataset performance. (B) Similar to (A), using Monocle3.
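The within-dataset ranking described in the Figure 2 caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the scaling to the interval [0, 1], the average-rank handling of ties, the function name `scaled_ranks`, and the embedding labels are all hypothetical choices.

```python
def scaled_ranks(scores, higher_is_better):
    """Rank embeddings within one dataset and scale ranks to [0, 1].

    scores maps an embedding label to its metric value. A scaled rank of 0
    is the best embedding in the dataset and 1 is the worst. Tied values
    share the average of the positions they occupy.
    """
    if not scores:
        return {}
    sign = -1.0 if higher_is_better else 1.0
    ordered = sorted(scores.items(), key=lambda item: sign * item[1])
    ranks = {}
    start = 0
    while start < len(ordered):
        end = start
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(ordered) and ordered[end + 1][1] == ordered[start][1]:
            end += 1
        average_position = (start + end) / 2
        for index in range(start, end + 1):
            ranks[ordered[index][0]] = average_position
        start = end + 1
    denominator = max(len(ordered) - 1, 1)
    return {label: rank / denominator for label, rank in ranks.items()}


# Hypothetical metric values for three embeddings of one dataset.
correlation = {"UMAP_300": 0.9, "MDS_300": 0.5, "UMAP_1000": 0.7}
error = {"UMAP_300": 0.1, "MDS_300": 0.1, "UMAP_1000": 0.3}

correlation_ranks = scaled_ranks(correlation, higher_is_better=True)
error_ranks = scaled_ranks(error, higher_is_better=False)
```

Ranking within each dataset before comparing removes differences in metric scale between datasets, so a lower scaled rank always means better relative performance regardless of whether the underlying metric is a correlation or an error.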
Figure 3
Overview of Escort. Schematic of the Escort workflow. (A) The first step detects the presence of a trajectory signal in the dataset before proceeding to evaluations of embeddings. (B) Various metrics are used to evaluate user-defined embeddings regardless of the ultimate trajectory inference method to be used. (C) In the final step, the preferred trajectory inference method of the user is used to fit a preliminary trajectory to evaluate method-specific hyperparameters. (D) Based on the overall score, embeddings are classified as either recommended or non-recommended.
Figure 4
Trajectory assessment performance of Escort on simulated datasets. (A) The accuracy of trajectories generated on nine different embedding options for each of the eight simulated datasets is shown for different metrics: Kendall rank correlation and mean squared error. Simulated scenarios differ in terms of true trajectory topology (denoted by color) and simulator methods. The y-axis displays the values for the accuracy metric. (B) Each embedding’s Escort score (x-axis) versus the value for each accuracy metric (y-axis) is shown, with points colored according to their classification by Escort.
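The two accuracy metrics named in the Figure 4 caption compare an estimated pseudotime against the known simulated one. A minimal sketch follows, assuming the simple tau-a variant of the Kendall rank correlation and assuming that both pseudotimes are rescaled to [0, 1] before the mean squared error is taken. Neither assumption is confirmed by the abstract, and the function names are hypothetical.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant pairs) / total pairs.

    Pairs tied in either input count as neither concordant nor discordant.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    score = 0
    for i, j in combinations(range(len(x)), 2):
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            score += 1
        elif product < 0:
            score -= 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return score / n_pairs


def rescale(values):
    """Min-max rescale to [0, 1]; a constant input maps to all zeros."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0] * len(values)
    return [(value - low) / (high - low) for value in values]


def pseudotime_mse(true_pt, est_pt):
    """Mean squared error between rescaled true and estimated pseudotime."""
    if len(true_pt) != len(est_pt) or not true_pt:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = zip(rescale(true_pt), rescale(est_pt))
    return sum((t - e) ** 2 for t, e in pairs) / len(true_pt)


# Hypothetical example: five cells, with two adjacent cells swapped in order.
true_pseudotime = [0, 1, 2, 3, 4]
estimated_pseudotime = [0.0, 0.2, 0.1, 0.7, 1.0]

tau = kendall_tau(true_pseudotime, estimated_pseudotime)
mse = pseudotime_mse(true_pseudotime, estimated_pseudotime)
```

The rank correlation depends only on cell ordering, while the mean squared error is also sensitive to how cells are spaced along the trajectory, so the two metrics can disagree about which embedding is best.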
Figure 5
Trajectory assessment performance of Escort on public datasets. (A) The accuracy of trajectories generated on nine different embedding options is shown for five publicly available datasets assessed using different metrics: Kendall rank correlation and mean squared error. The colors distinguish each embedding classification by Escort, in addition to those embeddings that failed in the second step. The y-axis displays the values for accuracy metrics. The x-axis corresponds to recommendations generated by Escort. (B) Similar to (A) with the x-axis showing the Escort score.
Figure 6
Analysis of transdifferentiation of hypertrophic chondroblasts using an Escort-guided trajectory. (A) UMAP of the original paper’s embedding and the Monocle3-based trajectory. (B) UMAP of the original paper’s embedding using Slingshot to fit a trajectory. (C) Escort-recommended embedding using Slingshot to fit a trajectory. (D) Correlation of pseudotime between the two Lineage A trajectories. (E) Distribution of knots across all significantly dynamic genes for Lineage A. (F) Gene expression as a function of pseudotime for Snorc and Id2. (G–I) Similar to (D–F), but for Hapln1 and Pth1r in Lineage B.

