Nat Commun. 2021 Aug 16;12(1):4940.
doi: 10.1038/s41467-021-25133-1.

LineageOT is a unified framework for lineage tracing and trajectory inference

Aden Forrow et al. Nat Commun. 2021.

Abstract

Understanding the genetic and epigenetic programs that control differentiation during development is a fundamental challenge, with broad impacts across biology and medicine. Measurement technologies like single-cell RNA-sequencing and CRISPR-based lineage tracing have opened new windows on these processes, through computational trajectory inference and lineage reconstruction. While these two mathematical problems are deeply related, methods for trajectory inference are not typically designed to leverage information from lineage tracing and vice versa. Here, we present LineageOT, a unified framework for lineage tracing and trajectory inference. Specifically, we leverage mathematical tools from graphical models and optimal transport to reconstruct developmental trajectories from time courses with snapshots of both cell states and lineages. We find that lineage data helps disentangle complex state transitions with increased accuracy using fewer measured time points. Moreover, integrating lineage tracing with trajectory inference in this way could enable accurate reconstruction of developmental pathways that are impossible to recover with state-based methods alone.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Schematic of the LineageOT model and inference procedure.
a A lineage tree embedded in two-dimensional gene expression space. As cells change state over time, they trace out paths. Branches in the tree correspond to cell divisions, giving rise to four cells at the measurement time (red circles). Each cell has a barcode to track its lineage. Starting from the ancestral barcode sequence AAAA, mutations are indicated with a red star on the lineage tree and the change to the sequence is shown in red. b Embedded lineage trees from two independent realizations of the developmental process measured at times t1 (blue) and t2 (red). c The setup from (b) is shown in a 3D plot with lineage trees visualized in the vertical dimension. For each time point, we observe cell states (dots) and also the lineage tree, but not the lineage tree embedded in state space. d A purely state-based algorithm would fail to recover the correct trajectories in this example. Green lines connect cells at t2 to their nearest neighbor at t1. Dashed lines indicate erroneous connections. e, f The LineageOT procedure consists of two steps. e Adjust cells at time t2 (purple arrows), based on lineage information. Cells with shared lineage are moved closer together, towards an estimate of ancestral state (solid dots). f Infer a coupling (green lines) connecting the adjusted cells from time t2 (red) to cells from time t1 (blue). This corrects the mistake made in (d).
Fig. 2. Complex trajectories in C. elegans development.
a UMAP of 81286 C. elegans cells, using coordinates provided by Packer et al. Color indicates estimated time since fertilization following the colorbar in (b). b In the boxed region from (a), multiple developmental trajectories in the hypodermis converge to the same UMAP coordinates, suggesting a convergence in gene expression.
Fig. 3. When tested on lineage-labeled C. elegans data, LineageOT outperforms optimal transport with no lineage information.
a Relative accuracy of optimal transport (OT) and LineageOT on the 5123 cells with complete lineage annotations. Errors were normalized by dividing by the error of the noninformative independent coupling. b The error in predicting ancestor states, like the error for predicting descendant states (Fig. S5), is lower for most cells with LineageOT. Here each point represents one cell from the 270 min time point, which was coupled to the 210 min time point. The red line marks equal error for both methods. For each method in both (a, b) and (f, g), we chose the entropy parameters that gave the minimum error from parameter scans like those in (c, d). LineageOT consistently improves on Waddington-OT for reasonable values of the entropy parameter, both in ancestor error (c) and descendant error (d), shown here for the 210–270 min couplings. e UMAP visualization of the cells from the 210 (blue) and 270 min (red) time points. f, g Here, in the same UMAP, cells are colored by the ancestor (f) or descendant (g) error from Waddington-OT minus the same error from LineageOT. Blue indicates better performance by LineageOT, red better performance by Waddington-OT. The cells from 210 min and 270 min in (f) and (g), respectively, are gray, as the corresponding error metric does not apply to them.
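The normalization in panel (a) can be sketched as follows. The error definition used here is an assumption, not taken from the caption: for each late cell, it is the coupling-weighted squared distance between candidate early cells and that cell's true ancestor state. The function names `ancestor_error` and `normalized_error` and the toy data are hypothetical. The only step drawn directly from the caption is dividing by the error of the independent coupling, which is the outer product of the two marginals.

```python
import numpy as np

def ancestor_error(coupling, early_states, true_ancestor_states):
    """Assumed metric: coupling-weighted squared distance between each
    candidate early cell and the true ancestor state of each late cell."""
    diffs = early_states[:, None, :] - true_ancestor_states[None, :, :]
    sq = (diffs ** 2).sum(axis=2)          # shape (n_early, n_late)
    return float((coupling * sq).sum())

def normalized_error(coupling, early_states, true_ancestor_states):
    """Divide by the error of the independent coupling with the same marginals."""
    a = coupling.sum(axis=1)
    b = coupling.sum(axis=0)
    independent = np.outer(a, b)
    return (ancestor_error(coupling, early_states, true_ancestor_states)
            / ancestor_error(independent, early_states, true_ancestor_states))

# Toy data (assumed): two early cells, three late cells with known parents.
early = np.array([[0.0, 0.0], [4.0, 0.0]])
true_parent = np.array([0, 0, 1])
true_ancestors = early[true_parent]

# A coupling that puts all of each late cell's mass on its true parent.
perfect = np.zeros((2, 3))
perfect[true_parent, np.arange(3)] = 1.0 / 3.0

# The noninformative coupling with the same marginals.
independent = np.outer(perfect.sum(axis=1), perfect.sum(axis=0))

perfect_score = normalized_error(perfect, early, true_ancestors)
independent_score = normalized_error(independent, early, true_ancestors)
```

Under this definition a normalized error of 1 means the coupling is no more informative than pairing cells at random according to the marginals, and values closer to 0 mean better ancestor prediction.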
Fig. 4. LineageOT matches the performance of Waddington-OT for simple trajectories and exceeds it for complex trajectories.
a–c For a simple bifurcation, optimal transport alone works well and adding lineage information makes little difference. a We simulated a cluster of cells at an early time point splitting into two clusters at a later time point. Green lines connect ancestors in blue to descendants in red in (a, d, g, k). The ancestor errors (b) and descendant errors (c) are similar for optimal transport (OT, orange) and LineageOT (blue) with any entropy parameter, even when LineageOT is given an imperfect tree fitted to simulated barcodes (green). d–f For a convergent trajectory, LineageOT significantly improves ancestor prediction with no loss of accuracy in descendant prediction, even with an imperfectly fitted lineage tree. d Here we simulated two early clusters that each split; later, two of the resulting clusters merge together. Using LineageOT reduces error substantially for ancestor prediction (e) and slightly for descendant prediction (f). g–i The improvement due to lineage information when trajectories converge does not require nearby unconverged clusters. Here we see qualitatively similar improvement for two early clusters whose distributions of descendant cells almost entirely overlap. k–m With sufficient time between samples, clusters of cells may move closer to early time point cells that are not their ancestors. k In this simulation, after two early clusters each split, two of the late clusters are closer to non-ancestral cells than to their true ancestors. Optimal transport couples clusters incorrectly, leading to high error for predicting both ancestors (l) and descendants (m). LineageOT corrects the errors in this example by averaging with other clusters that are mapped correctly.
