veloVI (Gayoso, Weiler et al., Nat Methods 2023)
Gayoso & Weiler (equal contribution), Lotfollahi, Klein, Hong, Streets, Fabian Theis & Nir Yosef. Nature Methods 21:50–59 (2024 issue; published online 2023-09-21). UC Berkeley / Helmholtz Munich / TUM / Sanger / CZ Biohub. Source-page grounding for the veloVI entity; predecessor of RegVelo.
Summary
veloVI (velocity variational inference) recasts the scVelo dynamical model as a variational autoencoder: a neural encoder maps each cell’s unspliced and spliced abundances into a shared low-dimensional cell representation, and a decoder outputs cell-gene-specific latent time, state-assignment probabilities, and kinetic rates, which together parameterize the likelihood. Its headline contribution is transcriptome-wide velocity uncertainty, split into an intrinsic uncertainty (posterior spread of each velocity vector) and an extrinsic uncertainty (variability among a cell’s neighbours’ predicted future states), plus a permutation-based score that indicates whether RNA velocity is appropriate for a given dataset at all: shuffling the abundances degrades model fit only where genuine transient dynamics exist. For the wiki, veloVI is the uncertainty-and-applicability branch: it does not anchor physical time, but it is honest about where and whether velocity is trustworthy, a constructive, method-internal echo of velocity-skepticism.
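As a reminder of what the decoder's outputs feed into, here is a minimal sketch of the per-gene dynamical model that scVelo and veloVI share: the closed-form unspliced/spliced abundances for one transcriptional state, and the velocity βu − γs. The function names are hypothetical, the closed form assumes β ≠ γ, and none of this is veloVI's own code; it only illustrates how latent time and kinetic rates determine the expected abundances that enter the likelihood.

```python
import math


def dynamical_state(alpha, beta, gamma, tau, u0=0.0, s0=0.0):
    """Expected (unspliced, spliced) after time tau within one state.

    Solves du/dt = alpha - beta*u and ds/dt = beta*u - gamma*s
    from initial condition (u0, s0). Assumes beta != gamma.
    """
    if math.isclose(beta, gamma):
        raise ValueError("this closed form assumes beta != gamma")
    eb = math.exp(-beta * tau)
    eg = math.exp(-gamma * tau)
    u = u0 * eb + alpha / beta * (1.0 - eb)
    s = (
        s0 * eg
        + alpha / gamma * (1.0 - eg)
        + (alpha - beta * u0) / (gamma - beta) * (eg - eb)
    )
    return u, s


def velocity(beta, gamma, u, s):
    """RNA velocity of the spliced abundance: ds/dt = beta*u - gamma*s."""
    return beta * u - gamma * s


# Illustrative (made-up) rates for an induction phase starting at (0, 0).
alpha, beta, gamma = 2.0, 1.0, 0.5
for tau in (0.5, 2.0, 8.0):
    u, s = dynamical_state(alpha, beta, gamma, tau)
    print(f"tau={tau:4.1f}  u={u:.3f}  s={s:.3f}  v={velocity(beta, gamma, u, s):.3f}")
```

In this sketch the velocity shrinks toward zero as the gene approaches its steady state (u → α/β, s → α/γ), which is why steady-state populations carry little directional signal.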
Key Claims
- Variational dynamical model. A VAE over the scVelo per-gene dynamical model; cell-gene-specific latent times are tied together through the low-dimensional cell representation; trained end-to-end with stochastic variational inference (SVI). Better data fit (MSE) and higher velocity consistency than the scVelo EM and steady-state models, and stable across preprocessing pipelines; ~5× faster than the EM model.
- Intrinsic + extrinsic uncertainty. Bayesian posterior over velocity at the cell-gene level → (i) intrinsic = spread of the velocity vector itself; (ii) extrinsic = disagreement among a cell’s neighbours’ future states. Flags directions that warrant caution (e.g. terminal alpha/beta cells with spurious UMAP “backflow”).
- Permutation score → applicability test. For each gene and cell type, shuffle the abundances and measure how much the model fit degrades; fit worsens for genes/datasets with real transient dynamics and stays roughly unchanged for steady-state ones. A systematic way to ask “is this dataset suitable for RNA velocity at all?”, separating positive controls (pancreas, dentate gyrus) from negatives (PBMC, prefrontal cortex, simulated).
- Velocity coherence score. Per-gene agreement between its velocity and the inferred future cell state — helps explain why a particular directionality manifests, and which genes drive it.
- Extensible: time-dependent transcription. Swapping constant α for a monotonic α(t) (rate increases/decreases over time) improves fit for many genes — relaxing the constant-transcription assumption within the same framework.
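The intrinsic-uncertainty claim above can be made concrete with a small sketch. It assumes one plausible summary of "spread of the velocity vector": draw several posterior samples of a cell's velocity vector, compare each to the sample mean by cosine similarity, and report the variance of those similarities. The sampler here is a hypothetical Gaussian stand-in for a trained model's posterior, and all names and numbers are illustrative rather than taken from the veloVI package.

```python
import math
import random
import statistics


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def intrinsic_uncertainty(samples):
    """Variance of cosine similarity between each sample and the sample mean.

    `samples` is a list of velocity vectors (one per posterior draw)
    for a single cell. Low values mean the draws agree on direction.
    """
    n_genes = len(samples[0])
    mean = [statistics.fmean([s[g] for s in samples]) for g in range(n_genes)]
    sims = [cosine(s, mean) for s in samples]
    return statistics.pvariance(sims)


def sample_velocities(center, noise_sd, n_samples, rng):
    """Hypothetical stand-in for posterior sampling: Gaussian noise around `center`."""
    return [[c + rng.gauss(0.0, noise_sd) for c in center] for _ in range(n_samples)]


rng = random.Random(0)
center = [1.0, -0.5, 0.25, 0.8]  # made-up velocity vector over four genes

confident = intrinsic_uncertainty(sample_velocities(center, 0.05, 200, rng))
uncertain = intrinsic_uncertainty(sample_velocities(center, 1.0, 200, rng))
print(f"tight posterior: {confident:.2e}   diffuse posterior: {uncertain:.2e}")
```

A cell whose posterior draws point in nearly the same direction scores close to zero, while a diffuse posterior scores higher; thresholding such a score is one way to flag cells whose plotted arrows should be read with caution.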
Physical-time grounding (standing lens)
- Latent time — ordinal or metric? Ordinal. Per-cell latent time relative to “a given maximum time of the process” (as in the EM model); validated against FUCCI cell-cycle score by correlation. No physical units.
- Scale degeneracy. Inherited. Velocities are relative to the process’s maximum time and — unlike steady-state — no longer relative to the splicing rate, but still without an absolute scale. The paper explicitly anticipates ingesting metabolic-labeling data to estimate absolute velocities in future iterations (i.e. it knows the anchor it lacks).
- External time anchor. None (snapshot); flags metabolic-labeling as the future route to absolute velocity.
- Constant-rate assumptions. α, β, γ gene-specific constants by default; the optional time-dependent α(t) extension relaxes constant transcription. β, γ stay constant; genes modeled independently (shared only via the latent representation).
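The scale degeneracy noted above can be checked numerically. The sketch below uses the standard closed-form solution of the per-gene splicing model (induction from zero, β ≠ γ) and shows that stretching time by a factor c while dividing all three rates by c leaves the expected (unspliced, spliced) abundances unchanged, so snapshot data alone cannot pin down an absolute time scale; only the velocity magnitude changes, by 1/c. The rates and the factor c are arbitrary illustrative values.

```python
import math


def state(alpha, beta, gamma, t):
    """Expected (unspliced, spliced) at time t, starting from (0, 0); beta != gamma."""
    eb = math.exp(-beta * t)
    eg = math.exp(-gamma * t)
    u = alpha / beta * (1.0 - eb)
    s = alpha / gamma * (1.0 - eg) + alpha / (gamma - beta) * (eg - eb)
    return u, s


alpha, beta, gamma = 2.0, 1.0, 0.5  # made-up rates in arbitrary time units
c = 24.0                            # hypothetical stretch factor, e.g. "days" to "hours"
times = [0.1, 0.5, 1.0, 2.0]

# Same process described on two different clocks.
base = [state(alpha, beta, gamma, t) for t in times]
stretched = [state(alpha / c, beta / c, gamma / c, c * t) for t in times]

for t, (u1, s1), (u2, s2) in zip(times, base, stretched):
    print(f"t={t:3.1f}  base=({u1:.4f}, {s1:.4f})  stretched=({u2:.4f}, {s2:.4f})")

# Velocities differ by exactly the factor c even though abundances match.
u, s = state(alpha, beta, gamma, 1.0)
v_base = beta * u - gamma * s
v_stretched = (beta / c) * u - (gamma / c) * s
print(f"velocity ratio base/stretched = {v_base / v_stretched:.1f}")
```

This is why an external anchor such as metabolic labeling is needed before velocities can carry physical units: any model fitted to snapshot abundances is equally consistent with every choice of c.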
veloVI’s lasting contribution to the wiki’s argument is epistemic, not temporal: it quantifies uncertainty and provides a test of applicability, directly answering the velocity-benchmark-ancheta-2026 / velocity-benchmark-17studies reliability worry from inside a method. For FlowVelo: borrow the uncertainty + permutation-applicability machinery, and note veloVI itself names metabolic labeling as the missing absolute-time anchor.
Key Quotes
“veloVI returns an empirical posterior distribution over velocities … provides a transcriptome-wide quantification of velocity uncertainty.” — Abstract.
“we anticipate including prior information from metabolic labeling data to estimate absolute velocities.” — Discussion.
Connections
- veloVI — the method entity (now paper-grounded).
- scVelo — the dynamical model veloVI reformulates variationally.
- RegVelo — extends veloVI by coupling a GRN; spVelo extends it to spatial/multi-batch.
- Cell2fate / VeloCycle — the Bayesian-uncertainty lineage (VeloCycle adds significance testing).
- FabianTheis — co-corresponding author; the scVelo/veloVI/RegVelo lineage PI.
- velocity-skepticism — veloVI’s uncertainty + permutation score is a constructive, internal answer.
- latent-time / physical-time-grounding — ordinal time relative to process max; no anchor.
- metabolic-labeling — named by the authors as the future absolute-velocity anchor.
- FlowVelo — borrow uncertainty-as-QC + applicability testing.
Contradictions
- None. Grounds the previously web/reference-only veloVI entity with the primary paper; confirms its ordinal-time, uncertainty-forward placement.