TopoVelo: topological velocity inference from spatial transcriptomic data

Gu, Liu, Lee, Li, Lu, Moline, Guan & Welch. Nature Biotechnology (2025). https://doi.org/10.1038/s41587-025-02688-8 (Welch lab, U Michigan). Extends VeloVAE.

Summary

TopoVelo jointly infers spatial and temporal dynamics of cell-fate transition from spatial transcriptomic data. It extends the RNA-velocity framework by making each cell’s transcription rate ρ a function of its own state and its spatial neighbors’ states, via a graph neural network (GNN) over a tissue graph; this is the spatial analog of RegVelo’s GRN coupling. Architecturally it is a graph VAE in the VeloVAE lineage. For this wiki, TopoVelo is the most relevant of the recently ingested methods for physical time: it reports cell (migration) velocity in physical units (μm/hour) that matches live-cell-imaging migration rates. The absolute time scale, however, rests on an assumed 20-hour induction/repression cycle, so its temporal grounding is metric-by-convention plus an external spatial check, not data-driven metric time.

Key Claims

  • Spatially coupled ODE. du_{i,g}/dt = ρ_{i,g}(z_i, {z_j : j∈nbr(i)}) − β_g·u_{i,g}; ds_{i,g}/dt = β_g·u_{i,g} − γ_g·s_{i,g}. Transcription ρ is cell-specific and neighbor-dependent (GNN); β, γ are gene-specific constants shared across cells.
  • Graph VAE. Autoencoding variational Bayes with a GNN encoder/decoder (graph attention, GAT, or graph convolution, GCN). The latent variables are cell state z, cell time t, and a gene on/off phase indicator (Gaussian mixture, Dirichlet prior). Time labels, if available, can serve as an informative prior on cell time.
  • Cell velocity in physical units. Distinct from RNA (expression) velocity, TopoVelo estimates the rate of change of spatial position — cell maturation/migration. Using real cell–cell distances (μm) and the standard assumption that a gene’s induction/repression cycle takes 20 hours, it reports a median cortical cell velocity of ~10 μm/hour, matching live-cell-imaging migration rates (9.8 ± 0.4 μm/h multipolar; 11.3 ± 0.4 μm/h bipolar neurons).
  • Spatial coupling improves inference. TopoVelo scores best on spatial-time consistency (Moran’s I), CBDir, k-CBDir, spatial-velocity consistency and time correlation across four datasets, compared with scVelo, VeloVAE, STT, DeepVelo, cellDancer and veloVI.
  • Cell influence score = graph-attention weights; identifies influential cells whose spatial positions match ligand–receptor / contact-dependent signaling hotspots (CytoSignal), e.g. radial glia, WNT in embryoid bodies.
  • Biology. Recovers VZ→cortical-plate migration direction; quantitatively annotates mouse neural-tube closure points via divergence of spatial velocity along the A–P axis; maps differentiation direction (inside-out) in human embryoid bodies.
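
The spatially coupled ODE in the first claim can be illustrated with a toy simulation. This is a minimal sketch, not TopoVelo's implementation: `neighbor_rho` is a hypothetical stand-in for the GNN (one mean-aggregation step plus a sigmoid), the weights `w_self`, `w_nbr`, `rho_max` and the three-cell line graph are invented for illustration, and the latent states z are held fixed in time so that ρ is constant per cell.

```python
import math

def neighbor_rho(z, neighbors, w_self=1.0, w_nbr=1.0, rho_max=2.0):
    """Hypothetical stand-in for the GNN: mean-aggregate neighbor states, then a sigmoid.

    Returns one transcription rate per cell that depends on the cell's own
    latent state z_i and on the mean latent state of its spatial neighbors.
    """
    rho = []
    for i, z_i in enumerate(z):
        nbrs = neighbors[i]
        nbr_mean = sum(z[j] for j in nbrs) / len(nbrs) if nbrs else 0.0
        logit = w_self * z_i + w_nbr * nbr_mean
        rho.append(rho_max / (1.0 + math.exp(-logit)))
    return rho

def simulate(rho, beta, gamma, t_end=50.0, dt=0.01):
    """Forward-Euler integration for one gene:
    du/dt = rho_i - beta*u,  ds/dt = beta*u - gamma*s.
    beta and gamma are shared across cells; rho is cell-specific.
    """
    u = [0.0] * len(rho)
    s = [0.0] * len(rho)
    for _ in range(int(round(t_end / dt))):
        for i, rho_i in enumerate(rho):
            du = rho_i - beta * u[i]
            ds = beta * u[i] - gamma * s[i]
            u[i] += dt * du
            s[i] += dt * ds
    return u, s

# Toy tissue graph (assumption): three cells on a line, 0 - 1 - 2.
z = [-1.0, 0.0, 1.0]
neighbors = {0: [1], 1: [0, 2], 2: [1]}
beta, gamma = 1.0, 0.5

rho = neighbor_rho(z, neighbors)
u, s = simulate(rho, beta, gamma)

# Changing only a neighbor's state changes cell 0's transcription rate.
rho_shifted = neighbor_rho([-1.0, 2.0, 1.0], neighbors)
```

With constant ρ the trajectories relax to u = ρ/β and s = ρ/γ, so cells with identical β and γ still end at different expression levels purely because of their own state and spatial context. In the real model ρ also varies with time through the latent state, which this sketch deliberately omits.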

Physical-time grounding (standing lens) — the key entry

  1. Latent time — ordinal or metric? Metric-by-assumption. The inferred latent cell time t is itself ordinal (validated by spatial-time consistency / expected temporal order — rank-type checks, like VeloVAE). It is converted to physical units only by assuming a fixed 20-hour gene induction/repression cycle. So the absolute timescale is imported as a convention, not measured — the classic scale-degeneracy fudge made explicit. What is genuinely novel: combined with real μm spatial coordinates, this yields a migration velocity that an external measurement (live-cell imaging) confirms — the first real-world physical-velocity validation in the wiki.
  2. Rate–time scale degeneracy. Not broken from data. The expression timescale is fixed by the borrowed 20-hour assumption; the spatial scale is real (μm), so the spatial velocity is anchored even though the gene-expression clock is assumed. Honest framing: TopoVelo fixes the scale by convention and cross-checks it against an external spatial measurement, rather than identifying it from snapshot data.
  3. External time anchor. Partial / indirect. Real spatial coordinates (μm) plus validation against live-cell-imaging migration speeds. Not metabolic-labeling, not a real-time series of the same cells. The 20-hour cycle is an assumption, not an anchor — but the migration-rate match is a genuine external consistency check.
  4. Constant-rate assumptions. ρ (transcription) is spatially coupled and cell-specific; β and γ remain gene-specific constants shared across cells (same as RegVelo). The novelty is on the transcription-rate axis (ρ here, α in standard RNA-velocity notation): the context is spatial neighbors instead of a GRN.
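
The rate–time scale degeneracy in items 1 and 2 can be written out using the ODE from Key Claims. This is a short derivation under the note's own notation; the scale factor k, the rescaled time t′ and the cell position x_i are symbols introduced here for illustration.

```latex
% Rescale time by an arbitrary k > 0: t' = t / k,
% and define u'(t') = u(k t'), s'(t') = s(k t').
\begin{aligned}
\frac{du'_{i,g}}{dt'} &= k\,\frac{du_{i,g}}{dt}
  = (k\rho_{i,g}) - (k\beta_g)\,u'_{i,g}, \\
\frac{ds'_{i,g}}{dt'} &= k\,\frac{ds_{i,g}}{dt}
  = (k\beta_g)\,u'_{i,g} - (k\gamma_g)\,s'_{i,g}, \\
\frac{dx_i}{dt'} &= k\,\frac{dx_i}{dt}.
\end{aligned}
```

The parameter sets (ρ, β, γ, t) and (kρ, kβ, kγ, t/k) therefore give identical (u, s) values for every cell, so a snapshot cannot tell them apart. The spatial velocity, by contrast, scales linearly with k: positions are measured in μm, but the value reported in μm/hour depends on which k is chosen. The 20-hour cycle assumption is what picks k, and the live-cell-imaging comparison is the external check on that choice.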

TopoVelo is the closest any ingested method comes to physical units, and the way it gets there is instructive for FlowVelo: real space (μm) is measured, absolute time is assumed (20 h), and the resulting velocity is validated against an independent migration measurement. It does not escape the snapshot scale degeneracy; it makes the assumption explicit and checks it externally. That is arguably the honest best case for “physically interpretable time” without metabolic labeling. It also adds a second velocity, spatial migration, that is orthogonal to expression velocity.
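
The arithmetic behind "measured space, assumed time" can be sketched in a few lines. This is an illustrative calculation, not the paper's procedure: the function name, the `cycle_latent` normalization (one full induction/repression cycle spanning one unit of latent time) and the example displacement are all assumptions made here. Only the 20-hour cycle and the imaging speeds (9.8 and 11.3 μm/h) come from the source.

```python
def speed_um_per_hour(displacement_um, delta_latent,
                      cycle_hours=20.0, cycle_latent=1.0):
    """Convert a displacement over a latent-time interval into um/hour.

    cycle_hours  : assumed duration of one induction/repression cycle (20 h in the source).
    cycle_latent : hypothetical latent-time span of one such cycle (normalization assumed here).
    """
    if delta_latent <= 0:
        raise ValueError("delta_latent must be positive")
    hours = delta_latent * cycle_hours / cycle_latent
    return displacement_um / hours

# Made-up example: a cell moves 50 um while latent time advances by 0.25.
v_20h = speed_um_per_hour(50.0, 0.25)                    # 20-hour cycle assumed
v_10h = speed_um_per_hour(50.0, 0.25, cycle_hours=10.0)  # alternative assumption

# Live-cell-imaging speeds quoted in the source (um/hour).
imaging_speeds = {"multipolar": 9.8, "bipolar": 11.3}
in_imaging_range = min(imaging_speeds.values()) <= v_20h <= max(imaging_speeds.values())
```

Halving the assumed cycle length doubles the reported speed while the displacement in μm stays the same. This is why the match with imaging is a consistency check on the assumed clock rather than a measurement of it.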

Key Quotes

“We can predict cell velocity — the rate of change of spatial position with respect to time — which describes the directions and rates of cell differentiation or migration.”

“We followed the standard assumption in previous RNA velocity papers that the induction/repression cycle of a gene takes 20 hours … TopoVelo predicts a median cell velocity of 10 μm per hour within the cortex … this estimate accords with previous measurements of neuron migration rates from live-cell imaging (9.8 ± 0.4 μm per hour for multipolar … 11.3 ± 0.4 μm per hour for bipolar neurons).”

Connections

  • TopoVelo — the method this source defines.
  • spatial-velocity — its core concept (spatially coupled transcription, migration velocity).
  • VeloVAE — direct predecessor (variational mixtures of ODEs); TopoVelo adds space.
  • RegVelo — non-spatial analog: both make transcription context-dependent (neighbors vs GRN).
  • grn-informed-velocity — same “α as a function of context” idea, spatial version.
  • splicing-kinetics-ode — backbone, here spatially coupled.
  • latent-time — ordinal time converted to physical units by the 20 h assumption.
  • physical-time-grounding — the most direct case: metric-by-assumption + external spatial check.
  • metabolic-labeling — the anchor TopoVelo does not use; its spatial route is the alternative.
  • FlowVelo — our work; TopoVelo’s explicit-assumption + external-validation strategy is a model to reckon with.

Contradictions

  • No direct contradiction, but a useful tension with the strict reading on physical-time-grounding that “snapshot data cannot fix absolute time.” TopoVelo does not refute that — it imports the timescale (20 h assumption) rather than identifying it — but it shows that real spatial units + an assumed clock + external migration validation can produce velocities that match physical measurements. This nuances (does not overturn) the standing claim, and is worth foregrounding when positioning FlowVelo.