RNA velocity of single cells (La Manno et al., Nature 2018)

La Manno, Soldatov, Zeisel, Braun, … Sten Linnarsson & Peter V. Kharchenko. Nature 560:494–498 (2018). The founding RNA-velocity paper; tool velocyto (R + Python). Preprocessing reworked with HuYizhou (see xing-hu-regvelo-debate).

Summary

Defines RNA velocity as the time derivative of the gene-expression state, estimated by distinguishing unspliced (nascent) from spliced (mature) mRNA in standard scRNA-seq. Under a simple first-order kinetic model ds/dt = βu − γs with β normalized to 1, cells in steady state lie on the line u = γs; deviations from that line (unspliced excess during induction, deficit during repression) predict each cell’s near-future state. Crucially for this wiki, the 2018 paper is the physically grounded baseline that both critics and authors point back to: velocity is a real time derivative on a timescale of hours, validated by metabolic / EdU labeling. That labeling is the anchor later snapshot-only methods (scVelo latent time) dropped (Xing’s “regression” narrative; see velocity-skepticism, physical-time-grounding).
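A minimal sketch of the induction/repression logic described above, assuming the paired equation du/dt = α − βu from the same model. This is not velocyto code: the rate values, step size, and the helper names simulate and residual are illustrative assumptions. It integrates the two equations with forward Euler and checks the sign of βu − γs (which equals ds/dt) along each trajectory.

```python
def simulate(alpha, gamma, u0, s0, t_end=3.0, dt=0.001, beta=1.0):
    """Forward-Euler integration of du/dt = alpha - beta*u, ds/dt = beta*u - gamma*s."""
    u, s = u0, s0
    traj = [(u, s)]
    for _ in range(int(round(t_end / dt))):
        du = alpha - beta * u
        ds = beta * u - gamma * s
        u, s = u + dt * du, s + dt * ds
        traj.append((u, s))
    return traj


def residual(u, s, gamma, beta=1.0):
    """Deviation from the steady-state line; equals ds/dt in this model."""
    return beta * u - gamma * s


# Illustrative (assumed) rates, in units where beta = 1.
gamma = 0.5
alpha_on = 2.0

# Induction: the gene switches on from an empty state.
induction = simulate(alpha_on, gamma, u0=0.0, s0=0.0)

# Repression: start at the "on" steady state (u* = alpha/beta, s* = u*/gamma),
# then switch transcription off.
repression = simulate(0.0, gamma, u0=alpha_on, s0=alpha_on / gamma)

induction_above_line = all(residual(u, s, gamma) > 0 for u, s in induction[1:])
repression_below_line = all(residual(u, s, gamma) < 0 for u, s in repression[1:])
```

In this toy setting every induction point sits above the line u = γs and every repression point sits below it, which is the sign rule the velocity estimate relies on.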

Key Claims

  • The model. ds/dt = βu − γs (β normalized to 1); γ combines degradation + splicing and gene-specific properties (intron/exon lengths, internal priming). Steady state: u = γs.
  • Phase portrait logic. Above the steady-state slope = induction (future ↑), below = repression (future ↓). The unspliced/spliced balance is the predictor.
  • The trick JianhuaXing stresses. du/dt = α − βu is not used in the ds/dt estimate; α is cell-state-dependent (not constant) — the unspliced excess/deficit is the regulation signal. (See RNA velocity origin section.)
  • Physical timescale (hours). “predicts the future state … on a timescale of hours.” EdU pulse-labeling of chromaffin progenitors → extrapolation ≈ 2.5–3.8 h; metabolic labeling shows the spliced/unspliced ratio shifts detectably in 10–100 min. → an external real-time anchor, not snapshot-only.
  • Steady-state estimator. γ is fit by regression on the extreme expression quantiles (robust when most cells are off steady state); an alternative gene-structure-based fit handles far-from-equilibrium genes.
  • Downstream. Velocity embedded in PCA / t-SNE; a Markov random-walk on the field recovers terminal/root states without prior knowledge.
  • Validated. Chromaffin / Schwann-cell-precursor differentiation (lineage tracing PLP1-CreERT2), developing hippocampus (branching lineages), human embryonic forebrain glutamatergic neurogenesis, oligodendrocytes, intestinal epithelium, circadian liver.
  • Stated failure modes. Genes far from equilibrium, uneven non-coding contribution, and alternative splicing → multiple γ — foreshadowing the multiple-rate-kinetics problem later targeted by GraphVelo.
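The steady-state estimator and the phase-portrait rule from the claims above can be sketched for a single gene as follows. This is an illustrative stand-in, not velocyto's implementation: ranking cells by spliced abundance, the 5% tail size, the through-origin least-squares fit, the constant-velocity extrapolation, the toy data, and the names fit_gamma, velocity, and extrapolate are all assumptions.

```python
def fit_gamma(s, u, q=0.05):
    """Slope of u on s through the origin, using only extreme-quantile cells."""
    n = len(s)
    k = max(1, int(n * q))                        # cells taken from each tail
    order = sorted(range(n), key=lambda i: s[i])  # rank cells by spliced level
    extremes = order[:k] + order[-k:]             # lowest and highest tails
    num = sum(u[i] * s[i] for i in extremes)
    den = sum(s[i] ** 2 for i in extremes)
    if den == 0:
        raise ValueError("no spliced signal in the extreme quantiles")
    return num / den


def velocity(s, u, gamma):
    """Per-cell ds/dt = u - gamma*s, in units where beta = 1."""
    return [ui - gamma * si for si, ui in zip(s, u)]


def extrapolate(s, v, dt):
    """Constant-velocity extrapolation of spliced abundance, clipped at zero."""
    return [max(0.0, si + vi * dt) for si, vi in zip(s, v)]


# Toy gene (assumed data): cells in the two tails sit on the line u = 0.5*s,
# while all intermediate cells carry an unspliced excess (induction).
s = [float(i) for i in range(101)]
u = [0.5 * si if (si <= 4 or si >= 96) else 0.5 * si + 5.0 for si in s]

gamma = fit_gamma(s, u)
v = velocity(s, u, gamma)
s_next = extrapolate(s, v, dt=1.0)
```

Restricting the fit to the tails is what keeps the slope unbiased here: a regression over all cells would be pulled upward by the intermediate cells that are off steady state.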

Physical-time grounding (standing lens) — the reference case

  1. Latent time — ordinal or metric? No global latent time is inferred; velocity is a local physical rate (ds/dt), and the extrapolation step is calibrated in hours via labeling. So the origin is metric locally — the most physically grounded entry in the wiki after dynamo. Some analyses still use a differentiation pseudotime (principal curve) for ordering, but the velocity itself carries a real timescale.
  2. Rate–time scale degeneracy. The steady-state estimator fits the ratio γ (β≡1) — so pure snapshot estimation inherits the degeneracy — but EdU/metabolic labeling here pins the absolute extrapolation timescale, partially breaking it for validation.
  3. External time anchor. Yes — EdU pulse-labeling and metabolic labeling calibrate the hours-scale. This is the anchor the field later abandoned.
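  4. Constant-rate assumptions. α is assumed constant only to define the steady state used to fit γ (the per-cell estimate itself lets α vary with cell state, as noted under Key Claims); β≡1; γ is a gene-specific constant. The paper is honest about where this fails (multiple γ via alternative splicing).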
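The rate–time scale degeneracy in point 2 can be made concrete with a small numerical check. The sketch below uses the closed-form solution of du/dt = α − βu, ds/dt = βu − γs (valid for β ≠ γ) and shows that multiplying all three rates by a factor k while dividing time by k gives the same (u, s) state. The parameter values and the function name state are illustrative assumptions, not taken from the paper.

```python
import math


def state(t, alpha, beta, gamma, u0=0.0, s0=0.0):
    """Closed-form (u, s) at time t for constant rates; requires beta != gamma."""
    eb = math.exp(-beta * t)
    eg = math.exp(-gamma * t)
    u = u0 * eb + (alpha / beta) * (1.0 - eb)
    s = (s0 * eg
         + (alpha / gamma) * (1.0 - eg)
         + (alpha - beta * u0) / (gamma - beta) * (eg - eb))
    return u, s


# Illustrative (assumed) rates and an arbitrary speed-up factor k.
alpha, beta, gamma = 2.0, 1.0, 0.5
k = 4.0

pairs = []
for t in (0.5, 1.0, 2.0, 5.0):
    slow = state(t, alpha, beta, gamma)                 # original clock
    fast = state(t / k, k * alpha, k * beta, k * gamma)  # k-fold faster kinetics
    pairs.append((slow, fast))

# Snapshot data only see (u, s), so the two parameter sets are indistinguishable.
indistinguishable = all(
    math.isclose(a[0], b[0], rel_tol=1e-9) and math.isclose(a[1], b[1], rel_tol=1e-9)
    for a, b in pairs
)
```

Both parameter sets share the same ratio γ/β and trace the same phase portrait; only a measurement with a known duration, such as the labeling experiments in point 3, can tell k = 1 from k = 4.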

This reframes the wiki’s narrative: the temporal grounding problem is partly a regression. The 2018 origin tied velocity to physical time (hours, via labeling); scVelo then made it snapshot-only and relabeled the axis as latent-time. dynamo later re-introduced labeling. FlowVelo’s task is to recover physical-time meaning that the origin already had — ideally without requiring labels (see regvelo-physical-time-critique, FlowVelo).

Figure

velocyto Fig 1 — model and phase portraits

Fig 1 — the model (La Manno et al., Nature 2018; velocyto-2018). (a) Unspliced vs spliced reads separated by intronic content across protocols. (b) The kinetic model (transcription α, splicing β=1, degradation γ; ds/dt = u − γs). (c) Induction/repression dynamics; (d) the phase portrait with the steady-state slope γ — induction above, repression below. (e–h) Validation on the circadian liver time course: unspliced mRNA at each timepoint predicts spliced mRNA at the next, and phase portraits of Fgf1/Cbs trace the 24 h clock — a real-time demonstration that the velocity points the right way in physical time.

Key Quotes

“RNA velocity—the time derivative of the gene expression state—can be directly estimated by distinguishing between unspliced and spliced mRNAs … predicts the future state of individual cells on a timescale of hours.” — Abstract.

Connections

Contradictions

  • Reframes (does not contradict) the wiki. Earlier pages implied dynamo is the lone metric-time case; in fact the 2018 origin already anchored velocity to hours via labeling. The correct framing: physical-time grounding was present at the origin, lost in the snapshot-only latent-time successors, and partially recovered by labeling methods. Updated on physical-time-grounding and metabolic-labeling.