Fundamental errors in RNA velocity from omitting cell growth

Shah, Ming & Cleary (V. Shah & H. Ming equal contribution; corresp. Brian Cleary). bioRxiv 2025.12.18.695252 (2025-12-19). Boston University (Bioinformatics, BME, Computing & Data Science, Biology) + Broad Institute. Full text now in raw/.

Summary

A biophysics critique striking at the velocity equation itself: the standard setup v_i = α_i − γ_i·y_i omits cell growth. In a growing, dividing population, biomass (RNA included) roughly doubles over the cell cycle, so maintaining a steady concentration requires net positive production. The result is a positive homeostatic velocity v = λx, proportional to abundance with slope equal to the growth rate λ, which contradicts the “velocity = 0 at steady state” interpretation.

The paper’s central quantitative result is that when growth is not modeled, the estimated degradation rate absorbs the growth rate: γ* ≈ γ + λ. Although this pushes estimated velocities toward zero, downstream trajectory analysis is sensitive enough to the residual artifacts to produce coherent but wildly misleading directions (estimated velocity fields point orthogonal to the true cell-cycle axis).

Crucially, the analysis focuses on metabolic-labeling (4sU pulse-label / pulse-chase) methods, “the more quantitatively viable alternative”, so it indicts the wiki’s supposed metric-time anchor (dynamo / metabolic labeling) directly, not just snapshot splicing methods. For the wiki this is the sharpest strand of velocity-skepticism: not “the timescale is unidentifiable” but “a first-order physical term is missing from the equation, and it biases direction.”
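One simplified way to see where both relations come from, offered as an illustration and not necessarily the paper’s own derivation: assume a cell of size V that grows exponentially at rate λ, and a transcript of abundance x held at constant concentration c = x/V. The symbols V and c, and the assumption of exponential growth, are introduced here only for this sketch.

```latex
% Illustrative assumptions: exponential size growth, constant concentration.
\begin{align*}
\frac{dV}{dt} &= \lambda V, \qquad c = \frac{x}{V} = \text{const} \\
% Abundance must grow with size to keep c fixed: the homeostatic velocity.
v = \frac{dx}{dt} &= c\,\frac{dV}{dt} = \lambda x \\
% Equate with the standard velocity form v = \alpha - \gamma x.
\alpha - \gamma x &= \lambda x \;\Longrightarrow\; \alpha = (\gamma + \lambda)\,x \\
% A growth-free model instead imposes v = 0 at steady state.
\alpha &= \gamma^{*} x \;\Longrightarrow\; \gamma^{*} = \gamma + \lambda
\end{align*}
```

Under these idealized assumptions the absorption is exact. The paper reports only γ* ≈ γ + λ, because dilution in its sampling scheme is not exactly exponential.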

Key Claims

  • Growth implies a positive homeostatic velocity, v = λx. A growing cell dilutes its molecules; to hold concentration steady, production must exceed degradation, so the homeostatic velocity is positive and proportional to abundance, with slope = growth rate λ. Fast-growing cells (λ=2, doubling t_d=0.5) have higher homeostatic velocity than slow (λ=1) at the same abundance (Fig 1C). Up/down-regulation manifests only as deviations from this line — not from zero.
  • Growth is absorbed into the degradation estimate: γ* ≈ γ + λ. Across genes, growth rates and experimental designs, the estimated degradation rate is almost exactly γ + λ (Fig 3A,B). The match is not exact because the dilution effect (fixed cells sampled from a growing population, giving an apparent “decay” of labeled counts) decays like 1/(1+t) rather than like the exponential e^{−t} of true degradation. The two agree to first order in t, so this mismodeling error is small relative to the others.
  • Production α: accurate but reinterpreted. In the noiseless setting α is estimated accurately, but in the RNA-polymerase-limited regime α increases exponentially across the cell cycle, so even a perfectly homeostatic (non-regulated) gene shows ~two-fold variation in production — complicating interpretation.
  • Velocities near zero, trajectories catastrophically wrong. Absorbing γ→γ* pushes velocity estimates near zero (so strong positive/negative velocities can still be read as real regulation), but projecting the estimated vectors and running standard trajectory tools yields coherent diverging trajectories orthogonal to the true (cell-cycle) direction (Fig 3E) — the ground-truth field moves right-to-left along PC1, the estimated field diverges sideways. Small residual bias → qualitatively false biology.
  • 4sU growth defects split pulse-label vs pulse-chase. The nucleotide analog 4sU slows or halts growth during the pulse. The pulse-label γ* therefore incorporates the slowed growth (e.g. 80 h doubling), while the pulse-chase γ* incorporates normal growth (20 h doubling). Applying a 20 h-doubling γ* to 80 h-doubling pulse cells systematically underestimates velocity and flips trajectory direction (Fig 4).
  • Mixed growth-rate populations. In a mixed population (e.g. 12 h + 20 h doubling), a single population-average γ* per gene absorbs an average growth rate and so misfits both subpopulations.
  • Way forward. Explicitly modeling growth (the v = λx baseline + a growth-aware estimator) both removes the artifacts and turns growth into new biological signal.
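The small gap between γ* and γ + λ noted in the second claim can be sketched numerically. The snippet below is a minimal illustration, not the paper’s estimator. It assumes a noiseless chase in which the labeled signal is true degradation e^{−γt} multiplied by a dilution factor 1/(1 + λt), then fits a single exponential by log-linear least squares. The function names, the exact dilution form with λ scaling time, the chase window, and the values γ = 5 and λ = 1 are all assumptions chosen for illustration.

```python
import math


def labeled_signal(t, gamma, lam):
    """Assumed chase signal: exponential degradation times 1/(1 + lam*t) dilution."""
    return math.exp(-gamma * t) / (1.0 + lam * t)


def fit_exponential_rate(times, signal):
    """Least-squares slope of log(signal) vs time, returned as a positive decay rate."""
    logs = [math.log(s) for s in signal]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    cov = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    var = sum((t - t_mean) ** 2 for t in times)
    return -cov / var


# Hypothetical values, in units where one cell cycle has length 1.
GAMMA, LAM = 5.0, 1.0
times = [0.02 * i for i in range(26)]  # chase from t = 0 to t = 0.5
signal = [labeled_signal(t, GAMMA, LAM) for t in times]
gamma_star = fit_exponential_rate(times, signal)

print(f"true gamma     = {GAMMA:.3f}")
print(f"gamma + lambda = {GAMMA + LAM:.3f}")
print(f"fitted gamma*  = {gamma_star:.3f}")
```

Under these assumptions the fitted rate falls between γ and γ + λ and sits nearer the latter, because 1/(1 + λt) and e^{−λt} agree to first order in t. A longer chase window or a larger λ relative to γ would widen the gap.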

Physical-time grounding (standing lens)

A critique aimed below the four axes — at the equation and its gold-standard anchor:

  1. Latent time. Velocity sign/rate is biased by omitted growth, so any latent time or trajectory built on it inherits the bias — and the paper’s headline is precisely a trajectory-direction failure (orthogonal/reversed), the strongest form of this critique.
  2. Scale degeneracy. Reframes it: the fitted degradation is γ* ≈ γ + λ, so the rate is contaminated by growth, a different confound from snapshot timescale non-identifiability. Even with labeling pinning a Δt, the recovered γ is off by roughly λ.
  3. External anchor — the key blow. The analysis targets metabolic labeling (4sU pulse-chase/pulse-label), i.e. the very anchor the wiki credits dynamo with for metric time. The lesson: labeling fixes the timescale but not the growth bias — a measured growth rate λ (or proliferation marker) is a second, independent anchor needed for unbiased rates.
  4. Constant-rate assumptions. Growth λ — nonzero, cell-state-dependent in development — is silently set to zero everywhere; and α is non-constant (exponential over the cell cycle) in the RNAP-limited regime even without regulation.
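The two-anchor point in item 3 can be made concrete with a small sketch. It treats the approximate relation γ* ≈ γ + λ as exact, so that a measured growth rate can simply be subtracted from a labeling-based estimate. The helper names and all numeric values are hypothetical, and this is not the paper’s growth-aware estimator. The last case also shows why λ must be the growth rate of the cells being scored: a γ* fitted under faster growth, applied to cells growing four times more slowly, turns a positive homeostatic velocity negative.

```python
def corrected_degradation(gamma_star, lam):
    """Remove the growth rate absorbed into a labeling-based degradation estimate."""
    return gamma_star - lam


def velocity(alpha, gamma, x):
    """Standard velocity form v = alpha - gamma * x."""
    return alpha - gamma * x


# Hypothetical homeostatic gene: production balances degradation plus growth.
gamma_true, lam, x = 0.30, 0.05, 100.0
alpha = (gamma_true + lam) * x

gamma_star = gamma_true + lam  # what a growth-blind fit is assumed to return
v_naive = velocity(alpha, gamma_star, x)
v_aware = velocity(alpha, corrected_degradation(gamma_star, lam), x)

# Mismatch case: gamma* fitted under faster growth, applied to slower-growing cells.
lam_slow = lam / 4.0  # assumed four-fold slower growth
alpha_slow = (gamma_true + lam_slow) * x
v_mismatch = velocity(alpha_slow, gamma_star, x)

print(f"naive velocity    = {v_naive:+.2f}")
print(f"growth-aware      = {v_aware:+.2f}  (lambda * x = {lam * x:.2f})")
print(f"mismatched gamma* = {v_mismatch:+.2f}  (true = {lam_slow * x:+.2f})")
```

The growth-blind estimate reports zero velocity for a gene whose abundance is in fact rising at λx, and the mismatched estimate has the wrong sign, which is the kind of direction error the labeling anchor alone cannot rule out.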

The sharpest skeptic critique for FlowVelo, and a correction to the wiki’s own narrative: metabolic labeling was treated as the route to metric time (dynamo), but this paper shows labeling-based rates are growth-biased (γ* ≈ γ + λ) and can yield trajectories pointing the wrong way. So “metric time” needs two anchors: a labeling Δt and a growth rate λ. Any physically grounded velocity (FlowVelo included) should state its treatment of growth/dilution. Pair with GennadyGorin’s normalization critique (counts) and JianhuaXing’s reaction-coordinate critique (time) for three orthogonal “missing rigor” axes: counts, growth, time.

Key Quotes

“In a growing population, biomass (including RNA and other macromolecules) is constantly accumulating. This implies a homeostatic velocity … that is positive, which is at odds with the conventional estimation, interpretation, and uses of velocity.” — Abstract.

“downstream trajectory analysis is sufficiently sensitive to remaining artifacts to potentially produce wildly misleading conclusions.” — Results (velocity & trajectories).

Connections

Contradictions

  • Tension with the “scale-only” framing — now sharpened. Other pages frame snapshot velocity as getting direction right but not scale. This paper shows direction itself can be wrong (the orthogonal/reversed trajectories) once growth is omitted — even in labeling methods. Noted on splicing-kinetics-ode, physical-time-grounding, and metabolic-labeling.
  • Reframes the wiki’s metabolic-labeling = metric-time story. Upgraded from an abstract-level reading now that the full text is available: the critique is aimed at labeling methods specifically, so dynamo’s “grounded with labeling” verdict needs the caveat “…up to a growth bias of λ.”