OmniWM · Specs & docs

Omni World Model

Action-conditioned video world models: the NanoWM diffusion backbone, the Omni joint video+action expert, and the optional Wan2.2-TI2V-5B backbone. These pages are self-contained specs, reports, and design docs. Internal — they embed infra paths and unpublished numbers; don't share the URL publicly.

NanoWM S/B/L · Omni action expert · Wan2.2-TI2V-5B · DINO-WM · RT-1 · CSGO · LeRobot
3 documents · generated

Evaluation spec — OmniWM

report

Evaluating a trained world model: evaluate_only workflow, sequential vs full_sequence sampling, PSNR/SSIM/LPIPS/FID(+FVD) definitions, and headline numbers on DINO-WM and RT-1 checkpoints.

PushT PSNR 33.19 · Wall FID 2.64 · RT-1 FID 35.08
evaluation · metrics · fid · fvd · sampling
updated 2026-07-02 · tohkawa25 · stable

Omni-Wan-5B backbone spec — OmniWM

design spec

Design spec for adding an optional Wan2.2-TI2V-5B video backbone to OmniWM: OmniWan model class, f0 token surgery, Wan VAE latent pipeline, flow-matching joint loss, prefill-cache policy sampling, and LoRA compute plan. All additive; NanoWM/Omni paths untouched.

backbone 5B · trainable LoRA r32 + action expert · latents 48ch 16x/4x
wan2.2 · omni · world-model · flow-matching · lora · design
updated 2026-07-02 · tohkawa25 · draft

Training spec — OmniWM

guide

Training a Nano World Model: workflow, Hydra config system, and the three ablated design axes (prediction target, action injection, model scale) with result tables and shipped checkpoints.

RT-1 FID (B/2) 42.27 · RT-1 FID (L/2) 36.31 · default pred-v + additive
training · hydra · ablation · nanowm · diffusion
updated 2026-07-02 · tohkawa25 · stable