simchowitzlabpublic/nano-world-model

Python

A Minimalist, Batteries-included Repository for Advancing World Model Science.

From the README

๐ŸŒ Nano World Model

A minimalist repository for training video world models based on diffusion-forcing.

Key Features

  • 🚀 Instant Start: minimal dependencies and easy data loading. From clone to first rollout in minutes.
  • 🛠️ Unified Pipeline: training, validation, and evaluation, all managed through clean Hydra-based configuration.
  • 🔬 Scientific Transparency: clean codebase with head-to-head ablations across prediction target, action injection, and model scale. Fully open-source, including model checkpoints.
  • 🤖 Diverse Applications: long-horizon rollouts, rollouts to 3D point clouds, and planning (MPC) out of the box.

🚀 Quick Start

git clone 
cd nano-world-model
conda env create -f environment.yml && conda activate nanowm

Set data and results paths (or use the gitignored src/configs/local/paths.yaml template; see docs/config_system.md):

export DATASET_DIR=/path/to/dino_wm_data       # DINO-WM envs (point_maze, pusht, ...)
export CSGO_DATA_DIR=/path/to/csgo             # CSGO HDF5 files
export RT1_DATA_ROOT=/path/to/rt1_fractal      # RT-1 LeRobot mirror (optional)
export RESULTS_DIR=/path/to/results            # checkpoints + logs land here
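
Before launching a run, it can help to confirm that these variables are set. The following is a minimal sketch and not part of the repository: the helper name `missing_paths` is hypothetical, and treating every variable except `RT1_DATA_ROOT` as required is an assumption based only on the "(optional)" comment above.

```python
import os

# Variable names are copied from the exports above. Which ones a given
# experiment really needs is an assumption; only RT1_DATA_ROOT is marked
# optional in the README.
REQUIRED = ("DATASET_DIR", "CSGO_DATA_DIR", "RESULTS_DIR")
OPTIONAL = ("RT1_DATA_ROOT",)


def missing_paths(env=None, names=REQUIRED):
    """Return the names in `names` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Unset variables:", ", ".join(missing))
    else:
        print("All required paths are set.")
```

If you only train on one domain, you would likely pass a shorter `names` tuple rather than the full default.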

Download the i3d torchscript used by FID/FVD evaluation:

mkdir -p pretrained_models/i3d && curl -L \
    " \
    -o pretrained_models/i3d/i3d_torchscript.pt

For dataset downloads (DINO-WM, RT-1, CSGO), see docs/datasets/README.md.

🥷 Train your first model

DINO-WM PushT, NanoWM-B/2, default settings (pred-v · additive injection · cosine + ZTSNR):

python src/main.py experiment=dino_wm_pusht dataset=dino_wm/pusht model=nanowm_b2
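
For readers new to the terms in the default settings, here is a rough illustration of diffusion forcing with a v-prediction target. It is a sketch under stated assumptions, not the repository's code: it assumes the common continuous cosine schedule (alpha = cos(πt/2), sigma = sin(πt/2)), represents frames as flat lists of floats, and uses hypothetical function names. With this form of the schedule, alpha reaches zero at t = 1, which is the property a zero-terminal-SNR (ZTSNR) schedule requires; the repository may implement the schedule differently.

```python
import math
import random


def cosine_alpha_sigma(t):
    """Assumed continuous cosine schedule for a noise level t in [0, 1]."""
    return math.cos(0.5 * math.pi * t), math.sin(0.5 * math.pi * t)


def noise_frames(frames, rng):
    """Noise every frame at its own level; return (noisy, v_targets, levels).

    `frames` is a list of frames, each a flat list of floats.
    """
    noisy, v_targets, levels = [], [], []
    for frame in frames:
        # Diffusion forcing: each frame draws an independent noise level
        # instead of sharing one level across the whole clip.
        t = rng.random()
        alpha, sigma = cosine_alpha_sigma(t)
        eps = [rng.gauss(0.0, 1.0) for _ in frame]
        noisy.append([alpha * x + sigma * e for x, e in zip(frame, eps)])
        # v-prediction target: v = alpha * eps - sigma * x
        v_targets.append([alpha * e - sigma * x for x, e in zip(frame, eps)])
        levels.append(t)
    return noisy, v_targets, levels


if __name__ == "__main__":
    clip = [[0.5, -1.0, 2.0], [1.0, 0.0, -0.5], [0.25, 0.75, -2.0]]
    _, _, levels = noise_frames(clip, random.Random(0))
    print("per-frame noise levels:", [round(t, 3) for t in levels])
```

Under this parameterization the clean frame can be recovered as x = alpha * x_t - sigma * v, which is why a model trained to predict v can be used to denoise.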

CSGO with the L/2 model:

python src/main.py experiment=csgo dataset=game/csgo model=nanowm_l2_csgo

RT-1 (fractal) main run:

python src/main.py experiment=rt1 dataset=rt1/rt1 model=nanowm_b2

See docs/training.md for the full training guide, design choices, and ablation tables.
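
The three commands above share one shape: `src/main.py` followed by `experiment=`, `dataset=`, and `model=` overrides. A minimal sketch of a launcher that assembles them is shown here; the `RUNS` table and `build_command` helper are hypothetical conveniences rather than part of the repository, and the values are copied from the commands above.

```python
import shlex

# (experiment, dataset, model) triples taken verbatim from the README commands.
RUNS = {
    "pusht": ("dino_wm_pusht", "dino_wm/pusht", "nanowm_b2"),
    "csgo": ("csgo", "game/csgo", "nanowm_l2_csgo"),
    "rt1": ("rt1", "rt1/rt1", "nanowm_b2"),
}


def build_command(run, extra_overrides=(), python="python"):
    """Return the argv list for one of the named runs.

    `extra_overrides` holds additional key=value strings appended unchanged.
    """
    experiment, dataset, model = RUNS[run]
    return [
        python,
        "src/main.py",
        f"experiment={experiment}",
        f"dataset={dataset}",
        f"model={model}",
        *extra_overrides,
    ]


# Print each command so it can be inspected or pasted into a shell.
for name in RUNS:
    print(shlex.join(build_command(name)))
```

Returning a list instead of a single string means the result can be passed directly to `subprocess.run` without shell quoting concerns.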

📦 Pretrained Checkpoints

Best-config runs (pred-v · additive · cosine + ZTSNR · NanoWM-B/2 unless noted):

| Domain | Checkpoint | Steps |
|:-------|:-----------|:------|
| DINO-WM Point Maze | 🤗 nanowm-b2-dino-wm-point-maze-30k | 30k |
| DINO-WM Wall | 🤗 nanowm-b2-dino-wm-wall-15k | 15k |
| DINO-WM Rope | 🤗 nanowm-b2-dino-wm-rope-15k | 15k |
| DINO-WM Granular | 🤗 nanowm-b2-dino-wm-granular-15k | 15k |
| DINO-WM PushT | 🤗 nanowm-b2-dino-wm-pusht-100k | 100k |
| RT-1 (fractal) | 🤗 nanowm-b2-rt1-300k | 300k |
| CSGO | 🤗 nanowm-l2-csgo-100k (NanoWM-L/2) | 100k |
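
The checkpoint names in the table appear to follow the pattern `nanowm-<size>-<domain>-<steps>k`. A small sketch that splits a name into those parts is given here, assuming the pattern holds for every row. `parse_checkpoint_name` is a hypothetical helper, and the Hugging Face organization prefix is not shown in this section, so it is left out.

```python
import re

# Assumed naming pattern, inferred from the table rows only.
PATTERN = re.compile(
    r"^nanowm-(?P<letter>[a-z])(?P<patch>\d+)-(?P<domain>.+)-(?P<steps>\d+)k$"
)


def parse_checkpoint_name(name):
    """Split a checkpoint name into model, domain, and training steps."""
    match = PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized checkpoint name: {name!r}")
    return {
        # "b2" is read as NanoWM-B/2, matching how the table labels models.
        "model": f"NanoWM-{match['letter'].upper()}/{match['patch']}",
        "domain": match["domain"],
        "steps": int(match["steps"]) * 1000,
    }


for checkpoint in ("nanowm-b2-dino-wm-pusht-100k", "nanowm-l2-csgo-100k"):
    print(checkpoint, "->", parse_checkpoint_name(checkpoint))
```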

We also ship 11 RT-1 ablation arms (one HF checkpoint per axis × method). See [docs/training.md#design-choices](docs/training.md#design-