simchowitzlabpublic/nano-world-model

A Minimalist, Batteries-included Repository for Advancing World Model Science.

🌍 Nano World Model

A minimalist repository for training video world models based on diffusion forcing.

Key Features

🚀 Instant Start: Minimal dependencies and easy data loading. From clone to first rollout in minutes.
🛠️ Unified Pipeline: Training, validation, and evaluation, all managed through a clean Hydra-based configuration system.
🔬 Scientific Transparency: Clean codebase with head-to-head ablations across prediction target, action injection, and model scale. Fully open source, including model checkpoints.
🤖 Diverse Applications: Long-horizon rollouts, rollouts to 3D point clouds, and planning (MPC) out of the box.

🚀 Quick Start

git clone 
cd nano-world-model
conda env create -f environment.yml && conda activate nanowm

Set data + results paths (or use the gitignored src/configs/local/paths.yaml template — see docs/config_system.md):

export DATASET_DIR=/path/to/dino_wm_data       # DINO-WM envs (point_maze, pusht, ...)
export CSGO_DATA_DIR=/path/to/csgo             # CSGO HDF5 files
export RT1_DATA_ROOT=/path/to/rt1_fractal      # RT-1 LeRobot mirror (optional)
export RESULTS_DIR=/path/to/results            # checkpoints + logs land here
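
Before launching a run, it can help to confirm that the path variables above are actually set. The following is a minimal sketch, not part of the repository: the helper name `missing_paths` is hypothetical, and treating only `RT1_DATA_ROOT` as optional is an assumption drawn from the comment on its export line (a given experiment may need fewer of these variables).

```python
import os

# Variable names are taken from the export lines above. Treating these three
# as required (and RT1_DATA_ROOT as optional) is an assumption.
REQUIRED_VARS = ("DATASET_DIR", "CSGO_DATA_DIR", "RESULTS_DIR")


def missing_paths(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Unset path variables:", ", ".join(missing))
    else:
        print("All required path variables are set.")
```

Passing a plain dict instead of `os.environ` makes the check easy to exercise without touching the real environment.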

Download the I3D TorchScript model used for FID/FVD evaluation:

mkdir -p pretrained_models/i3d && curl -L \
    " \
    -o pretrained_models/i3d/i3d_torchscript.pt

For dataset downloads (DINO-WM, RT-1, CSGO), see docs/datasets/README.md.

🥷 Train your first model

DINO-WM PushT, NanoWM-B/2, default settings (pred-v · additive injection · cosine + ZTSNR):

python src/main.py experiment=dino_wm_pusht dataset=dino_wm/pusht model=nanowm_b2

CSGO with the L/2 model:

python src/main.py experiment=csgo dataset=game/csgo model=nanowm_l2_csgo

RT-1 (fractal) main run:

python src/main.py experiment=rt1 dataset=rt1/rt1 model=nanowm_b2
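
The three commands above share one shape: `src/main.py` followed by `key=value` Hydra overrides. As a rough sketch, they could be assembled programmatically like this. The helper `build_command` and the `RUNS` table are hypothetical names introduced for illustration; only the override values come from the commands above.

```python
import shlex

# (experiment, dataset, model) triples copied from the commands above.
RUNS = {
    "pusht": ("dino_wm_pusht", "dino_wm/pusht", "nanowm_b2"),
    "csgo": ("csgo", "game/csgo", "nanowm_l2_csgo"),
    "rt1": ("rt1", "rt1/rt1", "nanowm_b2"),
}


def build_command(experiment, dataset, model):
    """Build the argv list for one training run using Hydra-style overrides."""
    return [
        "python",
        "src/main.py",
        f"experiment={experiment}",
        f"dataset={dataset}",
        f"model={model}",
    ]


if __name__ == "__main__":
    # Print each command instead of running it; pass the list to
    # subprocess.run(cmd, check=True) to actually launch a run.
    for name, overrides in RUNS.items():
        print(name, "->", shlex.join(build_command(*overrides)))
```

Keeping the arguments as a list rather than a single string avoids shell quoting issues if a path-like override ever contains spaces.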

See docs/training.md for the full training guide, design choices, and ablation tables.
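
For readers unfamiliar with the default settings named above, here is a small illustration of how v-prediction and a cosine schedule with zero terminal SNR (ZTSNR) fit together. It assumes that "pred-v" refers to the usual v-prediction target and uses the textbook continuous cosine schedule; it is not taken from this repository's code, and all function names are hypothetical.

```python
import math


def cosine_alpha_sigma(t):
    """Continuous cosine schedule on t in [0, 1].

    alpha(1) = 0, so the signal-to-noise ratio at the final step is zero.
    """
    return math.cos(0.5 * math.pi * t), math.sin(0.5 * math.pi * t)


def add_noise(x0, eps, t):
    """Forward process: x_t = alpha * x0 + sigma * eps."""
    alpha, sigma = cosine_alpha_sigma(t)
    return [alpha * a + sigma * e for a, e in zip(x0, eps)]


def v_target(x0, eps, t):
    """v-prediction regression target: v = alpha * eps - sigma * x0."""
    alpha, sigma = cosine_alpha_sigma(t)
    return [alpha * e - sigma * a for a, e in zip(x0, eps)]


def x0_from_v(x_t, v, t):
    """Recover the clean sample: x0 = alpha * x_t - sigma * v."""
    alpha, sigma = cosine_alpha_sigma(t)
    return [alpha * xt - sigma * vi for xt, vi in zip(x_t, v)]


if __name__ == "__main__":
    x0, eps, t = [0.5, -1.0], [0.1, 0.3], 0.3
    x_t = add_noise(x0, eps, t)
    v = v_target(x0, eps, t)
    print("recovered x0:", x0_from_v(x_t, v, t))
```

At t = 1 the noisy input is pure noise, yet the v target equals the negated clean sample, so it remains a well-defined regression target. This is the usual reason v-prediction is paired with a zero-terminal-SNR schedule.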

📦 Pretrained Checkpoints

Best-config runs (pred-v · additive · cosine + ZTSNR · NanoWM-B/2 unless noted):

| Domain | Checkpoint | Steps |
|:-------|:-----------|:------|
| DINO-WM Point Maze | 🤗 nanowm-b2-dino-wm-point-maze-30k | 30k |
| DINO-WM Wall | 🤗 nanowm-b2-dino-wm-wall-15k | 15k |
| DINO-WM Rope | 🤗 nanowm-b2-dino-wm-rope-15k | 15k |
| DINO-WM Granular | 🤗 nanowm-b2-dino-wm-granular-15k | 15k |
| DINO-WM PushT | 🤗 nanowm-b2-dino-wm-pusht-100k | 100k |
| RT-1 (fractal) | 🤗 nanowm-b2-rt1-300k | 300k |
| CSGO | 🤗 nanowm-l2-csgo-100k (NanoWM-L/2) | 100k |

We also ship 11 RT-1 ablation arms (one HF checkpoint per axis × method). See [docs/training.md#design-choices](docs/training.md#design-
