[TL;DR] PIWM is a lightweight, physics‑informed generative model that predicts future images from the current image and actions — enabling forecasting with strong existential and temporal consistency in dynamic environments.
The following implementation is based on DIAMOND.
- One-page paper upload (to be done before [2025/10/30])
- What-is-it video creation (to be done before [2025/10/15])
- Dataset release
- Code release
- Preprint release
Quick start with Miniconda:

```shell
git clone https://github.com/TUM-AVS/physics-wm.git
cd physics-wm
conda create -n PIWM python=3.10
conda activate PIWM
pip install -r requirements.txt
```
To quickly try our trained physics‑informed BEV world model (PIWM), run the command below. The first run automatically downloads the pretrained model and several spawn points from the HuggingFace Hub 🤗; please reserve ~1.55 GB of disk space.
```shell
python src/play.py
```
When the download completes, press Enter to start. Use WASD and Space to control.
*(Demo video: top.mp4)*
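The control keys presumably map onto HighwayEnv's discrete meta-actions. A hypothetical sketch of such a binding (the real mapping lives in `src/play.py` and may differ):

```python
# Hypothetical key binding onto HighwayEnv's five discrete meta-actions.
# The actual bindings used by src/play.py are not shown here.
KEY_TO_ACTION = {
    "w": "FASTER",       # accelerate
    "s": "SLOWER",       # decelerate
    "a": "LANE_LEFT",    # change lane left
    "d": "LANE_RIGHT",   # change lane right
    "space": "IDLE",     # keep current speed and lane
}

def action_for_key(key: str) -> str:
    # Unmapped keys fall back to IDLE so the rollout never stalls.
    return KEY_TO_ACTION.get(key, "IDLE")
```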
The default `fast` config runs best on a CUDA GPU (>12 FPS on an RTX 4080 Laptop GPU). The model runs even faster when compiled (>21 FPS on the same GPU).
```shell
python src/play.py --compile
```
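The `--compile` flag most plausibly wraps the model with PyTorch's `torch.compile`. A minimal sketch of such a toggle (the function name `maybe_compile` is ours; the actual wiring in `src/play.py` may differ):

```python
import torch

def maybe_compile(model: torch.nn.Module, enabled: bool) -> torch.nn.Module:
    # Sketch of what a --compile flag typically toggles: wrapping the world
    # model with torch.compile so repeated rollout steps reuse compiled
    # kernels. The first forward pass pays a one-off compilation cost.
    if enabled and hasattr(torch, "compile"):
        return torch.compile(model)
    return model
```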
All demo videos and performance measurements use the `fast` config referenced by the trainer. You can switch to `higher_quality` for improved quality at reduced speed.
We collected 2,000 episodes in HighwayEnv with an MCTS agent, yielding 2 million BEV frames with aligned states and actions as the training dataset.
We used a random test split of 200 episodes of 1,000 steps each (listed in test_split.txt) and the remaining 1,800 episodes for training.
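The split can be reproduced with a few lines of Python. This sketch assumes one directory per episode and episode names listed one per line in test_split.txt (the actual on-disk layout may differ):

```python
from pathlib import Path

def split_episodes(dataset_dir: str, test_split_file: str):
    # Illustrative sketch (the exact file layout is an assumption): episodes
    # named in test_split.txt form the test set; every other episode
    # directory is training data, reproducing the 1,800/200 split.
    test_names = set(Path(test_split_file).read_text().split())
    episodes = sorted(p.name for p in Path(dataset_dir).iterdir() if p.is_dir())
    train = [e for e in episodes if e not in test_names]
    test = [e for e in episodes if e in test_names]
    return train, test
```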
To get the data ready for training on your machine:
Step 1: Download the preprocessed dataset from the HuggingFace Hub 🤗 and unzip it (~550GB disk space).
Dataset structure:
```
your_path/highway_dataset_processed
├── full_res
└── low_res
```
Step 2: Edit config/env/piwm.yaml and set:

- `path_data_low_res` to `<your_path>/highway_dataset_processed/low_res`
- `path_data_full_res` to `<your_path>/highway_dataset_processed/full_res`
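A hedged excerpt of how the edited entries in config/env/piwm.yaml might look (the placeholder path is kept as-is; surrounding keys are omitted):

```yaml
# config/env/piwm.yaml (excerpt; other keys unchanged)
path_data_low_res: <your_path>/highway_dataset_processed/low_res
path_data_full_res: <your_path>/highway_dataset_processed/full_res
```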
Then you can launch a training run with
```shell
python src/main.py
```

The provided configuration took around 18 hours on an RTX 4090.
Adjust the Soft Mask weights used during training and inference in piwm.yaml. Our released model uses:
- Training:

  ```yaml
  mask_gain_train: 1.0
  mask_gain_infer: 0.45  # not used in training
  mask_w_blue: 1.0
  mask_w_green: 0.8
  ```

- Inference:

  ```yaml
  mask_gain_train: 1.0  # not used in inference
  mask_gain_infer: 0.45
  mask_w_blue: 0.55
  mask_w_green: 0.45
  ```
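As an illustrative sketch only of how such weights could enter a per-pixel mask (the channel assignment and formula below are assumptions, not PIWM's actual implementation): the gain scales the mask, while `mask_w_blue` and `mask_w_green` weight the blue and green channels of the BEV frame.

```python
import numpy as np

def soft_mask(bev: np.ndarray, gain: float, w_blue: float, w_green: float) -> np.ndarray:
    # Hypothetical combination of channel weights and gain into a soft
    # mask over an H x W x 3 BEV frame with values in [0, 1]. The channel
    # order (RGB) and the clipped linear form are assumptions.
    blue = bev[..., 2].astype(np.float32)
    green = bev[..., 1].astype(np.float32)
    mask = gain * (w_blue * blue + w_green * green)
    return np.clip(mask, 0.0, 1.0)
```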
Enable zero-shot Warm Start to inject contextual information at inference time and improve stability for smaller models:
```shell
python src/play.py --warmstart
```
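Conceptually, warm start replays a few real frames through the model before letting it predict on its own. A minimal sketch under an assumed model interface (`init_state`, `step`, and `predict` are hypothetical names, not PIWM's API):

```python
def warm_start(model, context, n_steps, action_fn):
    # Hypothetical sketch of zero-shot warm start: condition the world
    # model on real (observation, action) pairs before free-running
    # imagination, so the rollout starts from an informed state.
    state = model.init_state()
    for obs, act in context:           # inject real context
        state = model.step(state, obs, act)
    preds = []
    obs = context[-1][0]
    for _ in range(n_steps):           # free-running prediction
        obs, state = model.predict(state, action_fn(obs))
        preds.append(obs)
    return preds
```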
```bibtex
@misc{anonymous,
  title={Enhancing Physical Consistency in Lightweight World Models},
  author={Anonymous Author(s) for now},
  year={2025},
  eprint={2509.12437},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2509.12437},
}
```
