PIWM: Enhancing Physical Consistency in Lightweight World Models

🌍 Project Page • 🤓 Paper

(Comparison figure)


[TL;DR] PIWM is a lightweight, physics‑informed generative model that predicts future images from the current image and actions — enabling forecasting with strong existential and temporal consistency in dynamic environments.

The following implementation is based on DIAMOND.

TODO 📋

  • Upload the one-page paper (to be done before 2025/10/30)
  • Create the "what is it" video (to be done before 2025/10/15)
  • Dataset release
  • Code release
  • Preprint release

Installation

  • Quick start with Miniconda:

    git clone https://github.com/TUM-AVS/physics-wm.git
    cd physics-wm
    conda create -n PIWM python=3.10
    conda activate PIWM
    pip install -r requirements.txt
  • To quickly play with our trained physics‑informed BEV world model (PIWM), run the command below. The first run automatically downloads the pretrained model and several spawn points from the HuggingFace Hub 🤗; please reserve ~1.55 GB of disk space.

    python src/play.py

When the download completes, press Enter to start. Use WASD and Space to control.

(Demo video: top.mp4)
  • The default fast config runs best on a CUDA GPU (>12 FPS on an RTX 4080 Laptop GPU). The model runs even faster when compiled (>21 FPS on the same GPU):

    python src/play.py --compile

All demo videos and performance measurements use the fast config referenced by the trainer. You can switch to the higher_quality config for improved quality at reduced speed.

Training

We collected 2,000 episodes in HighwayEnv with an MCTS agent, yielding 2 million BEV frames with aligned states and actions as the training dataset. We hold out a random test split of 200 episodes of 1,000 steps each (listed in test_split.txt) and train on the remaining 1,800 episodes.
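The episode split above can be sketched as follows. This is an illustration only: the format of test_split.txt (one episode id per line) and the episode naming scheme are assumptions, not taken from the repository.

```python
# Sketch of the train/test episode split described above. The format of
# test_split.txt (one id per line) and the episode names are assumptions.
def split_episodes(all_ids, test_split_lines):
    """Partition episode ids into train and test sets."""
    test_ids = {line.strip() for line in test_split_lines if line.strip()}
    train_ids = [eid for eid in all_ids if eid not in test_ids]
    return train_ids, sorted(test_ids)

# Stand-in for the real dataset: 2,000 episodes of 1,000 steps each.
all_ids = [f"episode_{i:04d}" for i in range(2000)]
test_file_lines = [f"episode_{i:04d}\n" for i in range(0, 2000, 10)]  # 200 ids

train_ids, test_ids = split_episodes(all_ids, test_file_lines)
frames_total = (len(train_ids) + len(test_ids)) * 1000  # 2,000,000 BEV frames
```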

To get the data ready for training on your machine:

  • Step 1: Download the preprocessed dataset from the HuggingFace Hub 🤗 and unzip it (~550 GB of disk space required).

    Dataset structure:

    your_path/highway_dataset_processed
    ├── full_res
    └── low_res
  • Step 2: Edit config/env/piwm.yaml and set:

        path_data_low_res: <your_path>/highway_dataset_processed/low_res
        path_data_full_res: <your_path>/highway_dataset_processed/full_res
    
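Before launching training, a quick pre-flight check that both paths resolve can save a failed run. The sketch below uses a minimal hand-rolled parser on an inline example rather than the real config machinery; only the two key names come from piwm.yaml, everything else is hypothetical.

```python
from pathlib import Path

# Hypothetical pre-flight check: extract the two dataset paths from a flat
# YAML fragment and report whether each directory exists. The inline example
# text is a stand-in for config/env/piwm.yaml, not its real contents.
def read_data_paths(yaml_text):
    """Return {key: path} for every top-level path_data_* entry."""
    paths = {}
    for line in yaml_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().startswith("path_data_"):
            paths[key.strip()] = value.strip()
    return paths

example_cfg = """\
path_data_low_res: /data/highway_dataset_processed/low_res
path_data_full_res: /data/highway_dataset_processed/full_res
"""

paths = read_data_paths(example_cfg)
missing = [key for key, p in paths.items() if not Path(p).is_dir()]
```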

Then you can launch a training run with

python src/main.py

The provided configuration took around 18 hours on an RTX 4090.

Advanced

  • Adjust the Soft Mask weights used during training and inference in piwm.yaml. Our released model uses:

        # Training (mask_gain_infer is ignored here)
        mask_gain_train: 1.0
        mask_w_blue: 1.0
        mask_w_green: 0.8

        # Inference (mask_gain_train is ignored here)
        mask_gain_infer: 0.45
        mask_w_blue: 0.55
        mask_w_green: 0.45
  • Enable zero-shot Warm Start to inject contextual information at inference time and improve stability for smaller models:

    python src/play.py --warmstart
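As an illustration only (not the repository's implementation), the gain and per-channel weights above can be read as a per-pixel weighted combination clamped to [0, 1]; the channel semantics in the example are assumed.

```python
# Illustrative sketch of a gain-scaled soft mask, not the repo's actual code:
# combine per-channel mask values with the released inference weights from
# piwm.yaml and clamp the result to [0, 1].
def soft_mask_weight(blue, green, gain=0.45, w_blue=0.55, w_green=0.45):
    """Per-pixel soft-mask weight from blue/green channel mask values."""
    return max(0.0, min(1.0, gain * (w_blue * blue + w_green * green)))

# Hypothetical channel semantics for the toy pixels below.
w_blue_only = soft_mask_weight(blue=1.0, green=0.0)   # flagged only in blue
w_unmasked = soft_mask_weight(blue=0.0, green=0.0)    # flagged in neither
```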

Citation

@misc{anonymous,
      title={Enhancing Physical Consistency in Lightweight World Models},
      author={Anonymous Author(s)},
      year={2025},
      eprint={2509.12437},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2509.12437},
}
