
GaVS: 3D-Grounded Video Stabilization via Temporally-Consistent Local Reconstruction and Rendering (SIGGRAPH 2025 Conference Paper)
Zinuo You, Stamatios Georgoulis, Anpei Chen, Siyu Tang, Dengxin Dai

arXiv · Project Page · Hugging Face

Setup

Clone

git clone https://github.com/sinoyou/gavs.git

Create a Python environment

GaVS has been trained and tested with the following software versions: Python 3.10, CUDA 12.6.

Create and activate a conda environment:

conda create -n gavs python=3.10
conda activate gavs

or create a local virtual environment:

conda create -p ./venv python=3.10
conda activate ./venv

Finally, install the required packages in the following order:

pip install -r requirements-torch.txt --extra-index-url https://download.pytorch.org/whl/cu126
pip install ./thirdparty/diff-gaussian-rasterization-w-pose
pip install -r requirements.txt
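
Optionally, as a quick sanity check (not part of the official instructions), you can verify that the CUDA build of PyTorch was picked up:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"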

GaVS Data

From the project root directory, download and unzip the dataset, pre-trained checkpoints, and reference results from the Hugging Face repo:

# download and unzip
python datasets/download_gavs_data.py

Google DeepFused Datasets with Extra Annotations

Our method is mainly evaluated on selected scenes from Google Deep Online Fused Video Stabilization: 15 challenging scenes categorized as dynamic, mild, and intense. The selection accounts for our method's reliance on 3D modeling and for balance across the categories.

The original dataset only contains raw unstable videos with gyro information. We first extract video frames and compensate for rolling shutter with a 2D homography derived from the gyro history (a minimal warp sketch follows the list below). We then enhance the video frames with the following modules:

  • GLOMAP: computes 3D camera poses and sparse point clouds from the static regions.
  • ProPainter: a video completion module that extrapolates the original frames guided by optical flow.
  • Grounded SAM 2: extracts dynamic masks from text prompts.
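
As referenced above, here is a minimal sketch of the homography-based rolling-shutter compensation, assuming a 3x3 matrix H has already been estimated from the gyro history (how H is fitted is outside this repo, and the file names are placeholders):

import cv2
import numpy as np

# hypothetical 3x3 homography estimated from gyro-predicted motion
H = np.eye(3, dtype=np.float64)

frame = cv2.imread("frame_000.png")          # one raw, unstable frame
h, w = frame.shape[:2]
# warp the frame so the gyro-measured global motion is compensated
compensated = cv2.warpPerspective(frame, H, (w, h), flags=cv2.INTER_LINEAR)
cv2.imwrite("frame_000_compensated.png", compensated)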

Additionally, we compute monocular depth maps and optical flows on the fly with UniDepth and RAFT.
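
For illustration, a minimal way to compute such flow is the off-the-shelf RAFT model shipped with torchvision; GaVS's actual wrapper (and the analogous UniDepth call for depth) may differ, and the frame paths are placeholders:

import torch
from torchvision.io import read_image
from torchvision.models.optical_flow import raft_large, Raft_Large_Weights

weights = Raft_Large_Weights.DEFAULT
model = raft_large(weights=weights).eval()

# two consecutive frames as batches; RAFT expects H and W divisible by 8
img1 = read_image("frame_000.png").unsqueeze(0)
img2 = read_image("frame_001.png").unsqueeze(0)
img1, img2 = weights.transforms()(img1, img2)  # convert to float and normalize

with torch.no_grad():
    # RAFT returns a list of iterative refinements; the last entry is the
    # finest flow field, shape (1, 2, H, W)
    flow = model(img1, img2)[-1]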

Directory Layout

After cloning and data extraction, the project directory should look as follows:

./
|-- gavs-data 
|   |-- re10k_v2_checkpoints/ # pretrained Flash3D model
|   |-- dataset
|   |   |-- dynamic_dance/    # preprocessed data of 'dynamic dance'
|   |   |-- dynamic_dance.zip # downloaded zip file 
|   |   |-- ...
|   |-- result_and_comparison # reference results from different methods
|-- configs
|   |-- dataset/              # dataset configs
|   |-- experiment/           # finetuning and evaluation configs
|   |-- ...
|-- train.py                  # entry point of finetuning
|-- evaluate.py               # entry point of evaluation
|-- models/
|-- README.md
|-- ...

Finetuning and Evaluation

You can start finetuning with the following command (e.g., for the dynamic_dance scene). Evaluation starts automatically afterwards:

python train.py \
       hydra.run.dir=./exp/dynamic_dance/ \
       +experiment=layered_gavs_overfit \
       dataset.data_path=./gavs-data/dataset/dynamic_dance

You can also run the evaluation alone (e.g., with a different stability setting) with the following command:

python evaluate.py \
       train.mode='eval' \
       hydra.run.dir=./exp/dynamic_dance \
       +experiment=layered_gavs_eval \
       dataset.data_path=./gavs-data/dataset/dynamic_dance \
       config.eval_dir='customized_eval'

Ablation Configuration

Ablation on dynamic compensation:

python train.py \
       hydra.run.dir=./exp/dynamic_dance/ \
       +experiment=layered_gavs_overfit \
       dataset.data_path=./gavs-data/dataset/dynamic_dance \
       train.handle_dynamic_by_flow=False

Ablation on window regularization:

python train.py \
       hydra.run.dir=./exp/dynamic_dance/ \
       +experiment=layered_gavs_overfit \
       dataset.data_path=./gavs-data/dataset/dynamic_dance \
       loss.window.weight=0

Ablation on the inpainting module:

python train.py \
       hydra.run.dir=./exp/dynamic_dance/ \
       +experiment=layered_gavs_overfit \
       dataset.data_path=./gavs-data/dataset/dynamic_dance \
       dataset.use_inpainted_images=False \
       dataset.pad_border_aug=0

Ablation on video stability (changes evaluation only):

python evaluate.py \
       train.mode='eval' \
       hydra.run.dir=./exp/dynamic_dance \
       +experiment=layered_gavs_eval \
       dataset.data_path=./gavs-data/dataset/dynamic_dance \
       config.eval_dir='stability8' \
       dataset.stability=8
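
To compare several stability levels, the single-run command above can be looped; a small sketch (the swept values are only examples, not settings from the paper):

import subprocess

# hypothetical sweep over stability values, mirroring the command above
for s in [4, 8, 12]:
    subprocess.run([
        "python", "evaluate.py",
        "train.mode=eval",
        "hydra.run.dir=./exp/dynamic_dance",
        "+experiment=layered_gavs_eval",
        "dataset.data_path=./gavs-data/dataset/dynamic_dance",
        f"config.eval_dir=stability{s}",
        f"dataset.stability={s}",
    ], check=True)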

Acknowledgement

This repo is built on Flash3D. We thank the authors of all the open-source tools we use: GLOMAP, ProPainter, Grounded SAM 2, UniDepth, and RAFT. We also thank the authors of the following projects for providing evaluation scenes: Google Deep Online Fused Video Stabilization and LocalRF.

BibTeX

@article{you2025gavs,
    title={GaVS: 3D-Grounded Video Stabilization via Temporally-Consistent Local Reconstruction and Rendering},
    author={You, Zinuo and Georgoulis, Stamatios and Chen, Anpei and Tang, Siyu and Dai, Dengxin},
    journal={arXiv preprint arXiv:2506.23957},
    year={2025}
}
