GaVS: 3D-Grounded Video Stabilization via Temporally-Consistent Local Reconstruction and Rendering (SIGGRAPH 2025 Conference Paper)
Zinuo You, Stamatios Georgoulis, Anpei Chen, Siyu Tang, Dengxin Dai
git clone https://github.com/sinoyou/gavs.git
GaVS has been trained and tested with the following software versions: Python 3.10 and CUDA 12.6.
# create and activate a conda environment
conda create -n gavs python=3.10
conda activate gavs

or place the virtual environment in a local directory:

conda create -p ./venv python=3.10
conda activate ./venv

Finally, install the required packages in the following order:
pip install -r requirements-torch.txt --extra-index-url https://download.pytorch.org/whl/cu126
pip install ./thirdparty/diff-gaussian-rasterization-w-pose
pip install -r requirements.txt
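Before moving on, you can sanity-check that the CUDA-enabled PyTorch wheel was picked up (a quick check from a Python shell, not part of the repo's scripts):

# Quick sanity check, not part of the repo's scripts.
import torch

print(torch.__version__, torch.version.cuda)  # expect a +cu126 build and "12.6"
print(torch.cuda.is_available())              # expect True on a CUDA-capable machine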
In the project root directory, download and unzip the dataset, pre-trained checkpoints, and results from the Hugging Face repo:

# download and unzip
python datasets/download_gavs_data.py

Our method is mainly evaluated on scenes selected from Google Deep Online Fused Video Stabilization: 15 challenging scenes categorized as dynamic, mild, and intense. The selection accounts for our method's reliance on 3D modeling and keeps the categories balanced.
The original dataset contains only raw unstable videos with gyro information. We first extract video frames and compensate for rolling shutter with 2D homographies derived from the gyro history. We then enhance the video frames with the following modules:
- GLOMAP: computes 3D camera poses and sparse point clouds from the static regions.
- ProPainter: a video-completion module that extrapolates the original frames guided by optical flow.
- Grounded SAM 2: extracts dynamic masks from text prompts.
Additionally, we compute monocular depth maps and optical flows on-the-fly with UniDepth and RAFT.
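For intuition, here is a minimal sketch of the gyro-based rotation compensation idea (illustrative only, not the repo's preprocessing code): for a purely rotational camera motion, the induced image warp is the homography H = K R K^-1, where K is the camera intrinsics and R the rotation integrated from gyro readings. Real rolling-shutter correction applies a different R per scanline; the sketch below warps whole frames for brevity.

# Illustrative sketch, NOT the repo's preprocessing code.
import cv2
import numpy as np

def compensate_rotation(frame: np.ndarray, K: np.ndarray, R: np.ndarray) -> np.ndarray:
    # For pure rotation, the induced image warp is H = K @ R @ K^-1.
    H = K @ R @ np.linalg.inv(K)
    h, w = frame.shape[:2]
    return cv2.warpPerspective(frame, H, (w, h))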
After cloning and data extraction, the project directory is expected to look as follows:
./
|-- gavs-data
| |-- re10k_v2_checkpoints/ # pretrained Flash3D model
| |-- dataset
| | |-- dynamic_dance/ # preprocessed data of 'dynamic dance'
| | |-- dynamic_dance.zip # downloaded zip file
| | |-- ...
| |-- result_and_comparison # reference results from different methods
|-- configs
| |-- dataset/ # dataset configs
| |-- experiment/ # finetuning and evaluation configs
| |-- ...
|-- train.py # entry point of finetuning
|-- evaluate.py # entry point of evaluation
|-- models/
|-- README.md
|-- ...

You can start finetuning with the following command (e.g., for the dynamic_dance scene). The evaluation will start automatically afterwards:
python train.py \
hydra.run.dir=./exp/dynamic_dance/ \
+experiment=layered_gavs_overfit \
dataset.data_path=./gavs-data/dataset/dynamic_dance
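To finetune all scenes sequentially, a small wrapper along the following lines could be used (a hypothetical helper, assuming every subfolder of gavs-data/dataset is a preprocessed scene):

# run_all_scenes.py -- hypothetical helper, not shipped with the repo.
# Finetunes (and thus also evaluates) every preprocessed scene found
# under gavs-data/dataset, using the same overrides as the command above.
import subprocess
from pathlib import Path

DATA_ROOT = Path("./gavs-data/dataset")

for scene in sorted(p for p in DATA_ROOT.iterdir() if p.is_dir()):
    subprocess.run(
        [
            "python", "train.py",
            f"hydra.run.dir=./exp/{scene.name}/",
            "+experiment=layered_gavs_overfit",
            f"dataset.data_path={scene}",
        ],
        check=True,  # abort on the first failing scene
    )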
You can also run the evaluation only (e.g., with a different stability level) with the following command:

python evaluate.py \
train.mode='eval' \
hydra.run.dir=./exp/dynamic_dance \
+experiment=layered_gavs_eval \
dataset.data_path=./gavs-data/dataset/dynamic_dance \
config.eval_dir='customized_eval'

Ablation on dynamic compensation:
python train.py \
hydra.run.dir=./exp/dynamic_dance/ \
+experiment=layered_gavs_overfit \
dataset.data_path=./gavs-data/dataset/dynamic_dance \
train.handle_dynamic_by_flow=False

Ablation on window regularization:
python train.py \
hydra.run.dir=./exp/dynamic_dance/ \
+experiment=layered_gavs_overfit \
dataset.data_path=./gavs-data/dataset/dynamic_dance \
loss.window.weight=0

Ablation on the inpainting module:
python train.py \
hydra.run.dir=./exp/dynamic_dance/ \
+experiment=layered_gavs_overfit \
dataset.data_path=./gavs-data/dataset/dynamic_dance \
dataset.use_inpainted_images=False \
dataset.pad_border_aug=0

Ablation on video stability (changes the evaluation only):
python evaluate.py \
train.mode='eval' \
hydra.run.dir=./exp/dynamic_dance \
+experiment=layered_gavs_eval \
dataset.data_path=./gavs-data/dataset/dynamic_dance \
config.eval_dir='stability8' \
dataset.stability=8
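To sweep several stability levels in one run, a wrapper like this could be used (a hypothetical helper; the scene name and stability values are examples):

# sweep_stability.py -- hypothetical helper, not shipped with the repo.
# Re-evaluates a finetuned scene at several stability levels; the scene
# name and the stability values below are examples.
import subprocess

SCENE = "dynamic_dance"
for stability in (2, 4, 8):
    subprocess.run(
        [
            "python", "evaluate.py",
            "train.mode=eval",
            f"hydra.run.dir=./exp/{SCENE}",
            "+experiment=layered_gavs_eval",
            f"dataset.data_path=./gavs-data/dataset/{SCENE}",
            f"config.eval_dir=stability{stability}",
            f"dataset.stability={stability}",
        ],
        check=True,
    )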
This repo is built on Flash3D. We thank the authors of all the open-source tools we use: GLOMAP, ProPainter, Grounded SAM 2, UniDepth, and RAFT. We also thank the authors of the following projects for providing evaluation scenes: Google Deep Online Fused Video Stabilization and LocalRF.

@article{you2025gavs,
title={GaVS: 3D-Grounded Video Stabilization via Temporally-Consistent Local Reconstruction and Rendering},
author={You, Zinuo and Georgoulis, Stamatios and Chen, Anpei and Tang, Siyu and Dai, Dengxin},
journal={arXiv preprint arXiv:2506.23957},
year={2025}
}
