StarFlow converts sketches and diagrams into structured workflows using fine-tuned vision–language models and a purpose-built dataset.

ServiceNow/StarFlow

StarFlow: Generating Structured Workflow Outputs From Sketch Images

Concept Introduction

StarFlow is based on StarVLM, a framework for training and evaluating vision-language models. StarVLM consists of three categories of components: models, datasets, and pipelines.

Models

Models are divided into local models and API models:

  • Local models are encapsulated as sub-classes of VLLocalModel, and their inputs are encapsulated as sub-classes of VLLocalInput. For example, the Qwen3-VL models (Qwen/Qwen3-VL-8B-Instruct, Qwen/Qwen3-VL-32B-Instruct, etc.) are implemented as QwenModel, and their inputs are implemented as QwenInput.

  • API models are encapsulated as sub-classes of VLAPIModel. For example, the OpenAI-compatible API models (openai/gpt-4o, anthropic/claude-3.7-sonnet, etc.) are implemented as OpenAIModel. The inputs of API models are VLAPIConversation instances, each of which is a sequence of VLAPIMessage instances.

Each local and API model is bound to a config file. For example, local model Qwen/Qwen3-VL-8B-Instruct is bound to config file starvlm/config/model/qwen_3_vl_8b.yaml, and API model openai/gpt-4o is bound to config file starvlm/config/model/gpt_4o.yaml.
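The input structure for API models described above (a conversation as an ordered sequence of messages) can be pictured with a small sketch. The classes below are hypothetical stand-ins, not the actual VLAPIConversation and VLAPIMessage classes; the field names (role, text, image_path) are assumptions made only for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class SketchMessage:
    """Hypothetical stand-in for a VLAPIMessage; all fields are assumed."""
    role: str
    text: str
    image_path: Optional[str] = None


@dataclass
class SketchConversation:
    """Hypothetical stand-in for a VLAPIConversation: an ordered message list."""
    messages: List[SketchMessage] = field(default_factory=list)

    def add(self, role: str, text: str,
            image_path: Optional[str] = None) -> "SketchConversation":
        # Append a message and return self so calls can be chained.
        self.messages.append(SketchMessage(role, text, image_path))
        return self

    def __len__(self) -> int:
        return len(self.messages)

    def __iter__(self) -> Iterator[SketchMessage]:
        return iter(self.messages)


# Usage: a system prompt followed by a user turn that carries a sketch image.
conversation = (
    SketchConversation()
    .add("system", "Convert the sketch into a structured workflow.")
    .add("user", "Here is the sketch.", image_path="sketch.png")
)
```

The real classes may carry different or additional fields; the sketch only shows the "sequence of messages" shape.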

Datasets

Datasets are encapsulated as sub-classes of VLDataset. For example, dataset ServiceNow/BigDocs-Sketch2Flow is implemented as BigDocsDataset. Data examples in a dataset are VLExample instances. Each dataset is bound to a config file. For example, dataset ServiceNow/BigDocs-Sketch2Flow is bound to config file starvlm/config/dataset/bigdocs_sketch2flow.yaml.

Pipelines

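Three pipelines are implemented, each as a script under starvlm/pipeline/:

  • train_fsdp.py trains a local model.

  • evaluate_local.py evaluates a local model.

  • evaluate_api.py evaluates an API model, including vLLM-served models.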

Environment Setup

Commands

  1. Edit ~/.secret (create it if missing)
     export HF_TOKEN=<HF_TOKEN>
     export WANDB_API_KEY=<WANDB_API_KEY>
     export OPENAI_API_KEY=<OPENAI_API_KEY>
     ...
  2. Edit ~/.bashrc (create it if missing)
     source ~/.secret
     # home directory of venvs
     export UV_HOME=<UV_HOME>
     # root directory of logs
     export LOGGING_ROOT=<LOGGING_ROOT>
     ...
  3. Clone repository and run installers to create venvs
     git clone https://github.com/ServiceNow/StarFlow.git
     cd StarFlow
     # default installer (for API models and most local models)
     bash installer/default/install.sh
     # phi35 installer (for Phi-3.5 local model)
     bash installer/phi35/install.sh
     # phi4 installer (for Phi-4 local model)
     bash installer/phi4/install.sh
     # deepseek installer (for DeepSeek-VL2 local models)
     bash installer/deepseek/install.sh
     # vllm installer (for vLLM-served API models)
     bash installer/vllm/install.sh
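A quick way to confirm that the shell picked up the exports from the steps above is to check the environment from Python. This is a minimal sketch: the helper name missing_vars is hypothetical, and the list covers only the variables named explicitly above (the elided "..." entries are not guessed). Not every variable may be needed for every run.

```python
import os
from typing import List, Mapping, Optional

# Variables named explicitly in ~/.secret and ~/.bashrc above.
EXPECTED_VARS = (
    "HF_TOKEN",
    "WANDB_API_KEY",
    "OPENAI_API_KEY",
    "UV_HOME",
    "LOGGING_ROOT",
)


def missing_vars(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the expected variables that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in EXPECTED_VARS if not env.get(name)]
```

Calling missing_vars() in a fresh shell returns an empty list once both files are sourced correctly.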

Notes

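Before running experiments, activate the venv created by the installer that matches your model (<installer_name> is one of default, phi35, phi4, deepseek, or vllm):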

source ${UV_HOME}/starvlm_<installer_name>/bin/activate
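The activation path above follows a fixed pattern, so it can be computed and checked before launching a job. The sketch below assumes the installer names match the installer/ directories listed earlier; the function name activate_script is hypothetical. It only locates the script, since activation itself still has to happen in the shell through source.

```python
from pathlib import Path

# Installer names taken from the installer/ directories listed above.
INSTALLERS = ("default", "phi35", "phi4", "deepseek", "vllm")


def activate_script(installer_name: str, uv_home: str) -> Path:
    """Build the path ${UV_HOME}/starvlm_<installer_name>/bin/activate."""
    if installer_name not in INSTALLERS:
        raise ValueError(f"unknown installer: {installer_name!r}")
    return Path(uv_home) / f"starvlm_{installer_name}" / "bin" / "activate"
```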

Experiment Guide

Commands

  1. Train a local model
     torchrun --nproc-per-node 2 starvlm/pipeline/train_fsdp.py --pipeline_name train_fsdp_2 --model_name qwen_3_vl_8b --dataset_names bigdocs_sketch2flow
  2. Evaluate a local model
     torchrun --nproc-per-node 2 starvlm/pipeline/evaluate_local.py --pipeline_name evaluate_local --model_name qwen_3_vl_8b --dataset_name bigdocs_sketch2flow
  3. Evaluate a large local model (e.g. Qwen/Qwen3-VL-32B-Instruct)
     python starvlm/pipeline/evaluate_local.py --pipeline_name evaluate_local --model_name qwen_3_vl_32b --dataset_name bigdocs_sketch2flow
  4. Evaluate an API model (e.g. openai/gpt-4o)
     python starvlm/pipeline/evaluate_api.py --pipeline_name evaluate_api --model_name gpt_4o --dataset_name bigdocs_sketch2flow
  5. Evaluate a vLLM-served API model (e.g. Qwen/Qwen3-VL-8B-Instruct)
     vllm serve Qwen/Qwen3-VL-8B-Instruct --max-num-seqs 4 --tensor-parallel-size 2 --dtype bfloat16 --host 0.0.0.0 --port 8000
     python starvlm/pipeline/evaluate_api.py --pipeline_name evaluate_api --model_name vllm_qwen_3_vl_8b --dataset_name bigdocs_sketch2flow
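The pipeline commands above share one shape: a launcher (torchrun or python), a script under starvlm/pipeline/, and three named arguments. The sketch below assembles that argument list for scripted runs. The function pipeline_command is a hypothetical helper, not part of the repository. It mirrors the commands as written, including the plural --dataset_names flag used by train_fsdp.py versus the singular --dataset_name used by the evaluation scripts.

```python
import shlex
from typing import List, Optional


def pipeline_command(script: str, pipeline_name: str, model_name: str,
                     dataset: str, nproc: Optional[int] = None) -> List[str]:
    """Assemble the argv for one pipeline run (hypothetical helper)."""
    # torchrun is used when a process count is given, plain python otherwise.
    if nproc is None:
        launcher = ["python"]
    else:
        launcher = ["torchrun", "--nproc-per-node", str(nproc)]
    # The training script takes --dataset_names; evaluation takes --dataset_name.
    dataset_flag = "--dataset_names" if script == "train_fsdp" else "--dataset_name"
    return [
        *launcher,
        f"starvlm/pipeline/{script}.py",
        "--pipeline_name", pipeline_name,
        "--model_name", model_name,
        dataset_flag, dataset,
    ]


# Usage: reproduce the training command from step 1 as a shell string.
train_cmd = shlex.join(
    pipeline_command("train_fsdp", "train_fsdp_2", "qwen_3_vl_8b",
                     "bigdocs_sketch2flow", nproc=2)
)
```

The returned list can be passed directly to subprocess.run, which avoids shell quoting issues.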

Notes

Before running the commands above, set the values in the relevant config files:

  • pipeline config file: starvlm/config/pipeline/<pipeline_name>.yaml

  • model config file: starvlm/config/model/<model_name>.yaml

  • dataset config file: starvlm/config/dataset/<dataset_name>.yaml
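Since each command-line name maps to a config file by the patterns above, the three paths can be resolved and checked up front. This is a minimal sketch: config_paths and missing_configs are hypothetical helpers, and the sketch assumes it runs from the repository root.

```python
from pathlib import Path
from typing import Dict, List

# Config root relative to the repository root, per the patterns above.
CONFIG_ROOT = Path("starvlm/config")


def config_paths(pipeline_name: str, model_name: str, dataset_name: str,
                 root: Path = CONFIG_ROOT) -> Dict[str, Path]:
    """Map each config kind to <root>/<kind>/<name>.yaml."""
    return {
        "pipeline": root / "pipeline" / f"{pipeline_name}.yaml",
        "model": root / "model" / f"{model_name}.yaml",
        "dataset": root / "dataset" / f"{dataset_name}.yaml",
    }


def missing_configs(paths: Dict[str, Path]) -> List[str]:
    """Return the config kinds whose files do not exist on disk."""
    return [kind for kind, path in paths.items() if not path.is_file()]
```

For example, config_paths("evaluate_api", "gpt_4o", "bigdocs_sketch2flow") yields starvlm/config/model/gpt_4o.yaml for the model entry, matching the binding described in the Models section.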

Citation

@article{bechard2025starflow,
  title={StarFlow: Generating Structured Workflow Outputs From Sketch Images},
  author={Bechard, Patrice and Wang, Chao and Abaskohi, Amirhossein and Rodriguez, Juan and Pal, Christopher and Vazquez, David and Gella, Spandana and Rajeswar, Sai and Taslakian, Perouz},
  journal={arXiv preprint arXiv:2503.21889},
  year={2025}
}
