Add local MedGemma server and validation improvements#6

Open
ambeckley wants to merge 7 commits into HealthRex-ARISE:main from ambeckley:feature/local-medgemma-server

ambeckley commented Feb 5, 2026

Summary

  • Local MedGemma API server (scripts/local_model_server.py): FastAPI server wrapping the Hugging Face google/medgemma-27b-it model for running MAST benchmarks locally. Supports CUDA, Metal (MPS) on Apple Silicon, and CPU. Normalizes JSON responses for SCT (single-object unwrap, code-block extraction, and repair of invalid +N numbers, since JSON does not allow a leading plus sign) and uses the long dtype for embedding input_ids.
  • Setup doc (scripts/LOCAL_MODEL_SETUP.md) for running the server and validating with validate_all.py.
  • Optional dependencies (pyproject.toml): [local-server] extra with fastapi, uvicorn, transformers, torch, accelerate, pillow. Install with pip install -e '.[local-server]'.
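
To illustrate the JSON response normalization listed in the first bullet, here is a minimal sketch of the three steps. It is not the code from scripts/local_model_server.py: the name normalize_sct_response, the regular expressions, and the reading of "single-object unwrap" as "a one-element list becomes its object" are all assumptions made for illustration.

```python
import json
import re

# Hypothetical helper, not the PR's implementation.
FENCE = "`" * 3  # Markdown code-fence marker, built this way to avoid a literal fence
_FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)
# A "+" directly before a digit, where a JSON value may start (after ":", "[" or ",").
# Limitation: this can also touch matching text inside string values.
_PLUS_RE = re.compile(r"([:\[,]\s*)\+(?=\d)")


def normalize_sct_response(raw: str):
    """Best-effort cleanup of model output into parsed JSON."""
    text = raw.strip()
    # 1. Code-block extraction: keep only the body of the first fenced block.
    match = _FENCE_RE.search(text)
    if match:
        text = match.group(1).strip()
    # 2. Invalid +N fix: JSON numbers may not carry a leading plus sign.
    text = _PLUS_RE.sub(r"\1", text)
    data = json.loads(text)
    # 3. Single-object unwrap (assumed meaning): [ {...} ] becomes {...}.
    if isinstance(data, list) and len(data) == 1 and isinstance(data[0], dict):
        data = data[0]
    return data
```

A value such as `"rating": +1` fails in a strict JSON parser, which is why the plus sign is stripped before json.loads is called.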

ambeckley and others added 7 commits February 5, 2026 15:54
- Add [project.optional-dependencies] local-server with fastapi, uvicorn,
  transformers, torch, accelerate, pillow for scripts/local_model_server.py
- Install with: pip install -e '.[local-server]'

Co-authored-by: Cursor <cursoragent@cursor.com>
- scripts/local_model_server.py: FastAPI server wrapping Hugging Face
  MedGemma-27b-it (google/medgemma-27b-it) with MAST-compatible POST /
  endpoint (Bearer auth, text/plain body, JSON response)
- Supports CUDA, Metal (MPS) on Apple Silicon, and CPU; loads model on
  startup and normalizes JSON (embedding input_ids as long, SCT single-object
  unwrap, code-block and +number JSON extraction)
- scripts/LOCAL_MODEL_SETUP.md: setup and usage for running the local
  server and validating with validate_all.py

Co-authored-by: Cursor <cursoragent@cursor.com>
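
For reviewers who want to exercise the endpoint by hand, the request shape described in this commit (POST /, Bearer auth, text/plain body, JSON response) could be sketched with the standard library as follows. The function names, the base URL, the token, and the timeout are placeholders assumed for illustration; none of them come from the PR.

```python
import json
import urllib.request


def build_request(base_url: str, token: str, prompt: str) -> urllib.request.Request:
    """Build a POST / request with Bearer auth and a text/plain body."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/",
        data=prompt.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "text/plain; charset=utf-8",
        },
    )


def query(base_url: str, token: str, prompt: str, timeout: float = 300.0):
    """Send the prompt and decode the JSON response body."""
    request = build_request(base_url, token, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling query("http://localhost:8000", "my-token", "prompt text") would send one request, assuming a server is listening at that address. The generous default timeout is a guess that a 27B model on CPU or MPS may respond slowly.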
- Discover both test_*.txt and example_*.txt so SCT benchmark (example_001
  through example_005) is run by validate_all.py
- When a benchmark has no test cases, return a result dict with passed=0,
  failed=0, total_tests=0 to avoid KeyError in main() and print_benchmark_results
- Use results.get('passed', 0) and results.get('failed', 0) so benchmarks
  with no tests do not break the final summary

Co-authored-by: Cursor <cursoragent@cursor.com>
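
The three validation changes in this commit can be sketched together as below. This is an illustration under assumptions, not the code in validate_all.py: the function names and the check callback are hypothetical, and only the two glob patterns, the empty-result dict, and the use of .get with a default of 0 come from the commit message.

```python
from pathlib import Path
from typing import Callable


def discover_cases(benchmark_dir: Path) -> list[Path]:
    """Collect both test_*.txt and example_*.txt files, sorted by path."""
    found = set(benchmark_dir.glob("test_*.txt")) | set(benchmark_dir.glob("example_*.txt"))
    return sorted(found)


def run_benchmark(benchmark_dir: Path, check: Callable[[Path], bool]) -> dict:
    """Run a (hypothetical) per-case check and return the result counts."""
    cases = discover_cases(benchmark_dir)
    if not cases:
        # No test cases: still return every key the summary code reads.
        return {"passed": 0, "failed": 0, "total_tests": 0}
    passed = sum(1 for case in cases if check(case))
    return {"passed": passed, "failed": len(cases) - passed, "total_tests": len(cases)}


def summarize(all_results: dict[str, dict]) -> tuple[int, int]:
    """Total passed and failed counts, tolerating result dicts with missing keys."""
    passed = sum(r.get("passed", 0) for r in all_results.values())
    failed = sum(r.get("failed", 0) for r in all_results.values())
    return passed, failed
```

Returning a complete dict for the empty case and reading with .get are two layers of the same protection: either one alone would avoid the KeyError, and together they keep the summary robust if a future code path returns a partial result.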
Added SSL verification status output to endpoint info.
Updated model server configuration and endpoints.