Prov-gigapath#6
Conversation
Co-authored-by: Copilot <copilot@github.com>
📝 WalkthroughWalkthroughAdds a new ProvGigaPath Ray Serve FastAPI deployment with typed config, model loading, batched inference, and LZ4 I/O; registers it in Helm values. Increases max_queued_requests for SemanticSegmentation, HeatmapBuilder, BinaryClassifier, and Virchow2. ChangesProvGigaPath Deployment
Queue Capacity Tuning
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces the prov-gigapath model service, including its Helm configuration and implementation using Ray Serve and FastAPI. It also increases the max_queued_requests for several existing deployments to improve request handling capacity. Feedback was provided to offload CPU-bound operations, such as image transformations and data compression, to separate threads using asyncio.to_thread to prevent blocking the event loop.
There was a problem hiding this comment.
Pull request overview
Adds a new Ray Serve application for the Prov-GigaPath tile encoder and updates Ray Serve queue sizing across several deployments to allow higher request buffering.
Changes:
- Introduces
models/prov_gigapath.py, a FastAPI + Ray Serve tile-embedding endpoint that loads a TIMM model from Hugging Face Hub and returns LZ4-compressed embeddings. - Adds a new Helm application entry for
prov-gigapathand registers it invalues.yaml. - Increases
max_queued_requestsfor multiple existing Ray Serve deployments.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| models/prov_gigapath.py | New Prov-GigaPath model deployment with LZ4-in/LZ4-out inference endpoint. |
| helm/rayservice/values.yaml | Registers the new prov-gigapath application in the Helm values list. |
| helm/rayservice/applications/prov-gigapath.yaml | New Ray Serve application config for Prov-GigaPath (resources, autoscaling, user_config). |
| helm/rayservice/applications/virchow2.yaml | Increases request queue limit for Virchow2 deployment. |
| helm/rayservice/applications/prostate-classifier-1.yaml | Increases request queue limit for BinaryClassifier deployment. |
| helm/rayservice/applications/heatmap-builder.yaml | Increases request queue limit for HeatmapBuilder deployment. |
| helm/rayservice/applications/episeg-1.yaml | Increases request queue limit for SemanticSegmentation deployment. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@models/prov_gigapath.py`:
- Around line 86-89: Wrap the LZ4 decompression and image reshaping code that
calls lz4.frame.decompress(await request.body()) and
np.frombuffer(...).reshape(self.tile_size, self.tile_size, 3) in a try/except
that catches RuntimeError and ValueError (and optionally lz4/frame-specific
exceptions), and convert those into an HTTP 400 response with a clear error
message instead of letting them propagate to a 500; ensure the except block
returns the same response type used in this handler (e.g.,
fastapi.HTTPException(status_code=400, detail=... ) or the handler's response
object) and include context (e.g., "invalid compressed image data" or
"decompressed size mismatch") to aid clients and logs.
- Around line 91-93: The code directly calls np.dtype(...) on
request.headers.get("x-output-dtype", "float32") which raises
TypeError/ValueError for invalid strings and allows unsafe dtypes like 'object';
update the logic around the output_dtype assignment to (1) read the header
value, (2) attempt to construct np.dtype inside a try/except catching TypeError
and ValueError, (3) validate the resulting dtype.kind is numeric (e.g.,
dtype.kind in ('f','i','u')) to reject 'O'/'S' etc, and (4) on invalid/unsafe
dtype either return an HTTP 400 error response or fall back to a safe default
like float32; reference the output_dtype variable and the
request.headers.get("x-output-dtype", "float32") call and ensure this prevents
unsafe dtypes from reaching result.tobytes().
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 80ccdc44-8b84-4749-b273-1327ff3314ca
📒 Files selected for processing (7)
helm/rayservice/applications/episeg-1.yamlhelm/rayservice/applications/heatmap-builder.yamlhelm/rayservice/applications/prostate-classifier-1.yamlhelm/rayservice/applications/prov-gigapath.yamlhelm/rayservice/applications/virchow2.yamlhelm/rayservice/values.yamlmodels/prov_gigapath.py
There was a problem hiding this comment.
♻️ Duplicate comments (2)
models/prov_gigapath.py (2)
84-87:⚠️ Potential issue | 🟠 Major | ⚡ Quick winHandle malformed LZ4 payloads and shape mismatches as HTTP 400 (still unresolved).
On Line 84 and Line 85, invalid compressed data or decompressed-size mismatch currently bubbles to 500. This should be translated to a client error.
Proposed minimal fix
-from fastapi import FastAPI, Request, Response +from fastapi import FastAPI, HTTPException, Request, Response @@ - data = await asyncio.to_thread(lz4.frame.decompress, await request.body()) - image = np.frombuffer(data, dtype=np.uint8).reshape( - self.tile_size, self.tile_size, 3 - ) + try: + data = await asyncio.to_thread(lz4.frame.decompress, await request.body()) + image = np.frombuffer(data, dtype=np.uint8).reshape( + self.tile_size, self.tile_size, 3 + ) + except (RuntimeError, ValueError) as exc: + raise HTTPException( + status_code=400, + detail=f"Invalid compressed image payload: {exc}", + ) from exc🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@models/prov_gigapath.py` around lines 84 - 87, The decompression and reshape steps (the lz4.frame.decompress call and np.frombuffer(...).reshape(...)/self.tile_size usage) can raise lz4.frame.LZ4FrameError or ValueError and currently propagate as 500; catch these specific exceptions around the decompress + frombuffer/reshape block and translate them into a client 400 response by raising FastAPI/Starlette HTTPException(status_code=400, detail="Malformed LZ4 payload or size mismatch") (add the import for HTTPException if needed). Ensure you catch lz4.frame.LZ4FrameError and ValueError (or numpy/struct shape errors) and raise the HTTPException rather than letting the exception bubble.
89-91:⚠️ Potential issue | 🟠 Major | ⚡ Quick winValidate
x-output-dtypeand reject unsafe/invalid dtypes (still unresolved).On Line 89, direct
np.dtype(...)on untrusted header input can raise and return 500; it also permits non-numeric dtypes (for example object/string kinds) that should not be serialized here.Proposed minimal fix
- output_dtype = np.dtype( - request.headers.get("x-output-dtype", "float32").lower() - ) + dtype_raw = request.headers.get("x-output-dtype", "float32").lower() + try: + output_dtype = np.dtype(dtype_raw) + except (TypeError, ValueError) as exc: + raise HTTPException( + status_code=400, + detail=f"Unsupported x-output-dtype: {dtype_raw!r}", + ) from exc + if output_dtype.kind not in {"f", "i", "u"}: + raise HTTPException( + status_code=400, + detail=f"Non-numeric x-output-dtype not supported: {dtype_raw!r}", + )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@models/prov_gigapath.py` around lines 89 - 91, The current code calls np.dtype on untrusted header input (request.headers.get("x-output-dtype")) which can raise and also allow unsafe non-numeric kinds; fix by validating and sanitizing the header: parse the header inside a try/except around np.dtype(...) to catch invalid strings, then ensure the resulting dtype.kind is one of the numeric kinds (e.g., 'i','u','f','c') or check against a small whitelist of allowed dtype names (e.g., "float32","float64","int32", etc.); if parsing fails or the kind is not numeric, reject the request with a 4xx error (do not allow object/string dtypes) and only set output_dtype when validation passes.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@models/prov_gigapath.py`:
- Around line 84-87: The decompression and reshape steps (the
lz4.frame.decompress call and np.frombuffer(...).reshape(...)/self.tile_size
usage) can raise lz4.frame.LZ4FrameError or ValueError and currently propagate
as 500; catch these specific exceptions around the decompress +
frombuffer/reshape block and translate them into a client 400 response by
raising FastAPI/Starlette HTTPException(status_code=400, detail="Malformed LZ4
payload or size mismatch") (add the import for HTTPException if needed). Ensure
you catch lz4.frame.LZ4FrameError and ValueError (or numpy/struct shape errors)
and raise the HTTPException rather than letting the exception bubble.
- Around line 89-91: The current code calls np.dtype on untrusted header input
(request.headers.get("x-output-dtype")) which can raise and also allow unsafe
non-numeric kinds; fix by validating and sanitizing the header: parse the header
inside a try/except around np.dtype(...) to catch invalid strings, then ensure
the resulting dtype.kind is one of the numeric kinds (e.g., 'i','u','f','c') or
check against a small whitelist of allowed dtype names (e.g.,
"float32","float64","int32", etc.); if parsing fails or the kind is not numeric,
reject the request with a 4xx error (do not allow object/string dtypes) and only
set output_dtype when validation passes.
Summary by CodeRabbit
New Features
Performance Improvements
Deployment