ci(runpod): auto-update template image after worker push #65
Merged
grossir merged 3 commits on May 18, 2026
Member
Author
Tested the new step via workflow_dispatch from this branch, then removed the personal secrets. Ready for the production RUNPOD_API_KEY / RUNPOD_TEMPLATE_ID to be added before merge.
Contributor
@quevon24 Both RUNPOD_API_KEY and RUNPOD_TEMPLATE_ID are missing from the repo secrets. Should we open a ticket on the infra repo? Or can you add the values? I have permissions to add secrets.
Summary
After the Build and Push RunPod Worker workflow uploads a new image to Docker Hub, it PATCHes the RunPod template's imageName to the SHA-pinned tag via rest.runpod.io/v1/templates/<id>. The next cold-start workers then pull the new image automatically, replacing the manual "edit image tag in the RunPod UI" step.
Setup before merging
RunPod
1. Console → Settings → API Keys → Create API Key with Restricted permission. Set GraphQL: Read/Write and leave AI API: None. Copy the key.
Why GraphQL R/W: RunPod exposes only two scopes on restricted keys: GraphQL (the management plane covering templates, endpoints, pods, secrets) and AI API (per-endpoint scope, only for invoking serverless jobs). Updating a template is a management operation, so it requires GraphQL R/W; the REST /v1/templates/<id> route used by the workflow is a thin shim over the GraphQL saveTemplate mutation internally. There is no per-template scope, so GraphQL R/W is the tightest setting that lets us do this. It does, however, grant full management of every template, endpoint, and pod on the account, so treat the key as a production credential: store it only in GitHub Actions secrets, never echo it to logs, and rotate it if it leaks.
2. Find the template ID. Endpoints created manually still have a backing template that is hidden from the default listing; surface it with the includeEndpointBoundTemplates flag. Then find the entry where isServerless: true and name matches your endpoint (e.g. Blackletter gpu worker). Its id is the template ID.
GitHub
3. Repo → Settings → Secrets and variables → Actions → New repository secret. Add both:
- RUNPOD_API_KEY from step 1
- RUNPOD_TEMPLATE_ID from step 2
Verifying after merge
Run the workflow manually (Actions → Build and Push RunPod Worker → Run workflow) and confirm the Update RunPod template image step logs HTTP 200. The template in the RunPod UI should then show the new freelawproject/blackletter-gpu-worker:<sha> tag, and the next job dispatched to the endpoint will cold-start on the new image.
Notes
- Min Workers = 0: warm workers drain on the 5-min idle timeout and the next job picks up the new image. If we raise Min Workers > 0 later, we'll need to add an explicit roll step to force existing workers to recycle.
- Workflow trigger: scanning/runpod/** (plus workflow_dispatch), unchanged from before.
Closes #55
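For reviewers who want to reason about the two API interactions without opening the workflow file, here is a rough Python sketch of the template lookup from step 2 and the image update described in the Summary. It is not the code the workflow runs. The function names are hypothetical, the Bearer authorization header is an assumption about how the REST API authenticates, and exact name matching is a simplification of "name matches your endpoint". Only the host, the /v1/templates/<id> path, the imageName field, and the isServerless / name / id keys come from this PR.

```python
import json
import urllib.request

# Host and path are taken from the PR text; the https scheme is assumed.
API_BASE = "https://rest.runpod.io/v1"


def find_template_id(templates, endpoint_name):
    """Return the id of the serverless template named endpoint_name, or None.

    `templates` is the already-decoded list of template entries. Exact name
    equality is an assumption made for this sketch.
    """
    for entry in templates:
        if entry.get("isServerless") and entry.get("name") == endpoint_name:
            return entry["id"]
    return None


def build_image_update(template_id, api_key, image_repo, sha):
    """Build (but do not send) the PATCH that pins imageName to a SHA tag."""
    body = json.dumps({"imageName": f"{image_repo}:{sha}"}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/templates/{template_id}",
        data=body,
        method="PATCH",
        headers={
            # Assumed auth scheme; never print or log the key itself.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def send(request):
    """Send the request and return the HTTP status (the PR expects 200)."""
    with urllib.request.urlopen(request) as response:
        return response.status
```

In CI, api_key and template_id would come from the RUNPOD_API_KEY and RUNPOD_TEMPLATE_ID secrets. Keeping request construction separate from send() lets the URL and payload be checked without touching the network or the production credential.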