Description
Checklist:
- I've included steps to reproduce the bug.
- I've included the version of argo rollouts.
Describe the bug
I tried updating an argument for my analysis template using the valueFrom/fieldRef approach. The Argo Rollouts controller was not able to find the field at the provided path, and the Rollout is stuck in an infinite Progressing state.
The controller has already retried close to 5000 times.
This looks like a bug. If the controller cannot find the field because the value is missing or the path is mistyped, it should retry only up to a configurable, finite number of times before rolling back and entering a Degraded state.
Also, the Rollout does not emit any status about this; the only way to detect it is through the controller logs.
If this happens in an actual deployment, there should also be a quick way to detect it, e.g. a status field that gets updated, which we can watch or get notifications on.
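To make the requested behavior concrete, here is a minimal sketch of a bounded retry that ends in a Degraded result carrying the error message. This is not the controller's actual code: `InvalidFieldPath`, `SyncResult`, `sync_with_retry_limit` and `max_retries` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class InvalidFieldPath(Exception):
    """Hypothetical error for a fieldRef path that cannot be resolved."""


@dataclass
class SyncResult:
    phase: str          # "Progressing" or "Degraded" in this sketch
    message: str = ""   # surfaced to the user instead of living only in logs
    retries: int = 0    # number of failed attempts


def sync_with_retry_limit(resolve: Callable[[], str], max_retries: int) -> SyncResult:
    """Call resolve() until it succeeds or max_retries failures accumulate."""
    last_error: Optional[Exception] = None
    for attempt in range(1, max_retries + 1):
        try:
            resolve()
            # Resolution worked, so the rollout can keep progressing.
            return SyncResult(phase="Progressing", retries=attempt - 1)
        except InvalidFieldPath as err:
            last_error = err
    # Retry budget exhausted: stop retrying and report why.
    message = str(last_error) if last_error else ""
    return SyncResult(phase="Degraded", message=message, retries=max_retries)
```

With a resolver that always fails, the result after `max_retries` attempts is a Degraded phase with the "invalid path" message attached, which is the kind of signal a status field or notification could be built on.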
Also, a separate question: does fieldRef only work for labels on the Rollout resource itself? Does it not work for the labels under the template section, which get added to the pod?
args:
- name: rollout-version
  valueFrom:
    fieldRef:
      fieldPath: spec.template.metadata.labels['version']
To Reproduce
Use the below Rollout config:
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: usage-events-ingester
  namespace: usage-events-ingester
  labels:
    app: usage-events-ingester
spec:
  replicas: 5
  strategy:
    canary:
      steps:
      - setWeight: 20
      - pause: {duration: 5}
      - analysis:
          templates:
          - templateName: example-template
          args:
          - name: rollout-version
            valueFrom:
              fieldRef:
                fieldPath: spec.template.metadata.labels['version']
      - setWeight: 100
  revisionHistoryLimit: 2
  selector:
    matchLabels:
      app: usage-events-ingester
  template:
    metadata:
      labels:
        app: usage-events-ingester
        version: "3"
    spec:
      containers:
      - name: rollouts-demo-analysis
        image: argoproj/rollouts-demo:blue
        ports:
        - name: http
          containerPort: 8080
          protocol: TCP
        resources:
          requests:
            memory: 32Mi
            cpu: 5m
Expected behavior
The Rollout should have rolled back after a configurable number of retries and entered a "Degraded" state.
Screenshots
Version
1.8.3
Logs
time="2025-10-22T07:28:22Z" level=info msg="rollout syncHandler queue retries: 3968 : key \"usage-events-ingester/usage-events-ingester\"" namespace=usage-events-ingester rollout=usage-events-ingester
time="2025-10-22T07:28:22Z" level=error msg="invalid path spec.template.metadata.labels['version'] in rollout" error="<nil>"
time="2025-10-22T07:28:32Z" level=info msg="Started syncing rollout" generation=18 namespace=usage-events-ingester resourceVersion=1119184795 rollout=usage-events-ingester
time="2025-10-22T07:28:32Z" level=info msg="No TrafficRouting Reconcilers found" namespace=usage-events-ingester rollout=usage-events-ingester
time="2025-10-22T07:28:32Z" level=info msg="Reconciling analysis step (stepIndex: 2)" namespace=usage-events-ingester rollout=usage-events-ingester
time="2025-10-22T07:28:32Z" level=error msg="roCtx.reconcile err invalid path spec.template.metadata.labels['version'] in rollout" generation=18 namespace=usage-events-ingester resourceVersion=1119184795 rollout=usage-events-ingester
time="2025-10-22T07:28:32Z" level=info msg="Reconciliation completed" generation=18 namespace=usage-events-ingester resourceVersion=1119184795 rollout=usage-events-ingester time_ms=3.248486
time="2025-10-22T07:28:32Z" level=error msg="rollout syncHandler error: invalid path spec.template.metadata.labels['version'] in rollout" namespace=usage-events-ingester rollout=usage-events-ingester
time="2025-10-22T07:28:32Z" level=info msg="rollout syncHandler queue retries: 3969 : key \"usage-events-ingester/usage-events-ingester\"" namespace=usage-events-ingester rollout=usage-events-ingester
time="2025-10-22T07:28:32Z" level=error msg="invalid path spec.template.metadata.labels['version'] in rollout" error="<nil>"
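Since the logs are currently the only signal, the following is a rough sketch of how the retry count could be pulled out of them. It assumes the controller log lines keep the exact "rollout syncHandler queue retries: N : key ..." format shown above; `RETRY_RE` and `max_retries_per_rollout` are hypothetical names, not part of Argo Rollouts.

```python
import re
from typing import Dict, Iterable

# Matches: queue retries: 3969 : key \"namespace/name\"
# The key is wrapped in backslash-escaped quotes inside the msg field.
RETRY_RE = re.compile(r'queue retries: (\d+) : key \\"([^"\\]+)\\"')


def max_retries_per_rollout(lines: Iterable[str]) -> Dict[str, int]:
    """Return the highest retry count seen for each namespace/name key."""
    counts: Dict[str, int] = {}
    for line in lines:
        match = RETRY_RE.search(line)
        if match is None:
            continue  # not a retry-counter line
        retries, key = int(match.group(1)), match.group(2)
        counts[key] = max(counts.get(key, 0), retries)
    return counts
```

Feeding the controller's log lines into `max_retries_per_rollout` and alerting when a count passes some threshold would be a stopgap, not a substitute for a proper status field.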
Message from the maintainers:
Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.