Bug: Operator incompatible with Kubernetes 1.21 due to StatefulSet availableReplicas field
Description
The RisingWave Operator claims to support Kubernetes 1.21+ in the compatibility matrix, but it fails on Kubernetes 1.21 due to the use of the availableReplicas field in StatefulSet status checks.
This causes the operator to never set the RisingWave cluster running status to true on Kubernetes 1.21, even when all pods are healthy and ready.
Environment
- Kubernetes Version: v1.21.13
- Operator Version: v0.13.0 (also affects latest)
- RisingWave Version: nightly-20251028
Reproduction Steps
- Deploy RisingWave Operator on a Kubernetes 1.21 cluster
- Create a RisingWave cluster
- Verify that all pods are running and ready:
kubectl get pods -n <namespace> # All pods show 1/1 Running
- Check the RisingWave cluster status:
kubectl get risingwave -n <namespace> # RUNNING shows "False" forever
Root Cause
The status.availableReplicas field was added to StatefulSet in Kubernetes 1.22 as an alpha feature (KEP-2599). On Kubernetes 1.21 the API server does not return this field, so in the operator it decodes to the Go zero value, 0.
The operator's readiness check in pkg/utils/apps.go uses this field:
// Line 76-78
if statefulSet.Status.AvailableReplicas < statefulSet.Status.UpdatedReplicas {
	return false
}
On Kubernetes 1.21:
- availableReplicas = 0 (field doesn't exist, defaults to zero value)
- updatedReplicas = 1 (actual replica count)
- Check evaluates to: 0 < 1 → returns false
- Operator logs show: Found not-ready groups, keep waiting... action=WaitBeforeMetaStatefulSetsReady
This causes the operator to perpetually wait for StatefulSets to become ready, never progressing to set the Running condition to true.
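The zero-value behavior can be reproduced without a cluster. The following is a minimal sketch, assuming a pared-down hypothetical struct `statefulSetStatus` that copies only the JSON field names from the StatefulSet status. The helpers `decodeStatus` and `rolledOut` are illustrative names, not operator code, and `rolledOut` mirrors only the single comparison quoted above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// statefulSetStatus is a hypothetical, pared-down stand-in for the
// StatefulSet status type; only the fields relevant to the check are kept.
type statefulSetStatus struct {
	Replicas          int32 `json:"replicas"`
	ReadyReplicas     int32 `json:"readyReplicas"`
	UpdatedReplicas   int32 `json:"updatedReplicas"`
	AvailableReplicas int32 `json:"availableReplicas"`
}

// decodeStatus parses a status payload. Keys that are absent from the
// payload leave the corresponding struct field at Go's zero value.
func decodeStatus(raw string) (statefulSetStatus, error) {
	var s statefulSetStatus
	err := json.Unmarshal([]byte(raw), &s)
	return s, err
}

// rolledOut mirrors only the comparison quoted from pkg/utils/apps.go.
func rolledOut(s statefulSetStatus) bool {
	return s.AvailableReplicas >= s.UpdatedReplicas
}

func main() {
	// Status shaped like the K8s 1.21 evidence: no availableReplicas key.
	raw := `{"replicas":1,"readyReplicas":1,"currentReplicas":1,"updatedReplicas":1}`
	s, err := decodeStatus(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(s.AvailableReplicas, rolledOut(s))
}
```

Running this prints `0 false`: the missing key is indistinguishable from an explicit zero once decoded, which is why the check can never pass on 1.21.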
Evidence
1. StatefulSet Status (K8s 1.21)
{
  "status": {
    "replicas": 1,
    "readyReplicas": 1,
    "currentReplicas": 1,
    "updatedReplicas": 1
    // availableReplicas field is missing
  }
}
2. Operator Logs
INFO Found not-ready groups, keep waiting... action=WaitBeforeMetaStatefulSetsReady component=meta group=""
INFO Found not-ready groups, keep waiting... action=WaitBeforeComputeStatefulSetsReady component=compute group=""
3. RisingWave Status
status:
  conditions:
  - type: Initializing
    status: "True"
  - type: Running
    status: "False" # Never becomes True
  componentReplicas:
    meta:
      running: 1
      target: 1
    compute:
      running: 1
      target: 1
Impact
- Users on Kubernetes 1.21 cannot use the operator despite documentation claiming support
- Clusters appear unhealthy even when fully functional
- Breaks automated workflows that depend on the Running status
Expected Behavior
The operator should correctly detect StatefulSet readiness on Kubernetes 1.21 as documented in the compatibility matrix.
Proposed Solution
Fix the code to support K8s 1.21 by falling back to readyReplicas (available since K8s 1.9) when availableReplicas is not available:
// Use availableReplicas if available (K8s 1.22+), otherwise fall back to readyReplicas
readyCount := statefulSet.Status.AvailableReplicas
if readyCount == 0 && statefulSet.Status.ReadyReplicas > 0 {
	// Fall back to readyReplicas for K8s 1.21
	readyCount = statefulSet.Status.ReadyReplicas
}
if readyCount < statefulSet.Status.UpdatedReplicas {
	return false
}
This ensures backward compatibility with K8s 1.21, while K8s 1.22+ clusters keep using availableReplicas whenever it is non-zero.
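To make the fallback easier to reason about, here is a self-contained sketch of the same logic run against a few status combinations. The `status` struct and the helpers `effectiveAvailable` and `replicasAvailable` are hypothetical names introduced for illustration; they are not part of the operator.

```go
package main

import "fmt"

// status is a hypothetical stand-in for the StatefulSet status fields
// that the proposed check reads.
type status struct {
	ReadyReplicas     int32
	UpdatedReplicas   int32
	AvailableReplicas int32
}

// effectiveAvailable returns availableReplicas when it is non-zero and
// otherwise falls back to readyReplicas, as in the proposed fix.
func effectiveAvailable(s status) int32 {
	if s.AvailableReplicas == 0 && s.ReadyReplicas > 0 {
		return s.ReadyReplicas
	}
	return s.AvailableReplicas
}

// replicasAvailable reports whether every updated replica counts as available.
func replicasAvailable(s status) bool {
	return effectiveAvailable(s) >= s.UpdatedReplicas
}

func main() {
	cases := []struct {
		name string
		s    status
	}{
		{"1.21, all pods ready", status{ReadyReplicas: 1, UpdatedReplicas: 1}},
		{"1.22+, all pods available", status{ReadyReplicas: 1, UpdatedReplicas: 1, AvailableReplicas: 1}},
		{"1.22+, partially available", status{ReadyReplicas: 3, UpdatedReplicas: 3, AvailableReplicas: 2}},
		{"no pods ready", status{UpdatedReplicas: 1}},
	}
	for _, c := range cases {
		fmt.Printf("%-28s -> %v\n", c.name, replicasAvailable(c.s))
	}
}
```

One caveat with this heuristic: a zero availableReplicas is ambiguous. On K8s 1.22+ with minReadySeconds set, pods that are ready but not yet available would presumably produce the same status as the first case and be treated as available, so the fallback is slightly more permissive than the original check in that window.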
Additional Context
The same issue affects similar checks for:
- IsStatefulSetRolledOut() - line 76
- IsAdvancedStatefulSetRolledOut() - line 146 (for OpenKruise)
Note: availableReplicas has long been part of Deployment status (it is present in apps/v1, which went GA in K8s 1.9), so only the StatefulSet checks are affected.
Related Files
- pkg/utils/apps.go - Contains the problematic readiness checks
- README.md - Documents K8s 1.21+ compatibility
Timeline
- K8s 1.21: No availableReplicas field in StatefulSet
- K8s 1.22: availableReplicas added as alpha (KEP-2599, turned on by default)
- K8s 1.23: Beta
- K8s 1.25: GA/Stable
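As an alternative to inferring support from a zero value, the timeline could in principle be turned into an explicit version gate. The sketch below is only an illustration of that idea and is not what the issue proposes: `hasStatefulSetAvailableReplicas` is a hypothetical helper, and it assumes the major and minor version strings have already been obtained from the API server (some distributions report minors such as "21+", which the helper tolerates).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// hasStatefulSetAvailableReplicas reports whether a server with the given
// major/minor version strings is expected to populate
// status.availableReplicas on StatefulSets (1.22 and later, per the timeline).
// Hypothetical helper for illustration only.
func hasStatefulSetAvailableReplicas(major, minor string) (bool, error) {
	majorNum, err := strconv.Atoi(major)
	if err != nil {
		return false, fmt.Errorf("parse major version %q: %w", major, err)
	}
	// Strip a trailing "+" that some distributions append to the minor version.
	minorNum, err := strconv.Atoi(strings.TrimRight(minor, "+"))
	if err != nil {
		return false, fmt.Errorf("parse minor version %q: %w", minor, err)
	}
	return majorNum > 1 || (majorNum == 1 && minorNum >= 22), nil
}

func main() {
	for _, minor := range []string{"21", "21+", "22", "25"} {
		ok, err := hasStatefulSetAvailableReplicas("1", minor)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("1.%s supports availableReplicas: %v\n", minor, ok)
	}
}
```

A version gate avoids the ambiguity of a zero availableReplicas, at the cost of an extra lookup of the server version and of hard-coding the 1.22 boundary.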
I'm happy to submit a PR to fix this issue.