|
| 1 | +--- |
| 2 | +title: "Pod Readiness Gate" |
| 3 | +sidebar_position: 40 |
| 4 | +--- |
| 5 | + |
| 6 | +The AWS Load Balancer Controller supports Pod readiness gates, which indicate that a pod is registered with the ALB/NLB and healthy enough to receive traffic. The controller automatically injects the necessary readiness gate configuration into the pod spec via a mutating webhook during pod creation. |
| 7 | + |
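| 8 | +Without readiness gates, a new pod can become `Ready` before it passes the load balancer health check. During a rolling update, the old pods may then be terminated before any new pod can receive traffic, which results in an outage. |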
| 9 | + |
| 10 | +:::info |
| 11 | +This only works with `target-type: ip`. With `target-type: instance`, the node is used as the backend, so the ALB is not aware of individual pods or their readiness. |
| 12 | +::: |
| 13 | + |
| 14 | +The ui service is not currently using a readiness gate (the last column shows `<none>`). |
| 15 | +This information is only visible in the wide output: |
| 16 | + |
| 17 | +```bash |
| 18 | +$ kubectl -n ui get pods --output wide |
| 19 | +NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES |
| 20 | +ui-5989474687-swm27 1/1 Running 0 2m24s 10.42.181.33 ip-10-42-176-252.us-west-2.compute.internal <none> <none> |
| 21 | +``` |
| 22 | + |
| 23 | +We can observe the current behavior by restarting the deployment. |
| 24 | +You'll notice that the old pod is terminated as soon as the new pod is `Ready`. |
| 25 | +If you are quick, you can observe the health check status of the new pod in the ALB target group: |
| 26 | + |
| 27 | +```bash |
| 28 | +$ kubectl -n ui get pods --output wide |
| 29 | +NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES |
| 30 | +ui-6dbf768d69-vx2cz 1/1 Running 0 20s 10.42.142.144 ip-10-42-137-174.us-west-2.compute.internal <none> <none> |
| 31 | +$ kubectl -n ui rollout restart deployment ui |
| 32 | +deployment.apps/ui restarted |
| 33 | +$ kubectl -n ui get pods --output wide |
| 34 | +NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES |
| 35 | +ui-5d5c6b587d-x5pgz 1/1 Running 0 2s 10.42.181.37 ip-10-42-176-252.us-west-2.compute.internal <none> <none> |
| 36 | +ui-6dbf768d69-vx2cz 1/1 Terminating 0 30s 10.42.142.144 ip-10-42-137-174.us-west-2.compute.internal <none> <none> |
| 37 | +$ kubectl -n ui get pods --output wide |
| 38 | +NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES |
| 39 | +ui-5d5c6b587d-x5pgz 1/1 Running 0 6s 10.42.181.37 ip-10-42-176-252.us-west-2.compute.internal <none> <none> |
| 40 | +$ TG_ARN=$(aws elbv2 describe-target-groups --query "TargetGroups[?contains(TargetGroupName, 'k8s-ui-ui')].TargetGroupArn" --output text) |
| 41 | +$ aws elbv2 describe-target-health --target-group-arn "$TG_ARN" --query "TargetHealthDescriptions[].TargetHealth" |
| 42 | +``` |
| 43 | +
|
| 44 | +The output looks like this: |
| 45 | +
|
| 46 | +```json |
| 47 | +{ |
| 48 | + "State": "initial", |
| 49 | + "Reason": "Elb.RegistrationInProgress", |
| 50 | + "Description": "Target registration is in progress" |
| 51 | +} |
| 52 | +``` |
| 53 | +
|
| 54 | +After a few seconds: |
| 55 | +
|
| 56 | +```json |
| 57 | +{ |
| 58 | + "State": "healthy" |
| 59 | +} |
| 60 | +``` |
| 61 | +
|
| 62 | +During this delay, the ui application is unreachable and requests return 502 errors. |
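The manual check above can be scripted. The following is a minimal sketch, not part of the original page: `wait_for_healthy_targets` is a hypothetical helper name, and the attempt count and delay are arbitrary assumptions. It polls `aws elbv2 describe-target-health` until every target that is not draining reports `healthy`.

```bash
# Hypothetical helper: poll a target group until all non-draining targets are healthy.
# Arguments: target group ARN, max attempts (default 30), delay in seconds (default 2).
wait_for_healthy_targets() {
  local tg_arn="$1"
  local attempts="${2:-30}"
  local delay="${3:-2}"
  local i=0 states active
  while [ "$i" -lt "$attempts" ]; do
    # With --output text, the states are printed tab-separated on one line.
    states=$(aws elbv2 describe-target-health --target-group-arn "$tg_arn" \
      --query 'TargetHealthDescriptions[].TargetHealth.State' --output text)
    echo "target states: ${states:-none}"
    # Ignore targets that are being deregistered (old pods).
    active=$(printf '%s\n' "$states" | tr '\t' '\n' | grep -v '^draining$')
    # Succeed when at least one target remains and none is in another state.
    if [ -n "$active" ] && ! printf '%s\n' "$active" | grep -qv '^healthy$'; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_for_healthy_targets "$TG_ARN"` run right after a rollout restart shows how long the new target stays in `initial`.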
| 63 | +
|
| 64 | +In order to avoid this situation, the AWS Load Balancer controller can set the readiness condition on the pods that constitute your ingress or service backend. The condition status on a pod will be set to `True` only when the corresponding target in the ALB/NLB target group shows a health state of `Healthy`. This prevents the rolling update of a deployment from terminating old pods until the newly created pods are `Healthy` in the ALB/NLB target group and ready to take traffic. |
| 65 | +
|
| 66 | +For the readiness gate configuration to be injected into the pod spec, you need to apply the label `elbv2.k8s.aws/pod-readiness-gate-inject: enabled` to the pod namespace: |
| 67 | +
|
| 68 | +```bash |
| 69 | +$ kubectl label namespace ui elbv2.k8s.aws/pod-readiness-gate-inject=enabled |
| 70 | +namespace/ui labeled |
| 71 | +``` |
| 72 | +
|
| 73 | +The webhook only acts at pod creation, so we need to restart the deployment for new pods to get the readiness gate: |
| 74 | +
|
| 75 | +```bash |
| 76 | +$ kubectl -n ui rollout restart deployment ui |
| 77 | +``` |
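To confirm that the webhook injected the gate, you can read it directly from the pod spec. This is a small sketch, assuming a hypothetical helper name `list_readiness_gates`; it only uses the standard `kubectl get -o jsonpath` output and the `spec.readinessGates[].conditionType` field.

```bash
# Hypothetical helper: print each pod name and the readiness gate
# condition types found in its spec, separated by a tab.
# Pods created before the namespace was labeled show an empty gate list.
list_readiness_gates() {
  local namespace="$1"
  kubectl -n "$namespace" get pods \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.readinessGates[*].conditionType}{"\n"}{end}'
}
```

Calling `list_readiness_gates ui` after the restart should list a `target-health.elbv2.k8s.aws/...` condition type next to each new pod.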
| 78 | +
|
| 79 | +You can observe that the `Ready` condition stays `False` while the target health condition is `False`: |
| 80 | +```bash |
| 81 | +$ kubectl describe pod -n ui -l app.kubernetes.io/name=ui | grep -A 10 "Conditions:" |
| 82 | +Conditions: |
| 83 | + Type Status |
| 84 | + target-health.elbv2.k8s.aws/k8s-ui-ui-b21a807597 False |
| 85 | + PodReadyToStartContainers True |
| 86 | + Initialized True |
| 87 | + Ready False |
| 88 | + ContainersReady True |
| 89 | + PodScheduled True |
| 90 | +``` |
| 91 | +
|
| 92 | +Once the target passes its health check: |
| 93 | +
|
| 94 | +```bash |
| 95 | +$ kubectl describe pod -n ui -l app.kubernetes.io/name=ui | grep -A 10 "Conditions:" |
| 96 | +Conditions: |
| 97 | + Type Status |
| 98 | + target-health.elbv2.k8s.aws/k8s-ui-ui-b21a807597 True |
| 99 | + PodReadyToStartContainers True |
| 100 | + Initialized True |
| 101 | + Ready True |
| 102 | + ContainersReady True |
| 103 | + PodScheduled True |
| 104 | +``` |
| 105 | +
|
| 106 | +Now that the pod has a readiness gate enabled, restarting the deployment again shows that the old pod is not terminated until the new pod passes its readiness gate: |
| 107 | +
|
| 108 | +```bash |
| 109 | +$ kubectl -n ui rollout restart deployment ui |
| 110 | +deployment.apps/ui restarted |
| 111 | +$ kubectl -n ui get pods --output wide |
| 112 | +NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES |
| 113 | +ui-886bb64b8-cxmsx 1/1 Running 0 103s 10.42.158.114 ip-10-42-137-174.us-west-2.compute.internal <none> 1/1 |
| 114 | +[...] |
| 115 | +$ kubectl -n ui get pods --output wide |
| 116 | +NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES |
| 117 | +ui-886bb64b8-cxmsx 1/1 Running 0 112s 10.42.158.114 ip-10-42-137-174.us-west-2.compute.internal <none> 1/1 |
| 118 | +ui-6fd4c6cc49-f8tqm 1/1 Running 0 3s 10.42.181.33 ip-10-42-176-252.us-west-2.compute.internal <none> 0/1 |
| 119 | +[...] |
| 120 | +$ kubectl -n ui get pods --output wide |
| 121 | +NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES |
| 122 | +ui-6fd4c6cc49-f8tqm 1/1 Running 0 66s 10.42.181.33 ip-10-42-176-252.us-west-2.compute.internal <none> 1/1 |
| 123 | +``` |
| 124 | +
|
| 125 | +This keeps the ui reachable throughout the rollout. |
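One way to check reachability during a rollout is to probe the load balancer from a second terminal while the deployment restarts. The sketch below is an illustration only: `probe_endpoint` is a hypothetical helper, and the URL you pass must be your own ingress address, which is not given on this page.

```bash
# Hypothetical helper: send repeated requests and count non-200 responses.
# Arguments: URL, number of requests (default 60), seconds between requests (default 1).
probe_endpoint() {
  local url="$1"
  local count="${2:-60}"
  local interval="${3:-1}"
  local failures=0 i=0 code
  while [ "$i" -lt "$count" ]; do
    # -w '%{http_code}' prints only the status code; -m 2 caps each request at 2 seconds.
    code=$(curl -s -o /dev/null -m 2 -w '%{http_code}' "$url")
    if [ "$code" != "200" ]; then
      failures=$((failures + 1))
      echo "request $i returned $code"
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  echo "failures: $failures/$count"
}
```

Running it once before and once after enabling the readiness gate gives a rough comparison of how many requests fail during each restart.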