
Analyzing 5xx Errors During Kubernetes Rolling Deployment

3 min read · Jan 27, 2025

References:

https://cloud.google.com/blog/products/containers-kubernetes/kubernetes-best-practices-terminating-with-grace

https://github.com/kubernetes-sigs/aws-load-balancer-controller/issues/2106

https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.1/deploy/pod_readiness_gate/

Recently, while doing some testing, I observed 5xx errors during a Kubernetes rolling deployment. In short, when a pod is terminated and the application (and deployment) isn't configured to handle termination gracefully, requests that are still in flight can fail with 5xx errors.

Let’s set things up and run some tests to analyze the issue.

Step 1: Follow Steps 1–9 at https://aws.plainenglish.io/cloudfront-blue-green-deployment-using-gitlab-where-origin-is-alb-eks-se-8f2d95b14ffd to create an nginx deployment (you don’t need the blue/green deployment for this test).

Instead of index.html, create app.py with the following code:

from flask import Flask

app = Flask(__name__)

# Define a route for the home page
@app.route("/")
def home():
    return "Hello, World! Welcome to the Flask App!"

# Define a health check endpoint
@app.route("/health")
def health():
    return "OK", 200

if __name__ == "__main__":
    # Run the app (used for local runs; the container runs Gunicorn instead)
    app.run(host="0.0.0.0", port=80)

and a Dockerfile like this:

# Use the official Python image as the base image
FROM python:3.9-slim

# Set the working directory in the container
WORKDIR /app

# Copy the application code to the container
COPY app.py /app/

# Install dependencies
RUN pip install flask gunicorn

# Expose the port that Gunicorn will run on
EXPOSE 80

# Command to start Gunicorn with 2 workers and proper signal handling
CMD ["gunicorn", "-w", "2", "-b", "0.0.0.0:80", "--timeout", "30", "--graceful-timeout", "30", "--log-level", "debug", "app:app"]

Here we are using Gunicorn to run the application: on SIGTERM it lets in-flight requests finish (within the graceful timeout) before its workers exit, giving us a graceful shutdown.
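If you want to confirm the graceful shutdown before deploying, a quick local check with Docker works. This is only a sketch; the image tag and container name below are placeholders:

# Build the image and run it locally (tag and name are placeholders)
docker build -t mypyapp:test .
docker run -d --name mypyapp-test -p 8080:80 mypyapp:test

# Send SIGTERM, which is what Kubernetes sends on pod termination,
# then check the logs for Gunicorn's graceful shutdown messages
docker kill --signal=SIGTERM mypyapp-test
docker logs mypyapp-test   # look for "Handling signal: term" followed by worker exit messages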

Step 2: Edit the nginx deployment to add a preStop lifecycle hook and liveness/readiness probes. The preStop sleep keeps the container alive and serving traffic for a few extra seconds while the load balancer deregisters the target, so requests already routed to the pod aren’t dropped.

spec:
  containers:
  - image: vinycoolguy/mypyapp:v14
    imagePullPolicy: IfNotPresent
    lifecycle:
      preStop:
        exec:
          command:
          - /bin/sleep
          - "10"
    livenessProbe:
      failureThreshold: 2
      httpGet:
        path: /
        port: 80
        scheme: HTTP
      initialDelaySeconds: 5
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1
    name: nginx
    ports:
    - containerPort: 80
      protocol: TCP
    readinessProbe:
      failureThreshold: 2
      httpGet:
        path: /health
        port: 80
        scheme: HTTP
      initialDelaySeconds: 5
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
  dnsPolicy: ClusterFirst
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  terminationGracePeriodSeconds: 30
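I edited the deployment in place, but if you keep the manifest in a file you can apply and verify it like this (the file name deployment.yaml is just an example; nginx-nginx-chart is the deployment name used for the rollout later in the post):

# Apply the updated manifest (or edit the live object with kubectl edit)
kubectl apply -f deployment.yaml

# Confirm the preStop hook and probes are present in the pod template
kubectl get deployment nginx-nginx-chart -o yaml | grep -E -A 5 "preStop|livenessProbe|readinessProbe"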

Next, change the Target Group deregistration delay to 30 seconds (the same value as the Pod’s terminationGracePeriodSeconds).

Also configure the Target Group health check settings so that the readiness probe’s failureThreshold/periodSeconds and the Target Group’s unhealthy threshold/interval are the same.
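If the Target Group is managed by the AWS Load Balancer Controller (as in the referenced setup), the same settings can be driven from Ingress annotations instead of the console. A sketch, assuming the Ingress is named nginx and that the health check should mirror the readiness probe:

# Match the deregistration delay to terminationGracePeriodSeconds (30s)
kubectl annotate ingress nginx --overwrite \
  alb.ingress.kubernetes.io/target-group-attributes=deregistration_delay.timeout_seconds=30

# Align the Target Group health check with the readiness probe (path /health, 10s interval, threshold 2)
kubectl annotate ingress nginx --overwrite \
  alb.ingress.kubernetes.io/healthcheck-path=/health \
  alb.ingress.kubernetes.io/healthcheck-interval-seconds=10 \
  alb.ingress.kubernetes.io/unhealthy-threshold-count=2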


Now do a rollout restart or a fresh deployment and monitor the status. I did a couple of deployments and didn’t see any 5xx errors.
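A simple way to watch for 5xx responses while the rollout is happening is to poll the ALB in a loop and print each status code (replace <ALB-DNS> with your ALB’s DNS name):

# Print one HTTP status code per request during the rollout
while true; do
  curl -s -o /dev/null -w "%{http_code}\n" "http://<ALB-DNS>/"
  sleep 0.5
done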


You may also use k6 for load testing. To do this, install k6:

dnf install https://dl.k6.io/rpm/repo.rpm
dnf install k6

and create a file named script.js

import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  vus: 10,          // number of virtual users
  duration: '180s', // total test duration
};

export default function () {
  const res = http.get('<ALB-DNS>');

  // basic check to validate response
  check(res, {
    'status is 200': (r) => r.status === 200,
  });
}

and then perform a rolling restart and run the script:

k6 run script.js

I tried kubectl rollout restart deployment/nginx-nginx-chart a few times while the k6 load test was running and didn’t encounter any 5xx errors.
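As a cross-check from the AWS side, the ALB’s CloudWatch metrics should also show no target 5xx responses over the test window. A sketch with the AWS CLI; the LoadBalancer dimension value is a placeholder you’d replace with your ALB’s app/<name>/<id> identifier:

# Sum of HTTP 5xx responses returned by the targets during the last 15 minutes
aws cloudwatch get-metric-statistics \
  --namespace AWS/ApplicationELB \
  --metric-name HTTPCode_Target_5XX_Count \
  --dimensions Name=LoadBalancer,Value=app/<alb-name>/<alb-id> \
  --start-time "$(date -u -d '15 minutes ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  --period 60 \
  --statistics Sum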


Written by Vinayak Pandey

Experienced Cloud Engineer with a knack for automation. LinkedIn profile: https://www.linkedin.com/in/vinayakpandeyit/
