Golang on alikhil

Why Graceful Shutdown Matters in Kubernetes

Sun, 01 Jun 2025 20:31:06 +0300

Have you ever deployed a new version of your app in Kubernetes and noticed errors briefly spiking during rollout? Many teams do not even realize this is happening, especially if they are not closely monitoring their error rates during deployments.

There is a common misconception in the Kubernetes world that bothers me. The official Kubernetes documentation and most guides claim that “if you want zero downtime upgrades, just use rolling update mode on deployments”. I have learned the hard way that this is simply not true - rolling updates alone are NOT enough for true zero-downtime deployments.

And it is not just about deployments. Your pods can be terminated for many other reasons: scaling events, node maintenance, preemption, resource constraints, and more. Without proper graceful shutdown handling, any of these events can lead to dropped requests and frustrated users.

In this post, I will share what I have learned about implementing proper graceful shutdown in Kubernetes. I will show you exactly what happens behind the scenes, provide working code examples, and back everything with real test results that clearly demonstrate the difference.

The Problem: Hidden Errors During Pod Termination

If you are running services on Kubernetes, you have probably noticed that even with rolling updates (where Kubernetes gradually replaces pods), you might still see errors during deployment. This is especially annoying when you are trying to maintain “zero-downtime” systems.

When Kubernetes needs to terminate a pod (for any reason), it follows this sequence:

Sends a SIGTERM signal to your container
Waits for a grace period (30 seconds by default)
If the container does not exit after the grace period, it gets brutal and sends a SIGKILL signal

The problem? Most applications do not properly handle that SIGTERM signal. They just die immediately, dropping any in-flight requests. In the real world, while most API requests complete in 100-300ms, there are often those long-running operations that take 5-15 seconds or more. Think about processing uploads, generating reports, or running complex database queries. When these longer operations get cut off, that’s when users really feel the pain.

When Does Kubernetes Terminate Pods?

Rolling updates are just one scenario where your pods might be terminated. Here are other common situations that can lead to pod terminations:

Horizontal Pod Autoscaler Events: When HPA scales down during low-traffic periods, some pods get terminated.
Resource Pressure: If your nodes are under resource pressure, the kubelet might evict certain pods.
Node Maintenance: During cluster upgrades, node draining causes many pods to be evicted.
Spot/Preemptible Instances: If you are using cost-saving node types like spot instances, these can be reclaimed with minimal notice.

All these scenarios follow the same termination process, so implementing proper graceful shutdown handling protects you from errors in all of these cases - not just during upgrades.

Let’s Test It: Basic vs. Graceful Service

Instead of just talking about theory, I built a small lab to demonstrate the difference between proper and improper shutdown handling. I created two nearly identical Go services:

Basic Service: A standard HTTP server with no special shutdown handling
Graceful Service: The same service but with proper SIGTERM handling

Both services:

Process requests that take about 4 seconds to complete (intentionally configured for easier demonstration)
Run in the same Kubernetes cluster with identical configurations
Serve the same endpoints

I specifically chose a 4-second processing time to make the problem obvious. While this might seem long compared to typical 100-300ms API calls, it perfectly simulates those problematic long-running operations that occur in real-world applications. The only difference between the services is how they respond to termination signals.

To test them, I wrote a simple k6 script that hammers both services with requests while a rolling restart of each service’s deployment is triggered. Here is what happened:

Basic Service: The Error-Prone One

checks_total.......................: 695    11.450339/s
checks_succeeded...................: 97.98% 681 out of 695
checks_failed......................: 2.01%  14 out of 695

✗ status is 200
  ↳  97% — ✓ 681 / ✗ 14

http_req_failed....................: 2.01%  14 out of 696

Graceful Service: The Reliable One

checks_total.......................: 750     11.724824/s
checks_succeeded...................: 100.00% 750 out of 750
checks_failed......................: 0.00%   0 out of 750

✓ status is 200

http_req_failed....................: 0.00%  0 out of 751

The results speak for themselves. The basic service dropped 14 requests during the update (that is 2% of all traffic), while the graceful service handled everything perfectly without a single error.

You might think “2% is not that bad”, but if you are doing several deployments per day and have thousands of users, that adds up to a lot of errors. Plus, in my experience, these errors tend to happen at the worst possible times.

So How Do We Fix It? The Graceful Shutdown Recipe

After digging into this problem and testing different solutions, I have put together a simple recipe for proper graceful shutdown. While my examples are in Go, the fundamental principles apply to any language or framework you are using.

Here are the key ingredients:

1. Listen for SIGTERM Signals

First, your app needs to catch that SIGTERM signal instead of letting it terminate the process immediately:

// Set up channel for shutdown signals
stop := make(chan os.Signal, 1)
signal.Notify(stop, os.Interrupt, syscall.SIGTERM)

// Block until we receive a shutdown signal
<-stop
log.Println("Shutdown signal received")

This part is easy - you are just telling your app to wake up when Kubernetes asks it to shut down.

2. Track Your In-Flight Requests

You need to know when it is safe to shut down, so keep track of ongoing requests:

// Create a request counter
var inFlightRequests atomic.Int64

http.HandleFunc("/process", func(w http.ResponseWriter, r *http.Request) {
    // Increment counter when request starts
    inFlightRequests.Add(1)
    // do not forget to decrement when done!
    defer inFlightRequests.Add(-1)

    // Your normal request handling...
    time.Sleep(4 * time.Second)  // Simulating long-running work
})

This counter lets you check if there are still requests being processed before shutting down. It is especially important for those long-running operations that users have already waited several seconds for - the last thing they want is to see an error right before completion!

3. Separate Your Health Checks

Here is a commonly overlooked trick - you need different health check endpoints for liveness and readiness:

// Track shutdown state
var isShuttingDown atomic.Bool

// Readiness probe - returns 503 when shutting down
http.HandleFunc("/ready", func(w http.ResponseWriter, r *http.Request) {
    if isShuttingDown.Load() {
        w.WriteHeader(http.StatusServiceUnavailable)
        fmt.Fprintf(w, "Shutting down, not ready")
        return
    }

    w.WriteHeader(http.StatusOK)
    fmt.Fprintf(w, "Ready for traffic")
})

// Liveness probe - always returns 200 (we are still alive!)
http.HandleFunc("/alive", func(w http.ResponseWriter, r *http.Request) {
    w.WriteHeader(http.StatusOK)
    fmt.Fprintf(w, "I'm alive")
})

This separation is crucial. The readiness probe tells Kubernetes to stop sending new traffic, while the liveness probe says “do not kill me yet, I’m still working!”

4. The Shutdown Dance

Now for the most important part - the shutdown sequence:

// Step 1: Mark service as shutting down
isShuttingDown.Store(true)

// Step 2: Let Kubernetes notice the readiness probe failing
time.Sleep(5 * time.Second)

// Step 3: Wait for in-flight requests to finish
for inFlightRequests.Load() > 0 {
    time.Sleep(1 * time.Second)
}

// Step 4: Finally, shut down the server gracefully
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()

if err := server.Shutdown(ctx); err != nil {
    log.Fatalf("Forced shutdown: %v", err)
}

I’ve found this sequence to be optimal. First, we mark ourselves as “not ready” but keep running. We pause to give Kubernetes time to notice and update its routing. Then we patiently wait until all in-flight requests finish before actually shutting down the server.

5. Configure Kubernetes Correctly

Do not forget to adjust your Kubernetes configuration:

# Use different probes for liveness and readiness
livenessProbe:
  httpGet:
    path: /alive  # Always returns OK
    port: 8080
  periodSeconds: 10

readinessProbe:
  httpGet:
    path: /ready  # Returns 503 during shutdown
    port: 8080
  periodSeconds: 3
  failureThreshold: 2  # Stops routing traffic after 2 failed checks (6 seconds)

# Give pods enough time to shut down gracefully
terminationGracePeriodSeconds: 30

This tells Kubernetes to wait up to 30 seconds for your app to finish processing requests before forcefully terminating it.

TL;DR: Quick Tips

If you are in a hurry, here are the key takeaways:

Catch SIGTERM Signals: Do not let your app be surprised when Kubernetes wants it to shut down.
Track In-Flight Requests: Know when it is safe to exit by counting active requests.
Split Your Health Checks: Use separate endpoints for liveness (am I running?) and readiness (can I take traffic?).
Fail Readiness First: As soon as shutdown begins, start returning “not ready” on your readiness endpoint.
Wait for Requests: Do not just shut down - wait for all active requests to complete first.
Use Built-In Shutdown: Most modern web frameworks have graceful shutdown options; use them!
Configure Termination Grace Period: Give your pods enough time to complete the shutdown sequence.
Test Under Load: You will not catch these issues in simple tests - you need realistic traffic patterns.

Wrap Up: Is It Worth the Extra Code?

You might be wondering if adding all this extra code is really worth it. After all, we’re only talking about a 2% error rate during pod termination events.

From my experience working with high-traffic services, I would say absolutely yes - for three reasons:

User Experience: Even small error rates look bad to users. Nobody wants to see “Something went wrong” messages, especially after waiting 10+ seconds for a long-running operation to complete.
Cascading Failures: Those errors can cascade through your system, especially if services depend on each other. Long-running requests often touch multiple critical systems.
Deployment Confidence: With proper graceful shutdown, you can deploy more frequently without worrying about causing problems.

The good news is that once you have implemented this pattern, it is easy to reuse across your services. You can even create a small library or template for your organization.

In production environments where I have implemented these patterns, we have gone from seeing a spike of errors with every deployment to deploying multiple times per day with zero impact on users. That is a win in my book!

Unintended Side Effects of Using http.DefaultClient in Go

Wed, 19 Mar 2025 19:20:45 +0300

The Internet is full of articles explaining why you should not use http.DefaultClient in Golang (one, two), but they focus on the Timeout and MaxIdleConns settings.

Today I want to share with you another reason why you should avoid using http.DefaultClient in your code.

The Story

As an SRE at Criteo, I both read and write code. Last week, I worked on patching Updatecli — an upgrade automation tool written in Go.

The patch itself was just ~15 lines of code. But then I spent three days debugging a strange authorization bug in an unrelated part of the code.

It happened because of code like this:

client := http.DefaultClient
client.Transport = &transport.PrivateToken{
    Token: s.Token,
    Base:  client.Transport,
}

Since http.DefaultClient is a pointer, not a value:

var DefaultClient = &Client{}

The code above is effectively the same as:

http.DefaultClient.Transport = &transport.PrivateToken{
    Token: s.Token,
    Base:  http.DefaultClient.Transport,
}

Later, in a third-party library, I found this:

if opts.Client == nil {
    opts.Client = http.DefaultClient
}
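To make the effect visible, here is a small self-contained reproduction. It is only a sketch: tokenTransport is a hypothetical stand-in for transport.PrivateToken, the Private-Token header name and the "secret" value are illustrative, and a local httptest server plays the role of the service that the third-party library talks to.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// tokenTransport is a hypothetical stand-in for transport.PrivateToken:
// it adds a token header to every outgoing request.
type tokenTransport struct {
	token string
	base  http.RoundTripper
}

func (t *tokenTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	clone := req.Clone(req.Context()) // a RoundTripper must not modify the caller's request
	clone.Header.Set("Private-Token", t.token)
	base := t.base
	if base == nil {
		base = http.DefaultTransport
	}
	return base.RoundTrip(clone)
}

// leakedToken patches an alias of http.DefaultClient, then makes an
// "unrelated" request with http.Get and returns the token the server saw.
func leakedToken() string {
	// The server echoes back whatever token header it received.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, r.Header.Get("Private-Token"))
	}))
	defer srv.Close()

	original := http.DefaultClient.Transport
	defer func() { http.DefaultClient.Transport = original }() // undo the global mutation

	client := http.DefaultClient // copies the pointer, not the struct
	client.Transport = &tokenTransport{token: "secret", base: client.Transport}

	resp, err := http.Get(srv.URL) // http.Get uses http.DefaultClient internally
	if err != nil {
		return "request failed: " + err.Error()
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	return string(body)
}

func main() {
	fmt.Println(leakedToken()) // secret
}
```

The call to http.Get never mentions the token, yet the server receives it, because both names point at the same client.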

As a result, the patched client with the authorization transport got injected into the third-party library, causing unexpected failures.

The Fix

To prevent this, I had to change the code to:

client := &http.Client{}
client.Transport = &transport.PrivateToken{
    Token: s.Token,
    Base:  client.Transport,
}

Bugs like this are hard to catch just by reading the code, since they involve global state mutation. But could they be detected by linters?

What do you think? How do you find or prevent such issues in your projects?

Go Quickstart

Fri, 06 Apr 2018 16:22:35 +0300

Hi folks! It has been a long time since I published my last post, but now I am back with a short quickstart guide to Go.

In this tutorial, we will configure a Go environment in VS Code and write our first program in Go.

Install Go

The first thing you need to do is install Go on your computer. To do so, download the installer for your operating system from here and then run it.

Configure GOPATH

By convention, Go developers keep all their code in a single place called the workspace. Go also puts dependency packages there. So, for Go to work correctly, we need to set the GOPATH variable to the path of the workspace.

MacOS and Linux

Set the GOPATH environment variable to your workspace path:

export GOPATH=$HOME/go

Also, we need to add GOPATH/bin to PATH in order to run compiled Go programs:

export PATH=$PATH:$GOPATH/bin

Configure VS Code

Install official Go extension.

Install delve debugger:

go get -u github.com/derekparker/delve/cmd/dlv

I recommend adding the following lines to your VS Code user settings:

settings.json

{
    "go.autocompleteUnimportedPackages": true,
    "go.formatTool": "gofmt"
}

Windows

Create GOPATH envar:

set GOPATH=c:\Users\%USERNAME%\go

Also, we need to add GOPATH\bin to PATH in order to run compiled Go programs:

set PATH=%PATH%;%GOPATH%\bin

Create project

Move to your GOPATH/src directory. Create a directory for your project:

cd $GOPATH/src
mkdir -p github.com/alikhil/hello-world-with-go

Open it using vscode:

code github.com/alikhil/hello-world-with-go

Hello World

Let’s create a file named program.go and put the following code there:

program.go

package main

import "fmt"

func main() {
    fmt.Println("¡Hola, mundo!")
}

Run the program

Finally, run the program by pressing F5 in VS Code. You should see the message printed in the Debug Console.

That’s all! Congratulations, you have just written your first program in Go!

Troubleshooting

If your program fails to run and you see a message like “Cannot find a path to go”, try adding the directory where the go binary is stored to your PATH environment variable.

For example, on macOS I added the following line to my ~/.bash_profile:

export PATH=/usr/local/go/bin:$PATH