Health Checks

Updated 24/09/2025

Motivation

A health check reports whether your application is running and responding to requests properly. Monitoring systems and orchestrators use the result to decide whether an instance is healthy. By implementing a health check, you can:

  • Monitor application uptime: Ensure that the application is available to users. This matters most for applications that are critical to business operations, where downtime can result in lost revenue, reduced productivity, and lower user satisfaction.

  • Detect issues early: Pick up issues before they become critical. By regularly monitoring the status of your application, you can quickly identify and address problems as they arise, such as server errors or network connectivity issues.

  • Improve application performance: By monitoring response times and resource utilization, you can identify bottlenecks and optimize your application accordingly.

  • Enable automatic failover: When a health check determines that an instance of your application is unhealthy, a load balancer can automatically redirect traffic to a healthy instance, so the application remains available to users.

HTTP or TCP?

TCP and HTTP are both protocols used for communication over a network, but they serve different purposes. TCP provides a reliable, ordered, and error-checked delivery of data, while HTTP is a higher-level application protocol that is typically used for web browsing, file transfer, and other web-based applications.

For a background service running in a Kubernetes cluster, TCP is generally preferred over HTTP as the health check protocol, for a few reasons:

  • Simplicity: TCP health checks are simpler to implement and have lower overhead than HTTP health checks. With TCP, the checker only opens a connection to the target port and treats a completed handshake as success. With HTTP, the checker must send an HTTP request and wait for an HTTP response, which requires more processing and can be slower.

  • Speed: TCP health checks are generally faster than HTTP health checks because they require fewer network round trips. Since health checks are typically performed frequently, minimizing the time spent on health checks can help improve the overall performance and responsiveness of the system.

  • Accuracy: A TCP health check answers one narrow question, whether the service's port is reachable and accepting connections, so its result is unambiguous. Keep in mind that neither protocol proves full health on its own: an open port does not show that the application logic works, and a successful HTTP response may not mean that the service is fully operational or responsive.

That being said, there may be situations where using HTTP as the health check protocol is more appropriate. For example, if your service relies on specific HTTP endpoints for functionality, using HTTP health checks may provide more insight into the service’s health and readiness. Ultimately, the choice of health check protocol depends on the specific requirements and constraints of your system.
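The difference between the two protocols can be sketched in a few lines. This is a minimal illustration, not a production checker: the function names `tcp_check` and `http_check` are hypothetical, and Python is used only to keep the example short and self-contained. The 200 to 399 success range mirrors how Kubernetes interprets HTTP probe responses.

```python
import http.client
import socket
import urllib.request


def tcp_check(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection (the three-way handshake) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


def http_check(url: str, timeout: float = 1.0) -> bool:
    """Return True if a GET request to `url` yields a 2xx or 3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, http.client.HTTPException):
        # URLError and HTTPError (raised for 4xx/5xx) are subclasses of OSError.
        return False
```

A port that accepts connections but never answers passes `tcp_check` and fails `http_check`, which is the trade-off between the two approaches in practice.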

Probes

In .NET, both liveness and readiness checks are registered with .Services.AddHealthChecks().AddCheck<T>, which can be confusing because the two serve slightly different purposes. Probes can be HTTP, TCP or a command.
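One way to keep the two probe types apart when they share a single registration call is to tag each check and let every endpoint run only the checks carrying its tag. The sketch below is a language-neutral illustration in Python, not the .NET API: `HealthCheckRegistry`, `add_check`, `run`, and the tag names "live" and "ready" are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Tuple


class HealthCheckRegistry:
    """Hypothetical single registry shared by liveness and readiness checks."""

    def __init__(self) -> None:
        self._checks: List[Tuple[str, Callable[[], bool], frozenset]] = []

    def add_check(
        self, name: str, check: Callable[[], bool], tags: Iterable[str]
    ) -> "HealthCheckRegistry":
        self._checks.append((name, check, frozenset(tags)))
        return self  # returning self allows chained registration

    def run(self, tag: str) -> Dict[str, bool]:
        """Run only the checks registered with `tag`."""
        results: Dict[str, bool] = {}
        for name, check, tags in self._checks:
            if tag in tags:
                try:
                    results[name] = bool(check())
                except Exception:
                    results[name] = False  # a crashing check counts as failed
        return results


registry = (
    HealthCheckRegistry()
    .add_check("self", lambda: True, tags={"live", "ready"})
    .add_check("database", lambda: False, tags={"ready"})  # simulated outage
)
liveness = registry.run("live")    # {"self": True}
readiness = registry.run("ready")  # {"self": True, "database": False}
```

With this split, a database outage marks the instance as not ready without making it look dead.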

Liveness (Health)

🔴 /health for Liveness Probe
Purpose: “Is my application alive and not stuck?”

What it checks:

✅ Basic application responsiveness
✅ Main process is running and not deadlocked
✅ Core functionality works (very lightweight test)
✅ Application hasn’t corrupted itself

What it should NOT check:

❌ External dependencies (databases, APIs, etc.)
❌ Complex business logic
❌ Expensive operations

Example response:

{
  "status": "healthy",
  "timestamp": "2023-03-03T10:30:00Z",
  "uptime": "2h 15m"
}

Failure result: Kubernetes kills the failing container and restarts it according to the pod's restart policy.
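A liveness endpoint that produces a response of this shape could look like the sketch below. It is a minimal illustration using Python's standard library rather than the page's .NET stack; `liveness_payload`, `HealthHandler`, `make_server`, and port 8080 are assumptions made for the example. The handler deliberately touches no external dependency, so a slow database cannot trigger a restart.

```python
import json
from datetime import datetime, timezone
from http.server import BaseHTTPRequestHandler, HTTPServer

STARTED_AT = datetime.now(timezone.utc)  # recorded once at process start


def liveness_payload(started_at: datetime, now: datetime) -> dict:
    """Build the liveness body from two timestamps; no I/O, no dependencies."""
    minutes = int((now - started_at).total_seconds()) // 60
    return {
        "status": "healthy",
        "timestamp": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "uptime": f"{minutes // 60}h {minutes % 60}m",
    }


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path != "/health":
            self.send_error(404)
            return
        payload = liveness_payload(STARTED_AT, datetime.now(timezone.utc))
        body = json.dumps(payload).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format: str, *args) -> None:
        pass  # keep frequent probe requests out of the logs


def make_server(port: int = 8080) -> HTTPServer:
    """Create (but do not start) the health server on all interfaces."""
    return HTTPServer(("0.0.0.0", port), HealthHandler)
```

In a background service, `make_server().serve_forever()` would typically run in a daemon thread next to the main work loop.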

Readiness (Ready)

🟢 /ready for Readiness Probe
Purpose: “Is my application ready to serve traffic?”

What it checks:

✅ Application fully started up
✅ All required dependencies available (databases, external APIs)
✅ Configuration loaded
✅ Authentication systems accessible
✅ Caches warmed up
✅ Required resources available

Example response:

{
  "status": "ready",
  "dependencies": {
    "database": "connected",
    "external_api": "available",
    "cache": "loaded"
  },
  "timestamp": "2023-03-03T10:30:00Z"
}

Failure result: Kubernetes stops routing Service traffic to this pod but does not restart it. Traffic resumes once the probe succeeds again.
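The readiness logic can be sketched as a function that runs every dependency check and maps the combined result to an HTTP status. This is an illustration under stated assumptions: `readiness_response`, the "not_ready" status, and the "unavailable" label are hypothetical names chosen for the example, while the success labels are taken from the response shown above. Returning 503 on failure is a common convention; Kubernetes treats any status outside 200 to 399 as a failed probe.

```python
from datetime import datetime, timezone
from typing import Callable, Dict, Optional, Tuple

# Labels for passing checks, taken from the example response.
OK_LABELS = {
    "database": "connected",
    "external_api": "available",
    "cache": "loaded",
}


def readiness_response(
    checks: Dict[str, Callable[[], bool]],
    now: Optional[datetime] = None,
) -> Tuple[int, dict]:
    """Run all dependency checks and return (http_status, json_body)."""
    now = now or datetime.now(timezone.utc)
    dependencies: Dict[str, str] = {}
    all_ok = True
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            ok = False  # a check that raises counts as a failed dependency
        all_ok = all_ok and ok
        # "ok" and "unavailable" are hypothetical fallback labels.
        dependencies[name] = OK_LABELS.get(name, "ok") if ok else "unavailable"
    body = {
        "status": "ready" if all_ok else "not_ready",
        "dependencies": dependencies,
        "timestamp": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    return (200 if all_ok else 503), body
```

Each callable passed in would wrap a cheap real test, such as a trivial database query or a cache lookup, with its own short timeout so that the probe itself cannot hang.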

References