healthcheck

package module

v0.9.0 (latest) Published: Feb 15, 2026 License: MIT Imports: 13 Imported by: 0

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version

Repository

github.com/kazhuravlev/healthcheck

README ¶

Healthcheck for Go Applications

A production-ready health check library for Go applications that enables proper monitoring and graceful degradation in modern cloud environments, especially Kubernetes.

Why Health Checks Matter

Health checks are critical for building resilient, self-healing applications in distributed systems. They provide:

Automatic Recovery: In Kubernetes, failed health checks trigger automatic pod restarts, ensuring your application recovers from transient failures without manual intervention.
Load Balancer Integration: Health checks prevent traffic from being routed to unhealthy instances, maintaining service quality even during partial outages.
Graceful Degradation: By monitoring dependencies (databases, caches, external APIs), your application can degrade gracefully when non-critical services fail.
Operational Visibility: Health endpoints provide instant insight into system state, making debugging and incident response faster.
Zero-Downtime Deployments: Readiness checks ensure new deployments only receive traffic when fully initialized.

Features

Multiple Check Types: Basic (sync), Manual, and Background (async) checks for different use cases
Kubernetes Native: Built-in /live and /ready endpoints following k8s conventions
JSON Status Reports: Detailed health status with history for debugging
Metrics Integration: Callbacks for Prometheus or other monitoring systems
Thread-Safe: Concurrent-safe operations with proper synchronization
Graceful Shutdown: Proper cleanup of background checks and shutdown signaling
Check History: Last 5 states stored for each check for debugging

Installation

go get -u github.com/kazhuravlev/healthcheck

Quick Start

package main

import (
	"context"
	"errors"
	"math/rand"
	"time"

	"github.com/kazhuravlev/healthcheck"
)

func main() {
	ctx := context.TODO()

	// 1. Create healthcheck instance
	hc, _ := healthcheck.New()

	// 2. Register a simple check
	hc.Register(ctx, healthcheck.NewBasic("redis", time.Second, func(ctx context.Context) error {
		if rand.Float64() > 0.5 {
			return errors.New("service is not available")
		}
		return nil
	}))

	// 3. Start HTTP server
	server, _ := healthcheck.NewServer(hc, healthcheck.WithPort(8080))
	_ = server.Run(ctx)

	// 4. Check health at http://localhost:8080/ready
	select {}
}

Types of Health Checks

1. Basic Checks (Synchronous)

Basic checks run on-demand when the /ready endpoint is called. Use these for:

Fast operations (< 1 second)
Checks that need fresh data
Low-cost operations

// Database connectivity check
dbCheck := healthcheck.NewBasic("postgres", time.Second, func(ctx context.Context) error {
  return db.PingContext(ctx)
})

2. Background Checks (Asynchronous)

Background checks run periodically in a separate goroutine. Use these for:

Expensive operations (API calls, complex queries)
Checks with rate limits (when the check must run less often than k8s requests /ready)
Operations that can use slightly stale data

// External API health check - runs every 30 seconds
apiCheck := healthcheck.NewBackground(
  "payment-api",
  nil, // initial error state
  5*time.Second, // initial delay
  30*time.Second, // check interval
  5*time.Second,  // timeout per check
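  func(ctx context.Context) error {
    // Bind the request to ctx so the per-check timeout is enforced
    req, err := http.NewRequestWithContext(ctx, http.MethodGet, "https://api.payment.com/health", nil)
    if err != nil {
      return err
    }
    resp, err := client.Do(req)
    if err != nil {
      return err
    }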
    defer resp.Body.Close()
    if resp.StatusCode != 200 {
      return errors.New("unhealthy")
    }
    return nil
  },
)

3. Manual Checks

Manual checks are controlled by your application logic. Use these for:

Initialization states (cache warming, data loading)
Circuit breaker patterns
Feature flags

// Cache warming check
cacheCheck := healthcheck.NewManual("cache-warmed")
hc.Register(ctx, cacheCheck)

// Set unhealthy during startup
cacheCheck.SetErr(errors.New("cache warming in progress"))

// After cache is warmed
cacheCheck.SetErr(nil)

Best Practices

1. Choose the Right Check Type

Scenario	Check Type	Why
Database ping	Basic	Fast, needs fresh data
File system check	Basic	Fast, local operation
External API health	Background	Expensive, rate-limited
Message queue depth	Background	Metrics query, can be stale
Cache warmup status	Manual	Application-controlled state

2. Set Appropriate Timeouts

// ❌ Bad: an overly long timeout blocks readiness. The check timeout should be shorter than the probe timeout in k8s
healthcheck.NewBasic("db", 30*time.Second, checkFunc)

// ✅ Good: Short timeout
healthcheck.NewBasic("db", 1*time.Second, checkFunc)

3. Use Status Codes Correctly

Liveness (/live): Should almost always return 200 OK
- Only fail if the application is in an unrecoverable state
- Kubernetes will restart the pod on failure
Readiness (/ready): Should fail when:
- Critical dependencies are unavailable
- Application is still initializing
- Application is shutting down

4. Add Context to Errors

func checkDatabase(ctx context.Context) error {
  if err := db.PingContext(ctx); err != nil {
    // Use fmt.Errorf to add context. It will be available in /ready report
    return fmt.Errorf("postgres connection failed: %w", err)
  }

  return nil
}

5. Graceful Shutdown

For applications that need to signal they are shutting down (preventing new traffic while completing existing requests), use the Shutdown() method:

// Create healthcheck instance
hc, _ := healthcheck.New()

// Register your normal checks
hc.Register(ctx, healthcheck.NewBasic("database", time.Second, checkDB))

// Start HTTP server
server, _ := healthcheck.NewServer(hc, healthcheck.WithPort(8080))
_ = server.Run(ctx)

// In your graceful shutdown handler
func gracefulShutdown(hc *healthcheck.Healthcheck) {
  // Mark application as shutting down - /ready will return 500
  hc.Shutdown()

  // Continue with your normal shutdown process
  // - Stop accepting new requests
  // - Complete existing requests
  // - Close database connections, etc.
}

What happens after Shutdown():

/ready endpoint immediately returns HTTP 500 with status "down"
A special __shutting_down__ check is added to the response
Kubernetes will stop routing new traffic to this pod
/live endpoint continues to return 200 OK (pod should not be restarted)

Use this pattern for:

Zero-downtime deployments
Graceful pod termination in Kubernetes
Maintenance mode activation
When you need to drain traffic before shutdown

6. Monitor Checks

hc, _ := healthcheck.New(
  healthcheck.WithCheckStatusFn(func(name string, status healthcheck.Status) {
    // hcMetric can be a Prometheus metric; the choice depends on your infrastructure
    hcMetric.WithLabelValues(name, string(status)).Set(1)
  }),
)
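
When a gauge is labelled by status, setting only the current status to 1 leaves the previous status at 1 as well. One possible remedy is to write both label values on every callback. The sketch below demonstrates this with an in-memory map instead of a real metrics client; `gaugeSet` and its methods are hypothetical, and the local `Status` type only mirrors the one in the package documentation.

```go
package main

import (
	"fmt"
	"sync"
)

// Status mirrors the string-based status type from the package docs.
type Status string

const (
	StatusUp   Status = "up"
	StatusDown Status = "down"
)

// gaugeSet is a hypothetical in-memory stand-in for a labelled gauge.
type gaugeSet struct {
	mu     sync.Mutex
	values map[string]float64
}

func newGaugeSet() *gaugeSet {
	return &gaugeSet{values: make(map[string]float64)}
}

func key(name string, status Status) string { return name + "/" + string(status) }

// Hook has the callback shape func(name string, status Status).
// It sets the gauge for the current status to 1 and the other one to 0.
func (g *gaugeSet) Hook(name string, status Status) {
	g.mu.Lock()
	defer g.mu.Unlock()
	for _, s := range []Status{StatusUp, StatusDown} {
		v := 0.0
		if s == status {
			v = 1
		}
		g.values[key(name, s)] = v
	}
}

// value reads one gauge value.
func (g *gaugeSet) value(name string, status Status) float64 {
	g.mu.Lock()
	defer g.mu.Unlock()
	return g.values[key(name, status)]
}

func main() {
	g := newGaugeSet()
	g.Hook("postgres", StatusUp)
	g.Hook("postgres", StatusDown)
	fmt.Println(g.value("postgres", StatusUp), g.value("postgres", StatusDown))
}
```

The mutex is there on the assumption that the callback may be invoked from several goroutines.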

Complete Example

package main

import (
	"context"
	"database/sql"
	"fmt"
	"log"
	"time"

	"github.com/kazhuravlev/healthcheck"
	_ "github.com/lib/pq"
)

func main() {
	ctx := context.Background()

	// Initialize dependencies
	db, err := sql.Open("postgres", "postgres://localhost/myapp")
	if err != nil {
		log.Fatal(err)
	}

	// Create healthcheck
	hc, _ := healthcheck.New()

	// 1. Database check - synchronous, critical
	hc.Register(ctx, healthcheck.NewBasic("postgres", time.Second, func(ctx context.Context) error {
		return db.PingContext(ctx)
	}))

	// 2. Cache warmup - manual control
	cacheReady := healthcheck.NewManual("cache")
	hc.Register(ctx, cacheReady)
	cacheReady.SetErr(fmt.Errorf("warming up"))

	// 3. External API - background check
	hc.Register(ctx, healthcheck.NewBackground(
		"payment-provider",
		nil,
		10*time.Second, // initial delay
		30*time.Second, // check interval
		5*time.Second,  // timeout
		checkPaymentProvider,
	))

	// Start health check server
	server, _ := healthcheck.NewServer(hc, healthcheck.WithPort(8080))
	if err := server.Run(ctx); err != nil {
		log.Fatal(err)
	}

	// Simulate cache warmup completion
	go func() {
		time.Sleep(5 * time.Second)
		cacheReady.SetErr(nil)
		log.Println("Cache warmed up")
	}()

	// Graceful shutdown example
	go func() {
		time.Sleep(30 * time.Second)
		log.Println("Initiating graceful shutdown...")
		hc.Shutdown() // /ready will now return 500, stopping new traffic
		log.Println("Application marked as shutting down")
	}()

	log.Println("Health checks available at:")
	log.Println("  - http://localhost:8080/live")
	log.Println("  - http://localhost:8080/ready")

	select {}
}

func checkPaymentProvider(ctx context.Context) error {
	// Implementation of payment provider check
	return nil
}

Integration with Kubernetes

apiVersion: v1
kind: Pod
spec:
  containers:
    - name: app
      livenessProbe:
        httpGet:
          path: /live
          port: 8080
        initialDelaySeconds: 10
        periodSeconds: 10
        timeoutSeconds: 5
        failureThreshold: 3
      readinessProbe:
        httpGet:
          path: /ready
          port: 8080
        initialDelaySeconds: 5
        periodSeconds: 5
        timeoutSeconds: 3
        failureThreshold: 2

Response Format

The /ready endpoint returns detailed JSON with check history:

Healthy application:

{
	"status": "up",
	"checks": [
		{
			"name": "postgres",
			"state": {
				"status": "up",
				"error": "",
				"actual_at": "2024-01-15T10:30:00Z"
			},
			"previous": [
				{
					"status": "up",
					"error": "",
					"actual_at": "2024-01-15T10:29:55Z"
				}
			]
		}
	]
}

Application shutting down:

{
	"status": "down",
	"checks": [
		{
			"name": "postgres",
			"state": {
				"status": "up",
				"error": "",
				"actual_at": "2024-01-15T10:30:00Z"
			}
		},
		{
			"name": "__shutting_down__",
			"state": {
				"status": "down",
				"error": "The application in shutting down process",
				"actual_at": "2024-01-15T10:30:05Z"
			},
			"previous": null
		}
	]
}

Documentation ¶

Index ¶

func LiveHandler() http.HandlerFunc
func NewBackground(name string, initialErr error, delay, period, timeout time.Duration, ...) *bgCheck
func NewBasic(name string, timeout time.Duration, fn CheckFn) *basicCheck
func NewManual(name string) *manualCheck
func ReadyHandler(healthcheck IHealthcheck) http.HandlerFunc
func WithCheckStatusFn(fn func(checkID string, isReady Status)) func(*hcOptions)
func WithHealthcheck(hc *Healthcheck) func(o *serverOptions)
func WithLogger(logger *slog.Logger) func(o *serverOptions)
func WithPort(port int) func(o *serverOptions)
type Check
type CheckFn
type CheckState
type Healthcheck
- func New(opts ...func(*hcOptions)) (*Healthcheck, error)
- func (s *Healthcheck) Register(ctx context.Context, check ICheck)
- func (s *Healthcheck) RunAllChecks(ctx context.Context) Report
- func (s *Healthcheck) Shutdown()
type ICheck
type IHealthcheck
type ILogger
type Report
type Server
- func NewServer(hc IHealthcheck, opts ...func(*serverOptions)) (*Server, error)
- func (s *Server) Run(ctx context.Context) error
type Status

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func LiveHandler ¶ added in v0.7.0

func LiveHandler() http.HandlerFunc

LiveHandler returns a handler for /live requests.

func NewBackground ¶ added in v0.4.0

func NewBackground(name string, initialErr error, delay, period, timeout time.Duration, fn CheckFn) *bgCheck

NewBackground will create a check that runs in background. Usually used for slow or expensive checks. Note: period should be greater than timeout.

hc, _ := healthcheck.New(...)
hc.Register(ctx, healthcheck.NewBackground("some_subsystem", nil, time.Second, 30*time.Second, 5*time.Second, func(context.Context) error { ... }))

func NewBasic ¶

func NewBasic(name string, timeout time.Duration, fn CheckFn) *basicCheck

NewBasic creates a basic check. This check will only be performed when RunAllChecks is called.

hc, _ := healthcheck.New(...)
hc.Register(ctx, healthcheck.NewBasic("postgres", time.Second, func(context.Context) error { ... }))

func NewManual ¶

func NewManual(name string) *manualCheck

NewManual creates a new check that is managed by the client. It is marked as failed by default.

hc, _ := healthcheck.New(...)
check := healthcheck.NewManual("some_subsystem")
check.SetErr(nil)
hc.Register(ctx, check)
check.SetErr(errors.New("service unavailable"))

func ReadyHandler ¶ added in v0.6.0

func ReadyHandler(healthcheck IHealthcheck) http.HandlerFunc

ReadyHandler builds an http.HandlerFunc from healthcheck.

func WithCheckStatusFn ¶

func WithCheckStatusFn(fn func(checkID string, isReady Status)) func(*hcOptions)

WithCheckStatusFn provides a function that will be called each time a check's status changes.

func WithHealthcheck ¶

func WithHealthcheck(hc *Healthcheck) func(o *serverOptions)

func WithLogger ¶

func WithLogger(logger *slog.Logger) func(o *serverOptions)

func WithPort ¶

func WithPort(port int) func(o *serverOptions)

Types ¶

type Check ¶ added in v0.5.0

type Check struct {
	Name     string       `json:"name"`
	State    CheckState   `json:"state"`
	Previous []CheckState `json:"previous"`
}

type CheckFn ¶

type CheckFn func(ctx context.Context) error

type CheckState ¶ added in v0.5.0

type CheckState struct {
	ActualAt time.Time `json:"actual_at"`
	Status   Status    `json:"status"`
	Error    string    `json:"error"`
}

type Healthcheck ¶

type Healthcheck struct {
	// contains filtered or unexported fields
}

func New ¶

func New(opts ...func(*hcOptions)) (*Healthcheck, error)

func (*Healthcheck) Register ¶

func (s *Healthcheck) Register(ctx context.Context, check ICheck)

Register will register a check.

All checks should have a name. Preferably, the name should contain only lowercase letters and underscores. This allows the same name to be used for the check and for metrics.

func (*Healthcheck) RunAllChecks ¶

func (s *Healthcheck) RunAllChecks(ctx context.Context) Report

RunAllChecks will run all checks immediately.

func (*Healthcheck) Shutdown ¶ added in v0.8.0

func (s *Healthcheck) Shutdown()

Shutdown will disable all checks and set a persistent marker that immediately returns ready = false on all k8s requests. Shutdown should be called immediately after the graceful shutdown process has started.

type ICheck ¶

type ICheck interface {
	// contains filtered or unexported methods
}

type IHealthcheck ¶

type IHealthcheck interface {
	RunAllChecks(ctx context.Context) Report
}

type ILogger ¶

type ILogger interface {
	WarnContext(ctx context.Context, msg string, attrs ...any)
	ErrorContext(ctx context.Context, msg string, attrs ...any)
}

type Report ¶

type Report struct {
	Status Status  `json:"status"`
	Checks []Check `json:"checks"`
}

type Server ¶

type Server struct {
	// contains filtered or unexported fields
}

func NewServer ¶

func NewServer(hc IHealthcheck, opts ...func(*serverOptions)) (*Server, error)

func (*Server) Run ¶

func (s *Server) Run(ctx context.Context) error

type Status ¶

type Status string

const (
	StatusUp   Status = "up"
	StatusDown Status = "down"
)

Directories ¶

Path	Synopsis
internal
logr
