simplevisor

package v0.0.0-...-f825b5d
Published: Aug 4, 2025 License: MIT Imports: 13 Imported by: 0

Documentation


Package simplevisor provides a simple, lightweight process supervisor for managing long-running goroutines with automatic restart capabilities and graceful shutdown.

Overview

Simplevisor manages multiple named processes (goroutines) with configurable restart policies, panic recovery, and coordinated shutdown. It's designed for applications that need reliable background process management without complex dependencies.

Basic Usage

supervisor := simplevisor.New(5*time.Second, logger)

// Register a simple process
supervisor.Register("worker", func(ctx context.Context) error {
	for {
		select {
		case <-ctx.Done():
			return ctx.Err() // Graceful shutdown
		case work := <-workChan:
			processWork(work)
		}
	}
})

// Start all processes
supervisor.Run()

// Wait for shutdown signal and cleanup
supervisor.WaitOnShutdownSignal(func() {
	// Optional cleanup callback
	cleanupResources()
})

Restart Policies

Configure automatic restart behavior for processes:

// Never restart
supervisor.Register("one-shot", handler)

// Always restart (even on successful completion)
supervisor.Register("persistent", handler,
	simplevisor.WithRestart(simplevisor.RestartAlways, 5, 2*time.Second))

// Only restart on failures/panics
supervisor.Register("resilient", handler,
	simplevisor.WithRestart(simplevisor.RestartOnFailure, 3, 1*time.Second))

Panic Recovery

Handle panics in processes with custom recovery logic:

supervisor.Register("risky-process", riskyHandler,
	simplevisor.WithRecover(func(recovered interface{}) {
		log.Printf("Process panicked: %v", recovered)
		// Send alert, record metrics, etc.
	}))

Process Monitoring

Monitor process status during runtime:

// Check if a process is currently running
if supervisor.IsRunning("worker") {
	log.Println("Worker is active")
}

// Get detailed process status
status, err := supervisor.GetProcessStatus("worker")
if err != nil {
	log.Printf("Process not found: %v", err)
	return
}
switch status {
case simplevisor.StatusRunning:
	// Process is active
case simplevisor.StatusStopped:
	// Process has stopped (will restart based on policy)
case simplevisor.StatusRestarting:
	// Process is restarting after failure/completion
}

// Get total number of registered processes
count := supervisor.ProcessCount()

Graceful Shutdown

Simplevisor provides coordinated shutdown with timeout protection:

// Automatic shutdown on OS signals
supervisor.WaitOnShutdownSignal(nil)

// Manual shutdown
supervisor.Shutdown()

During shutdown:

 1. All process contexts are cancelled
 2. Processes should handle ctx.Done() and return gracefully
 3. The supervisor waits for all processes to finish (with timeout)
 4. The optional teardown callback is executed

Context-Based Cancellation

All processes receive a context for cancellation detection:

func workerProcess(ctx context.Context) error {
	ticker := time.NewTicker(1 * time.Second)
	defer ticker.Stop()

	for {
		select {
		case <-ctx.Done():
			// Cleanup and exit gracefully
			cleanup()
			return ctx.Err()
		case <-ticker.C:
			// Do periodic work
			doWork()
		}
	}
}

Thread Safety

Simplevisor is thread-safe for:

  • Process registration (before Run() is called)
  • Status queries (GetProcessStatus, IsRunning, ProcessCount)

However, the supervisor itself should be used from the main goroutine, particularly for Run() and WaitOnShutdownSignal().

Best Practices

 1. Register all processes before calling Run() (duplicate names will panic)
 2. Use context cancellation for graceful shutdown in process handlers
 3. Set appropriate restart limits to prevent infinite restart loops
 4. Use panic recovery for critical processes that must stay running
 5. Keep process handlers lightweight and delegate heavy work to other goroutines
 6. Always handle ctx.Done() in process main loops
 7. Let the supervisor manage process lifecycles - no manual start/stop needed

Configuration Options

Restart Policy Options:

  • RestartNever: Process runs once and stops (default)
  • RestartAlways: Process restarts regardless of exit condition
  • RestartOnFailure: Process restarts only on errors or panics

Default Values:

  • Shutdown timeout: 5 seconds
  • Max restarts: 3
  • Restart delay: 1 second

Error Handling

Processes should return errors for failure conditions:

func databaseWorker(ctx context.Context) error {
	db, err := connectDB()
	if err != nil {
		return fmt.Errorf("failed to connect: %w", err)
	}
	defer db.Close()

	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		default:
			if err := processDBWork(db); err != nil {
				return fmt.Errorf("db work failed: %w", err)
			}
		}
	}
}

Returning an error will trigger restart behavior based on the configured policy.

OpenTelemetry Metrics

Simplevisor provides comprehensive OpenTelemetry metrics for monitoring:

supervisor := simplevisor.New(5*time.Second, logger)

// Enable metrics (optional)
if err := supervisor.EnableMetrics(); err != nil {
	log.Fatal(err)
}

Key metrics include:

  • simplevisor_processes_running: Currently running processes (UpDownCounter)
  • simplevisor_process_restart_count: Current restart count per process (Gauge)
  • simplevisor_process_status: Process status (Gauge: 1=running, 0=stopped, -1=restarting)
  • simplevisor_process_started_total: Process start events (Counter)
  • simplevisor_process_stopped_total: Process stop events by reason (Counter)
  • simplevisor_process_panics_total: Process panic events (Counter)
  • simplevisor_restart_limit_exceeded_total: Critical restart failures (Counter)

Metrics are automatically recorded when EnableMetrics() is called.

Logging

Simplevisor uses structured logging (slog) and logs:

  • Process start/stop events
  • Restart attempts with counts and delays
  • Panic recovery details
  • Shutdown progress and completion

Pass a custom logger to New() or use nil for default console output.

Package simplevisor is a simple supervisor. The supervisor registers long-running processes, runs each one in its own goroutine, and handles panics with either the recover function registered by the process or the default one. The supervisor listens for the shutdown signal and then runs all shutdown functions registered by processes.

Index

Constants

const (
	DefaultGracefulShutdownTimeout = 5 * time.Second
	DefaultRestartDelay            = 1 * time.Second
	DefaultMaxRestarts             = 3
	DefaultHealthyDuration         = 30 * time.Second // Duration to consider process healthy
	LogNSSupervisor                = "supervisor"
)

Variables

This section is empty.

Functions

This section is empty.

Types

type Metrics

type Metrics struct {
	// contains filtered or unexported fields
}

Metrics holds all OpenTelemetry metrics for the supervisor

type Option

type Option func(p *Process)

func WithRecover

func WithRecover(handler RecoverFunc) Option

WithRecover sets the recover handler for the process.

func WithRestart

func WithRestart(policy RestartPolicy, maxRestarts int, delay time.Duration) Option

WithRestart sets the restart policy for the process.

type Process

type Process struct {
	// contains filtered or unexported fields
}

type ProcessFunc

type ProcessFunc func(ctx context.Context) error

ProcessFunc is a long-running process which listens on context cancellation.

type ProcessStatus

type ProcessStatus int

ProcessStatus represents the current state of a process

const (
	StatusStopped ProcessStatus = iota
	StatusRunning
	StatusRestarting
)

func (ProcessStatus) String

func (s ProcessStatus) String() string

type RecoverFunc

type RecoverFunc func(r any)

RecoverFunc is a function to execute when a process panics.

type RestartPolicy

type RestartPolicy int

RestartPolicy defines when a process should be restarted

const (
	RestartNever     RestartPolicy = iota // Never restart the process
	RestartAlways                         // Always restart the process
	RestartOnFailure                      // Only restart on error/panic
)

func (RestartPolicy) String

func (r RestartPolicy) String() string

String methods for enums to provide readable metric labels

type Supervisor

type Supervisor struct {
	// contains filtered or unexported fields
}

Supervisor is responsible for managing long-running processes. Supervisor is not safe for concurrent use and should be driven from the app's main goroutine.

func New

func New(shutdownTimeout time.Duration, sLog *slog.Logger) *Supervisor

New returns a new instance of Supervisor.

func (*Supervisor) Context

func (s *Supervisor) Context() context.Context

func (*Supervisor) EnableMetrics

func (s *Supervisor) EnableMetrics() error

EnableMetrics initializes OpenTelemetry metrics for the supervisor. This is optional and should be called before registering processes for best results.

func (*Supervisor) GetProcessStatus

func (s *Supervisor) GetProcessStatus(name string) (ProcessStatus, error)

GetProcessStatus returns the current status of a process

func (*Supervisor) IsRunning

func (s *Supervisor) IsRunning(name string) bool

func (*Supervisor) ProcessCount

func (s *Supervisor) ProcessCount() int

func (*Supervisor) Register

func (s *Supervisor) Register(name string, handler ProcessFunc, options ...Option)

Register registers a new process to supervisor. Panics if the name isn't unique.

func (*Supervisor) Run

func (s *Supervisor) Run()

Run spawns a new goroutine for each process. Each spawned goroutine is responsible for handling its own panics.

func (*Supervisor) Shutdown

func (s *Supervisor) Shutdown()

Shutdown manually shuts down the supervisor goroutine

func (*Supervisor) WaitOnShutdownSignal

func (s *Supervisor) WaitOnShutdownSignal(teardown func())

WaitOnShutdownSignal waits to receive a shutdown signal. WaitOnShutdownSignal should not be called from any goroutine other than the app's main goroutine. teardown is a callback function and will run at the last stage.
