Unified workload manager #2487

amirejaz · 2025-11-06T16:45:37Z

Summary

This PR implements a unified workload management system that provides a consistent interface for managing MCP server workloads across both CLI (Docker/Podman) and Kubernetes environments. This follows the same architectural pattern established by groups.Manager and enables platform-agnostic workload operations throughout ToolHive.

Motivation

Following the successful unification of group management, we now extend this pattern to workload management. This enables:

Consistent API: Same interface for workload operations regardless of runtime
Unified Discovery: Enables vmcp aggregator to discover backends from both CLI and Kubernetes workloads
Code Reusability: Platform-agnostic code can work with workloads without runtime-specific logic
Future-Proof: Easier to add new runtime environments or extend functionality

Implementation

Unified Manager Interface

The workloads.Manager interface provides a comprehensive set of operations for managing workloads:

Lifecycle Operations:

RunWorkload / RunWorkloadDetached - Start workloads in foreground or background
StopWorkloads - Stop running workloads
DeleteWorkloads - Remove workloads
RestartWorkloads - Restart workloads
UpdateWorkload - Update workload configuration

Query Operations:

GetWorkload - Retrieve workload details and status
ListWorkloads - List all workloads with optional filtering
ListWorkloadsInGroup - List workloads in a specific group
DoesWorkloadExist - Check workload existence

Utility Operations:

GetLogs / GetProxyLogs - Retrieve workload logs
MoveToGroup - Move workloads between groups
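
To illustrate how platform-agnostic code can depend only on the interface, here is a minimal sketch of the query operations. The Workload struct, the QueryManager name, the method signatures, the "running" status string, and the in-memory memoryManager are all assumptions made for this example; they are not copied from pkg/workloads.

```go
package main

import (
	"context"
	"fmt"
	"sort"
)

// Workload is a hypothetical, trimmed-down record used only in this sketch.
type Workload struct {
	Name   string
	Group  string
	Status string
}

// QueryManager covers only the query operations listed above.
// The signatures are assumed for illustration.
type QueryManager interface {
	GetWorkload(ctx context.Context, name string) (Workload, error)
	ListWorkloadsInGroup(ctx context.Context, group string) ([]string, error)
	DoesWorkloadExist(ctx context.Context, name string) (bool, error)
}

// memoryManager is a toy backend standing in for cliManager or k8sManager.
type memoryManager struct {
	items map[string]Workload
}

// Compile-time check that memoryManager satisfies the interface.
var _ QueryManager = (*memoryManager)(nil)

func (m *memoryManager) GetWorkload(_ context.Context, name string) (Workload, error) {
	w, ok := m.items[name]
	if !ok {
		return Workload{}, fmt.Errorf("workload %q not found", name)
	}
	return w, nil
}

func (m *memoryManager) ListWorkloadsInGroup(_ context.Context, group string) ([]string, error) {
	names := []string{}
	for name, w := range m.items {
		if w.Group == group {
			names = append(names, name)
		}
	}
	sort.Strings(names) // map iteration order is random; sort for stable output
	return names, nil
}

func (m *memoryManager) DoesWorkloadExist(_ context.Context, name string) (bool, error) {
	_, ok := m.items[name]
	return ok, nil
}

// countRunning is platform-agnostic: it depends only on the interface.
func countRunning(ctx context.Context, m QueryManager, group string) (int, error) {
	names, err := m.ListWorkloadsInGroup(ctx, group)
	if err != nil {
		return 0, err
	}
	running := 0
	for _, name := range names {
		w, err := m.GetWorkload(ctx, name)
		if err != nil {
			return 0, err
		}
		if w.Status == "running" {
			running++
		}
	}
	return running, nil
}

// demo builds a two-workload manager and counts the running ones.
func demo() (int, error) {
	m := &memoryManager{items: map[string]Workload{
		"fetch":  {Name: "fetch", Group: "engineering-team", Status: "running"},
		"github": {Name: "github", Group: "engineering-team", Status: "stopped"},
	}}
	return countRunning(context.Background(), m, "engineering-team")
}

func main() {
	n, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println("running workloads:", n) // prints "running workloads: 1"
}
```

Because countRunning takes the interface rather than a concrete type, the same caller would work against either implementation.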

Platform-Specific Implementations

CLI Manager (cliManager)

Manages Docker/Podman containers
Uses filesystem-based storage (runconfig.json)
Supports full lifecycle operations
Handles container networking, secrets, and environment variables

Kubernetes Manager (k8sManager)

Manages MCPServer CRDs via Kubernetes API
Provides read operations and group management
Integrates with ToolHive operator for lifecycle management
Maps MCPServer CRDs to workload representation

Automatic Runtime Detection

The NewManager() factory function automatically detects the runtime environment:

Kubernetes mode: Returns k8sManager when TOOLHIVE_RUNTIME=kubernetes or when running inside a pod
CLI mode: Returns cliManager for Docker/Podman environments
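
The detection order can be sketched as follows. This is not the code in NewManager(): the Runtime type, the detectRuntime and mapLookup helpers, and the use of KUBERNETES_SERVICE_HOST as the "running in a pod" signal are assumptions for illustration (Kubernetes does inject that variable into containers, but this PR does not state which check ToolHive uses). How other TOOLHIVE_RUNTIME values are handled is not modeled here.

```go
package main

import (
	"fmt"
	"os"
)

// Runtime is a hypothetical enum for the two modes described above.
type Runtime string

const (
	RuntimeCLI        Runtime = "cli"
	RuntimeKubernetes Runtime = "kubernetes"
)

// detectRuntime takes a lookup function (os.LookupEnv in normal use) so the
// logic can be exercised without touching the process environment.
func detectRuntime(lookup func(string) (string, bool)) Runtime {
	// An explicit setting wins.
	if v, ok := lookup("TOOLHIVE_RUNTIME"); ok && v == "kubernetes" {
		return RuntimeKubernetes
	}
	// Assumption: treat a non-empty KUBERNETES_SERVICE_HOST as "running in a pod".
	if v, ok := lookup("KUBERNETES_SERVICE_HOST"); ok && v != "" {
		return RuntimeKubernetes
	}
	// Otherwise fall back to the Docker/Podman path.
	return RuntimeCLI
}

// mapLookup adapts a map to the lookup signature, which is handy in tests.
func mapLookup(env map[string]string) func(string) (string, bool) {
	return func(key string) (string, bool) {
		v, ok := env[key]
		return v, ok
	}
}

func main() {
	fmt.Println("detected runtime:", detectRuntime(os.LookupEnv))
}
```

Injecting the lookup function keeps the detection logic testable with a plain map instead of real environment variables.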

Key Features

Group Integration

Workloads can be assigned to groups at creation time
ListWorkloadsInGroup enables group-based discovery
MoveToGroup allows reorganizing workloads
Seamless integration with groups.Manager
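
A minimal sketch of the group operations, assuming a simple name-to-group mapping. The groupStore type, the errNotInGroup error, and the validate-then-move behavior are hypothetical choices for this example; the context parameter is omitted for brevity, and the real MoveToGroup may behave differently on partial failure.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// errNotInGroup is a hypothetical sentinel error for this sketch.
var errNotInGroup = errors.New("workload is not in the source group")

// groupStore maps workload name to group name. It is a toy stand-in for
// whatever storage the real managers use.
type groupStore struct {
	groupOf map[string]string
}

// ListWorkloadsInGroup returns the sorted names of workloads in the group.
func (s *groupStore) ListWorkloadsInGroup(group string) []string {
	names := []string{}
	for name, g := range s.groupOf {
		if g == group {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

// MoveToGroup validates every name first and only then applies the move,
// so a failed call leaves the store unchanged.
func (s *groupStore) MoveToGroup(names []string, from, to string) error {
	for _, name := range names {
		if g, ok := s.groupOf[name]; !ok || g != from {
			return fmt.Errorf("%s: %w", name, errNotInGroup)
		}
	}
	for _, name := range names {
		s.groupOf[name] = to
	}
	return nil
}

func main() {
	s := &groupStore{groupOf: map[string]string{
		"fetch":  "default",
		"github": "default",
	}}
	if err := s.MoveToGroup([]string{"fetch"}, "default", "engineering-team"); err != nil {
		panic(err)
	}
	fmt.Println(s.ListWorkloadsInGroup("engineering-team")) // prints "[fetch]"
	fmt.Println(s.ListWorkloadsInGroup("default"))          // prints "[github]"
}
```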

Unified Backend Discovery

vmcp aggregator can now discover backends from both CLI and Kubernetes workloads
Single BackendDiscoverer implementation works across platforms
Automatic health status mapping from workload status
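
The health mapping could look roughly like the sketch below. The status and health values shown are invented for illustration; the actual constants and the exact mapping used by the discoverer are not listed in this description.

```go
package main

import "fmt"

// WorkloadStatus and BackendHealth are hypothetical types for this sketch.
type WorkloadStatus string
type BackendHealth string

const (
	StatusRunning  WorkloadStatus = "running"
	StatusStarting WorkloadStatus = "starting"
	StatusStopped  WorkloadStatus = "stopped"
	StatusError    WorkloadStatus = "error"

	HealthHealthy   BackendHealth = "healthy"
	HealthUnhealthy BackendHealth = "unhealthy"
	HealthUnknown   BackendHealth = "unknown"
)

// healthFromStatus maps a workload status to a backend health value.
// Anything not explicitly recognized is reported as unknown rather than
// guessed, so new statuses do not silently count as healthy.
func healthFromStatus(s WorkloadStatus) BackendHealth {
	switch s {
	case StatusRunning:
		return HealthHealthy
	case StatusStopped, StatusError:
		return HealthUnhealthy
	default:
		return HealthUnknown
	}
}

func main() {
	for _, s := range []WorkloadStatus{StatusRunning, StatusStarting, StatusStopped, StatusError} {
		fmt.Printf("%s -> %s\n", s, healthFromStatus(s))
	}
}
```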

Comprehensive Testing

Unit tests for both implementations (per the Codecov report on this PR, patch coverage is 90.75% for k8s_manager.go and 47.85% for cli_manager.go)
Table-driven tests for all operations
Mock-based testing for isolation
Edge case and error handling coverage

Files Added

pkg/workloads/cli_manager.go - CLI implementation (1205 lines)
pkg/workloads/cli_manager_test.go - CLI tests (1616 lines)
pkg/workloads/k8s_manager.go - Kubernetes implementation (351 lines)
pkg/workloads/k8s_manager_test.go - Kubernetes tests (777 lines)

Files Modified

pkg/workloads/manager.go - Simplified to factory functions and interface definition
pkg/workloads/manager_test.go - Reduced to factory function tests
pkg/vmcp/aggregator/discoverer.go - Updated to use unified manager
cmd/vmcp/app/commands.go - Updated to use unified discoverer

Benefits

Consistency: Same API for workload operations across all environments
Maintainability: Clear separation between platform-specific and shared logic
Extensibility: Easy to add new runtime environments or operations
Testability: Each implementation can be tested independently
Integration: Enables unified features like vmcp backend discovery

Testing

All unit tests pass
Linting passes
Verified CLI workload operations (run, stop, delete, restart, logs)
Verified Kubernetes MCPServer operations (get, list, group operations)
Tested vmcp discovery with Kubernetes workloads
Verified group integration (ListWorkloadsInGroup, MoveToGroup)

Example Usage

// Automatically selects the right implementation based on runtime
manager, err := workloads.NewManager(ctx)
if err != nil {
	return err
}

// Works the same way in CLI and Kubernetes
// (named groupWorkloads so the variable does not shadow the workloads package)
groupWorkloads, err := manager.ListWorkloadsInGroup(ctx, "engineering-team")
if err != nil {
	return err
}

// Discover backends from workloads (used by vmcp)
discoverer := aggregator.NewBackendDiscoverer(manager, groupsManager, authConfig)
backends, err := discoverer.Discover(ctx, "engineering-team")
if err != nil {
	return err
}

codecov · 2025-11-06T16:51:56Z

Codecov Report

❌ Patch coverage is 58.65385% with 344 lines in your changes missing coverage. Please review.
✅ Project coverage is 55.25%. Comparing base (a5a0621) to head (480db30).
⚠️ Report is 1 commit behind head on main.

Files with missing lines             Patch %   Lines
pkg/workloads/cli_manager.go         47.85%    231 Missing and 61 partials ⚠️
pkg/workloads/manager.go             15.38%    20 Missing and 2 partials ⚠️
pkg/workloads/k8s_manager.go         90.75%    10 Missing and 6 partials ⚠️
pkg/vmcp/aggregator/discoverer.go    85.50%    8 Missing and 2 partials ⚠️
cmd/vmcp/app/commands.go             0.00%     4 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2487      +/-   ##
==========================================
+ Coverage   55.01%   55.25%   +0.23%     
==========================================
  Files         292      294       +2     
  Lines       27904    28108     +204     
==========================================
+ Hits        15351    15530     +179     
- Misses      11144    11170      +26     
+ Partials     1409     1408       -1

dmjb

Marking this as "request changes" until we figure out how to split up this interface.

amirejaz added 2 commits November 6, 2025 16:00

unify workload management across CLI and Kubernetes

fb159b1

removed unnecessary files

480db30

amirejaz marked this pull request as draft November 6, 2025 16:45

dmjb requested changes Nov 6, 2025
