
Conversation

@BuffMcBigHuge commented Apr 15, 2025

This is a continuation of the work performed in ComfyUI native API integration with ComfyStream #59

Introduction:

ComfyStream now supports a new client-mode called "spawn" which automatically launches and manages ComfyUI instances directly from the server. This approach eliminates the dependency on the Hidden Switch fork while maintaining parallel processing capability for video frames.

Key Features:

1. Client-Mode Spawn

The server can now dynamically spawn and manage ComfyUI instances, eliminating the need for manual instance setup. This is controlled via the --client-mode spawn command-line argument, with an additional parameter to control the number of worker instances (--workers). The worker count is set manually for now (e.g. --workers 2) but could become workflow-dependent in the future.

2. Improved Frame Management

  • Enhanced frame tracking with unique frame IDs to maintain proper ordering
  • Optimized buffer management to reduce dropped frames and provide smoother output
  • Dynamic output pacing to stabilize frame rates
  • Improved error handling for frame processing failures
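The frame-tracking idea above can be sketched as a small reorder buffer: workers may finish out of order, so results are held in a min-heap keyed by frame ID and released strictly in sequence, with a bounded size so stale frames are dropped rather than stalling output. The class and method names here are illustrative, not ComfyStream's actual API:

```python
# Minimal sketch of ordered frame buffering with unique frame IDs.
# FrameBuffer, push, and pop_ready are hypothetical names for illustration.
import heapq

class FrameBuffer:
    def __init__(self, max_size=8):
        self.max_size = max_size   # bound the buffer to limit latency
        self._heap = []            # (frame_id, frame) pairs, smallest ID first
        self._next_id = 0          # next frame ID expected on output

    def push(self, frame_id, frame):
        heapq.heappush(self._heap, (frame_id, frame))
        # Drop the oldest pending frame if the buffer overflows.
        if len(self._heap) > self.max_size:
            dropped_id, _ = heapq.heappop(self._heap)
            if dropped_id == self._next_id:
                self._next_id += 1  # skip past the dropped frame

    def pop_ready(self):
        """Emit frames that are next in sequence, in order."""
        out = []
        while self._heap and self._heap[0][0] == self._next_id:
            out.append(heapq.heappop(self._heap)[1])
            self._next_id += 1
        return out
```

For example, if frame 1 arrives before frame 0, pop_ready() returns nothing until frame 0 lands, then emits both in order.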

3. Native ComfyUI API Integration

  • Direct communication with ComfyUI instances using WebSockets and the native REST API
  • Support for transmitting frames via Base64 encoding with efficient tensor conversion
  • Proper cleanup and resource management of spawned processes
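The Base64 frame transport mentioned above boils down to an ASCII-safe round-trip of raw frame bytes; the surrounding payload shape that a ComfyUI worker expects is not shown here, and the helper names are illustrative:

```python
# Sketch of the base64 leg of frame transport to a spawned worker.
# encode_frame/decode_frame are hypothetical helper names.
import base64

def encode_frame(raw_rgb: bytes) -> str:
    """Encode raw frame bytes as an ASCII-safe base64 string."""
    return base64.b64encode(raw_rgb).decode("ascii")

def decode_frame(payload: str) -> bytes:
    """Decode a base64 payload back into raw frame bytes."""
    return base64.b64decode(payload)

frame = bytes(range(12))          # tiny stand-in for RGB tensor data
payload = encode_frame(frame)
assert decode_frame(payload) == frame
```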

Usage:

Key arguments:

  • --max-frame-wait: Maximum milliseconds to wait for a frame before dropping
  • --client-mode: Choose between "toml" (using config file) or "spawn" (spawn processes)
  • --log-level: Choose default log level: "DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"

Spawn Mode

This auto-starts subprocess ComfyUI native instances on UI workflow request, similar to EmbeddedClient. It assumes your comfystream folder exists at ComfyUI/custom_nodes/comfystream.

python server/app_api.py --log-level INFO --max-frame-wait 1000 --workspace ../.. --workers 2 --cuda-devices 0,1 --workers-start-port 8195 --client-mode spawn

Spawn mode key arguments:

  • --workspace: Path to the ComfyUI installation directory
  • --workers: Number of ComfyUI instances to spawn per CUDA device
  • --workers-start-port: The starting port that the ComfyUI workers will use
  • --cuda-devices: The available CUDA devices to spawn workers on
  • --comfyui-log-level: Forwards spawned instances' logs to the console; currently the only supported setting is DEBUG

Spawn configuration:

With the spawn command above:

  • Specifying CUDA devices 0 and 1 (--cuda-devices 0,1)
  • Setting 2 workers per device (--workers 2)
  • Each worker will start on consecutive ports beginning with 8195 (--workers-start-port 8195)

This will create 4 workers in total:

  • 2 workers using CUDA device 0 on ports 8195 and 8196
  • 2 workers using CUDA device 1 on ports 8197 and 8198
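The layout above is simple arithmetic: consecutive ports assigned across devices. A minimal sketch (the function name is illustrative):

```python
# Sketch of the worker layout: consecutive ports across CUDA devices.
def assign_workers(cuda_devices, workers_per_device, start_port):
    """Return (device, port) pairs for every spawned worker."""
    return [
        (device, start_port + i * workers_per_device + j)
        for i, device in enumerate(cuda_devices)
        for j in range(workers_per_device)
    ]

layout = assign_workers([0, 1], 2, 8195)
print(layout)  # [(0, 8195), (0, 8196), (1, 8197), (1, 8198)]
```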

Server (toml) Mode

This will connect to servers that are running and defined in comfy.toml.

python server/app_api.py --log-level INFO --max-frame-wait 1000 --workspace ../.. --config-file configs/comfy.toml --client-mode toml

Server (toml) mode key arguments:

  • --config-file: TOML config file that defines the available ComfyUI instances

Server (toml) configuration:

When using TOML mode, define your server instances in the config file:

# Configuration for multiple ComfyUI servers
[[servers]]
host = "127.0.0.1" 
port = 8195
client_id = "client1"

[[servers]]
host = "127.0.0.1" 
port = 8196
client_id = "client2"

Benefits:

  1. Eliminates dependency on the Hidden Switch fork
  2. Works with standard ComfyUI nodes without modification
  3. Enables scalable multi-instance processing on single or multiple GPUs
  4. Streamlined setup and configuration

Limitations and Future Work:

  • Further optimization needed for tensor transfer efficiency
  • Exploration of multi-GPU scaling strategies
  • Integration with deployment automation tools

Known Issues

  • Refreshing the UI during an active stream would not properly continue inference when a new workflow was loaded unless the server was restarted. This is now solved.
  • Audio tensors are not functional and have been deemed low priority

This implementation opens up new possibilities for advanced parallelization strategies while simplifying the overall architecture.

BuffMcBigHuge and others added 17 commits March 18, 2025 16:08
…ced unnecessary base64 input frame operations, prep for multi-instance, cleanup.
…ame size handling, commented out some logging.
Co-authored-by: John | Elite Encoder <[email protected]>
…ediate step, moved prompt execution strategy to `execution_start` event, moved buffer to self variable to avoid reinitialization.
…to improve frame buffer, modified comfy arg handling.
@BuffMcBigHuge (Author) commented:

Add two additional options to spawn mode:

--workers-start-port: The starting port that the ComfyUI workers will use
--cuda-devices: The available cuda devices to spawn workers on

Cleaned up PR documentation and readability.

@BuffMcBigHuge (Author) commented:

Updates:

  • Merged upstream
  • Fixed issue with cleanup not properly resetting the clients for subsequent runs.
  • Better error handling for Comfy instances via spawn, reorganization of app, pipeline and config files.

…cations to spawning instances, better handling of misconfigured workspace.
@BuffMcBigHuge (Author) commented:

Updates:

  • Modifications to logging system
  • Handle case where workspace directory is misconfigured
  • Added better spawned Comfy instance logging using --comfyui-log-level DEBUG argument.

@BuffMcBigHuge (Author) commented:

We have tested spawn on Runpod with varied results on frame management with > 2 workers. More to come.
