The DiffControlNetLoaderDisTorch2MultiGPU node loads Diffusers ControlNet models (from HuggingFace Hub repositories) with DisTorch2 distributed tensor allocation, an advanced multi-device VRAM management scheme that lets larger conditional generation models span multiple GPUs.
This node loads ControlNet models directly from HuggingFace model repositories by specifying the repository ID (e.g., "diffusers/controlnet-canny-sdxl-1.0").
| Parameter | Data Type | Description |
|---|---|---|
| model_path | STRING | The HuggingFace repository ID or local path of the Diffusers ControlNet model to load. |
| compute_device | STRING | Target device for compute operations (e.g., 'cuda:0', 'cuda:1', 'cpu'). Selected from available devices on your system. |
| virtual_vram_gb | FLOAT | Amount of virtual VRAM in gigabytes to allocate for distributed tensor management (default: 4.0, range: 0.0-128.0). |
| donor_device | STRING | Device to donate VRAM from when allocating virtual memory (default: 'cpu'). |
| expert_mode_allocations | STRING | Advanced allocation string for expert users to manually specify device/ratio distributions (e.g., 'cuda:0,50%;cpu,*'). |
| eject_models | BOOLEAN | Whether to unload ALL models from the target device before loading this model, enabling deterministic model eviction for testing and memory management (default: true). |
| Output Name | Data Type | Description |
|---|---|---|
| CONTROL_NET | CONTROL_NET | The loaded Diffusers ControlNet model with DisTorch2 distributed allocation applied. |
DisTorch2 is an advanced memory management system that enables loading and running large diffusion models across multiple GPUs by intelligently distributing tensor allocations. Instead of loading an entire model on a single device, DisTorch2 splits the model's layers across available devices while maintaining computational efficiency.
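As a rough illustration of this idea, the sketch below greedily assigns a model's layers to devices by per-device byte budgets. This is a hypothetical simplification, not the actual DisTorch2 allocator; the function and parameter names are invented for the example.

```python
# Illustrative sketch only: greedily assign layers to devices by byte budget.
# The real DisTorch2 allocator is more sophisticated than this.
def assign_layers(layer_sizes, device_budgets):
    """layer_sizes: list of layer byte sizes, in load order.
    device_budgets: ordered mapping of device name -> byte budget
                    (None = unlimited, e.g. the final spill-over device).
    Returns one device name per layer."""
    devices = list(device_budgets.items())
    assignment = []
    idx, used = 0, 0
    for size in layer_sizes:
        # Advance to the next device once the current budget is exhausted.
        while devices[idx][1] is not None and used + size > devices[idx][1]:
            idx += 1
            used = 0
        assignment.append(devices[idx][0])
        used += size
    return assignment

# Two 4-unit layers fill cuda:0's budget of 8; the rest spill to cpu.
print(assign_layers([4, 4, 4, 4], {"cuda:0": 8, "cpu": None}))
# → ['cuda:0', 'cuda:0', 'cpu', 'cpu']
```

The key property this mimics is that layer placement is decided up front from a capacity plan, rather than loading the whole model onto one device and hoping it fits.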
Virtual VRAM Allocation: Artificially increases the available VRAM on the compute device by borrowing memory capacity from donor devices through intelligent tensor distribution.
Expert Mode Allocations: Advanced users can manually specify exactly how much of the model should be placed on each device using ratio or byte-based allocation strings.
Basic Virtual VRAM Mode:
- `compute_device`: `cuda:0`
- `virtual_vram_gb`: `8.0`
- `donor_device`: `cuda:1`
- Result: Loads the model as if cuda:0 had 8GB more VRAM available, using cuda:1 as the memory donor.
Expert Ratio Allocation:
- `expert_mode_allocations`: `cuda:0,60%;cuda:1,30%;cpu,10%`
- Result: Distributes model layers with 60% on GPU 0, 30% on GPU 1, and 10% on CPU.
Expert Byte Allocation:
- `expert_mode_allocations`: `cuda:0,4gb;cuda:1,2gb;cpu,*`
- Result: Allocates exactly 4GB to cuda:0, 2GB to cuda:1, and the remainder to CPU.
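To make the allocation-string grammar concrete, here is a hypothetical parser for the `device,amount;device,amount;...` format shown above. It is a sketch under the assumption that amounts are percentages (`60%`), gigabyte sizes (`4gb`), or `*` for the remainder; the real DisTorch2 parser may accept more forms.

```python
# Hypothetical parser for expert_mode_allocations strings such as
# 'cuda:0,4gb;cuda:1,2gb;cpu,*' — illustrative, not the actual implementation.
def parse_allocations(spec: str):
    """Return a list of (device, (kind, value)) pairs.

    kind is 'percent' (value in percent), 'bytes' (value in bytes),
    or 'remainder' (value None, from '*')."""
    result = []
    for entry in spec.split(";"):
        device, amount = entry.split(",")
        amount = amount.strip().lower()
        if amount == "*":
            parsed = ("remainder", None)
        elif amount.endswith("%"):
            parsed = ("percent", float(amount[:-1]))
        elif amount.endswith("gb"):
            parsed = ("bytes", float(amount[:-2]) * 1024**3)
        else:
            raise ValueError(f"unrecognized amount: {amount!r}")
        result.append((device.strip(), parsed))
    return result

print(parse_allocations("cuda:0,60%;cuda:1,30%;cpu,10%"))
```

Note that `*` can only soak up whatever the explicitly sized entries leave over, so it makes sense only as the final catch-all entry.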
Mixed Mode: Combines virtual VRAM with expert allocations for complex multi-device scenarios.