volcano-sh · hemantch01 · May 11, 2026
diff --git a/content/en/docs/plugins.md b/content/en/docs/plugins.md
@@ -173,4 +173,50 @@ The Numa-Aware Plugin aims to address these limitations.
 
 Common scenarios for NUMA-Aware are computation-intensive jobs that are sensitive to CPU parameters, scheduling delays. Such as scientific calculation, video decoding, animation rendering, big data offline processing and other specific scenes.
 
+### Usage
 
+#### Overview
+The Usage plugin schedules on real-time resource utilization (for example CPU and memory) collected from a monitoring system such as Prometheus, rather than on requested resources alone. It keeps new pods off overloaded nodes and helps balance load across the cluster.
+
+#### Scenario
+Useful in clusters where node resource consumption is unbalanced: some nodes are overloaded while others sit idle, even though their requested resources are similar.
+
+### Rescheduling
+
+#### Overview
+The Rescheduling plugin periodically rebalances the cluster by evaluating real resource utilization. It actively evicts pods from heavily utilized nodes and shuffles them to under-utilized nodes based on configured target thresholds and strategies like LowNodeUtilization or OfflineOnly.
+
+#### Scenario
+Well suited to long-running clusters where changing workload lifecycles lead to fragmentation and resource imbalance over time.
+
+### ResourceQuota
+
+#### Overview
+The ResourceQuota plugin interfaces with Kubernetes' native `ResourceQuota` objects to ensure that a PodGroup is only enqueued if there is sufficient resource capacity in its namespace.
+
+#### Scenario
+Highly beneficial in multi-tenant environments to prevent jobs from entering the scheduling pipeline and clogging the queue when they have no chance of running due to namespace quota restrictions.
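+
+For illustration, a namespace quota of the kind the plugin checks against could look like the sketch below. The namespace, name, and limits are placeholders, not values taken from Volcano.
+
+```yaml
+apiVersion: v1
+kind: ResourceQuota
+metadata:
+  name: example-quota      # placeholder name
+  namespace: team-a        # placeholder namespace
+spec:
+  hard:
+    requests.cpu: "32"     # total CPU requests allowed in the namespace
+    requests.memory: 64Gi  # total memory requests allowed in the namespace
+```
+
+With a quota like this in place, a PodGroup whose minimum resource request would push the namespace past these limits is held back instead of being enqueued.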
+
+### Pod Disruption Budget (PDB)
+
+#### Overview
+The PDB Plugin ensures that Volcano respects user-defined Kubernetes PodDisruptionBudget (PDB) constraints during any eviction-based scheduling actions, such as `reclaim`, `preempt`, and `shuffle`.
+
+#### Scenario
+Crucial for highly available workloads where simultaneous eviction of multiple replicas could result in service disruption.
+
+### Overcommit
+
+#### Overview
+The Overcommit Plugin allows the scheduler to artificially inflate the apparent "idle resources" of the cluster by a configurable factor (e.g., 1.2), so that more jobs can be enqueued in the scheduling pipeline than physical capacity alone would admit.
+
+#### Scenario
+Useful when administrators want the scheduler to tolerate a larger backlog of `pending` pods waiting for resources without rejecting them outright during peak loads.
+
+### DeviceShare
+
+#### Overview
+The DeviceShare Plugin provides a unified framework for sharing specialized hardware devices such as GPUs, NPUs, and FPGAs across multiple pods.
+
+#### Scenario
+Suited to AI/ML environments that need fine-grained hardware sharing, such as vGPU, vNPU, and GPU-exclusive deployments.
diff --git a/content/en/docs/user_guide_how_to_use_deviceshare_plugin.md b/content/en/docs/user_guide_how_to_use_deviceshare_plugin.md
@@ -0,0 +1,53 @@
++++
+title = "DeviceShare Plugin"
+
+date = 2026-05-11
+lastmod = 2026-05-11
+
+draft = false  
+toc = true  
+type = "docs"  
+
+linktitle = "DeviceShare"
+[menu.docs]
+  parent = "user-guide"
+  weight = 4
++++
+
+## Introduction
+
+The **DeviceShare Plugin** is an advanced resource scheduling plugin in Volcano that provides a common framework for sharing specialized hardware devices (like GPUs, NPUs, FPGAs) across multiple pods. 
+
+Rather than implementing fragmented logic for each new hardware accelerator, Volcano exposes a unified `Devices` interface. The `deviceshare` plugin leverages this interface to perform robust allocation, node filtering, and resource tracking for shared devices.
+
+## Mechanism
+
+The DeviceShare plugin works in conjunction with device-specific implementations. It exposes standard scheduling operations such as `Predicate` (filtering nodes based on available device capacity) and `Allocate`/`Release` (assigning portions of a device to specific pods).
+
+Currently, the `deviceshare` plugin serves as the underlying engine powering features like:
+- **GPU Sharing**: Allowing multiple pods to request fractions of a single physical GPU's memory.
+- **vGPU and vNPU**: Virtualizing accelerator slices.
+- **GPU Exclusive**: Restricting a pod to exclusively own a GPU to avoid contention.
+
+## Configuration and Usage
+
+The `deviceshare` plugin is typically enabled implicitly when you enable device sharing predicates in the Volcano scheduler config map. However, if you are developing custom device sharing logic or need to explicitly declare it, it can be configured in your `volcano-scheduler-configmap`:
+
+```yaml
+actions: "enqueue, allocate, backfill"
+tiers:
+  - plugins:
+      - name: priority
+      - name: gang
+      - name: conformance
+      - name: deviceshare   # Enable the device share framework plugin
+  - plugins:
+      - name: overcommit
+      - name: drf
+      - name: predicates
+      - name: proportion
+      - name: nodeorder
+      - name: binpack
+```
+
+> **Note:** For specific guides on how to configure your workloads to request shared GPUs or NPUs, please refer to the dedicated guides for [GPU Sharing](../user_guide_how_to_use_gpu_sharing) and [vNPU](../user_guide_how_to_use_vnpu).
diff --git a/content/en/docs/user_guide_how_to_use_hcclrank_plugin.md b/content/en/docs/user_guide_how_to_use_hcclrank_plugin.md
@@ -0,0 +1,71 @@
++++
+title = "HCCLRank Plugin"
+
+date = 2026-05-11
+lastmod = 2026-05-11
+
+draft = false  
+toc = true  
+type = "docs"  
+
+linktitle = "HCCLRank"
+[menu.docs]
+  parent = "user-guide"
+  weight = 4
++++
+
+## Introduction
+
+In distributed AI training, particularly with Ascend NPUs (Neural Processing Units) or the MindSpore framework, each compute node needs a deterministic rank (index) to communicate over HCCL (Huawei Collective Communication Library).
+
+The **HCCLRank Plugin** is a Volcano Job plugin that automatically injects an `hccl/rankIndex` annotation into the Pods of a Volcano Job. It calculates a unique rank for each pod based on its task type (`master` or `worker`) and its replica index.
+
+## Mechanism
+
+During the Pod creation phase (`OnPodCreate`), the HCCLRank Plugin intercepts the pod and adds the `hccl/rankIndex` annotation to it.
+
+The calculation is as follows:
+- **Master Role**: Rank = Pod Index
+- **Worker Role**: Rank = (Total Master Replicas) + Pod Index
+
+If the Pod already has a `RANK` environment variable defined in its container specifications, the plugin will use that value instead and simply map it to the `hccl/rankIndex` annotation.
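+
+As a rough illustration of the calculation above, a Job named `ascend-distributed-training` with one `master` replica and two `worker` replicas would be expected to end up with annotations like the following. The pod names assume Volcano's usual `<job>-<task>-<index>` naming, and the fragment shows only the relevant metadata rather than full Pod objects.
+
+```yaml
+# master task, pod index 0: rank = 0
+metadata:
+  name: ascend-distributed-training-master-0
+  annotations:
+    hccl/rankIndex: "0"
+---
+# worker task, pod index 0: rank = 1 (master replicas) + 0
+metadata:
+  name: ascend-distributed-training-worker-0
+  annotations:
+    hccl/rankIndex: "1"
+---
+# worker task, pod index 1: rank = 1 (master replicas) + 1
+metadata:
+  name: ascend-distributed-training-worker-1
+  annotations:
+    hccl/rankIndex: "2"
+```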
+
+## Configuration
+
+To enable the HCCLRank plugin, configure it within the Volcano job controller's configuration or add it to the `plugins` field of your `VolcanoJob` spec.
+
+```yaml
+apiVersion: batch.volcano.sh/v1alpha1
+kind: Job
+metadata:
+  name: ascend-distributed-training
+spec:
+  minAvailable: 3
+  schedulerName: volcano
+  plugins:
+    hcclrank:
+      - --master=master
+      - --worker=worker
+  tasks:
+    - replicas: 1
+      name: master
+      template:
+        spec:
+          containers:
+            - name: master
+              image: my-ascend-image
+    - replicas: 2
+      name: worker
+      template:
+        spec:
+          containers:
+            - name: worker
+              image: my-ascend-image
+```
+
+### Arguments
+
+The HCCLRank plugin supports overriding the default task names used to identify master and worker roles:
+
+- **`--master`**: The name of the master role task in your Job spec. Default is `master`.
+- **`--worker`**: The name of the worker role task in your Job spec. Default is `worker`.
diff --git a/content/en/docs/user_guide_how_to_use_overcommit_plugin.md b/content/en/docs/user_guide_how_to_use_overcommit_plugin.md
@@ -0,0 +1,56 @@
++++
+title = "Overcommit Plugin"
+
+date = 2026-05-11
+lastmod = 2026-05-11
+
+draft = false  
+toc = true  
+type = "docs"  
+
+linktitle = "Overcommit"
+[menu.docs]
+  parent = "user-guide"
+  weight = 4
++++
+
+## Introduction
+
+By default, the scheduler computes idle resources strictly as physical node capacity minus allocated resources. When the cluster is nearly full, many PodGroups are therefore rejected at the enqueue stage and never enter the scheduling pipeline, which is undesirable if you want the scheduler to tolerate a larger backlog of `pending` pods.
+
+The **Overcommit Plugin** allows the scheduler to artificially inflate the apparent "idle resources" of the cluster by applying an `overcommit-factor`. This permits more jobs to be enqueued and wait in the scheduling pipeline than the physical resources might typically allow.
+
+## Mechanism
+
+The Overcommit plugin evaluates whether a job can be enqueued based on the requested `MinResources` of the PodGroup and the expanded idle resources.
+
+Expanded idle resource is calculated as:
+`Idle Resource = (Total Resource * overcommit-factor) - Used Resource`
+
+If the job's minimal requested resources can fit into this expanded idle resource pool, the job is permitted to be enqueued.
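+
+A worked example with made-up numbers: a cluster with 100 CPU cores in total, 90 of them in use, and an `overcommit-factor` of `1.2` has an expanded idle pool of `(100 * 1.2) - 90 = 30` cores, compared with 10 cores of real idle capacity. Following the formula above, a PodGroup like the sketch below (the name and sizes are hypothetical) would be permitted to enqueue because its `minResources` of 24 cores fits within 30, whereas one asking for 40 cores would not.
+
+```yaml
+apiVersion: scheduling.volcano.sh/v1beta1
+kind: PodGroup
+metadata:
+  name: example-podgroup   # hypothetical name
+spec:
+  minMember: 3
+  minResources:
+    cpu: "24"              # 24 <= (100 * 1.2) - 90 = 30, so the job can be enqueued
+```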
+
+## Configuration
+
+To use the Overcommit Plugin, add it to a plugin tier in your `volcano-scheduler-configmap`, keep the `enqueue` action in `actions`, and provide an `overcommit-factor`.
+
+```yaml
+actions: "enqueue, allocate, backfill"
+tiers:
+  - plugins:
+      - name: overcommit  # Enable the overcommit plugin
+        arguments:
+          overcommit-factor: 1.2  # The overcommit factor. Default is 1.2
+      - name: priority
+      - name: gang
+      - name: conformance
+  - plugins:
+      - name: drf
+      - name: predicates
+      - name: proportion
+      - name: nodeorder
+      - name: binpack
+```
+
+### Arguments
+
+- **`overcommit-factor`**: A float value greater than or equal to `1.0`. For example, `1.2` means the scheduler treats the cluster as having 20% more total resources when deciding whether to enqueue jobs into the pipeline. If a value less than `1.0` is provided, the plugin falls back to the default value of `1.2`.
diff --git a/content/en/docs/user_guide_how_to_use_pdb_plugin.md b/content/en/docs/user_guide_how_to_use_pdb_plugin.md
@@ -0,0 +1,55 @@
++++
+title = "Pod Disruption Budget (PDB) Plugin"
+
+date = 2026-05-11
+lastmod = 2026-05-11
+
+draft = false  
+toc = true  
+type = "docs"  
+
+linktitle = "Pod Disruption Budget"
+[menu.docs]
+  parent = "user-guide"
+  weight = 4
++++
+
+## Introduction
+
+When users deploy highly available jobs or applications on Volcano, they often need to limit the number of pod replicas that can be evicted or destroyed simultaneously to avoid downtime. This constraint is managed via Kubernetes **PodDisruptionBudget (PDB)** resources.
+
+The **PDB Plugin** ensures that Volcano respects user-defined PDB constraints during the scheduling process, specifically during eviction actions like `reclaim`, `preempt`, and `shuffle`.
+
+## Prerequisites
+
+- Your Kubernetes version must be 1.21 or later.
+- You must have created valid `PodDisruptionBudget` resources for your workloads.
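+
+For reference, a minimal `PodDisruptionBudget` might look like the sketch below. The name, label, and `minAvailable` value are placeholders, so adjust them to match your workload.
+
+```yaml
+apiVersion: policy/v1
+kind: PodDisruptionBudget
+metadata:
+  name: example-pdb          # placeholder name
+spec:
+  minAvailable: 2            # keep at least 2 matching pods available during evictions
+  selector:
+    matchLabels:
+      app: example-service   # placeholder label; must match the pods to protect
+```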
+
+## Mechanism
+
+The PDB Plugin registers several functions (`ReclaimableFn`, `PreemptableFn`, and `VictimTasksFn`) under the `reclaim`, `preempt`, and `shuffle` actions. It maintains a cache of PDBs using `v1.PodDisruptionBudgetLister`. 
+
+During eviction scenarios, the plugin filters out tasks whose eviction would violate the configured PDB constraints. It calculates the `DisruptedPods` (pods whose eviction was processed but not yet observed by the PDB controller) and ensures the remaining available replicas satisfy the budget.
+
+## Configuration
+
+To enable the PDB Plugin, update the `volcano-scheduler-configmap` to include the `pdb` plugin in your configuration tiers.
+
+```yaml
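+actions: "enqueue, allocate, preempt, reclaim, backfill, shuffle"  # keep enqueue/allocate so pods are still scheduled; pdb applies to the eviction actions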
+tiers:
+- plugins:
+  - name: pdb    # Enable the PDB plugin
+  - name: priority
+  - name: gang
+  - name: conformance
+- plugins:
+  - name: overcommit
+  - name: drf
+  - name: predicates
+  - name: proportion
+  - name: nodeorder
+  - name: binpack
+```
+
+> **Note:** The PDB plugin only takes effect when the scheduler runs an eviction action such as `reclaim`, `preempt`, or `shuffle`.
diff --git a/content/en/docs/user_guide_how_to_use_rescheduling_plugin.md b/content/en/docs/user_guide_how_to_use_rescheduling_plugin.md
@@ -0,0 +1,84 @@
++++
+title = "Rescheduling Plugin"
+
+date = 2026-05-11
+lastmod = 2026-05-11
+
+draft = false  
+toc = true  
+type = "docs"  
+
+linktitle = "Rescheduling"
+[menu.docs]
+  parent = "user-guide"
+  weight = 4
++++
+
+## Introduction
+
+Resource utilization across a Kubernetes cluster often becomes unbalanced because of suboptimal scheduling strategies, dynamic changes in job lifecycles, and node status changes (such as added or removed nodes, or taint and affinity modifications).
+
+The **Rescheduling** plugin addresses these issues by actively rebalancing resource utilization across nodes. It evaluates real resource utilization (via Prometheus metrics) rather than only the requested resource amounts, and it periodically evicts pods according to the configured rescheduling strategies.
+
+## Rescheduling Workflow
+
+1. **Resource Filter**: Filters workloads which are eligible to be evicted based on queues or labels.
+2. **Strategy Evaluation**: Evaluates filtered workloads against the configured rescheduling strategies to determine which ones should be evicted.
+3. **Eviction**: Evicts the pods attached to the identified workloads.
+4. **Periodic Execution**: Repeats the steps above at the configured interval.
+
+## Rescheduling Strategies
+
+Volcano's rescheduling plugin supports multiple strategies to select potential evictees:
+
+- **LowNodeUtilization**: Targets unbalanced nodes by evicting pods from highly utilized nodes so they can be rescheduled onto under-utilized nodes, based on the configured thresholds and target thresholds.
+- **OfflineOnly (OLO)**: Only selects offline workloads (annotated with `preemptable: true`) for rescheduling.
+- **LowPriorityFirst (LPF)**: Sorts workloads by priority and evicts lower-priority pods first.
+- **ShortLifeTimeFirst (SLTF)**: Sorts workloads by running time. Pods with the shortest lifetime are rescheduled first so that long-running workloads are not interrupted.
+- **BigObjectFirst (BOF)**: Selects the workloads that request the most of the dominant resource and reschedules them first, to improve system throughput and avoid starving small workloads.
+- **MoreReplicasFirst (MRF)**: Sorts workloads by replica count. Workloads with the most replicas are rescheduled first, which is friendly to `gang` scheduling because `minAvailable` is taken into account.
+
+## Configuration
+
+To enable the Rescheduling plugin, you must configure the `volcano-scheduler-configmap` by adding the `shuffle` action and configuring the `rescheduling` plugin within the tiers.
+
+```yaml
+actions: "enqueue, allocate, backfill, shuffle"  ## Add 'shuffle' action
+tiers:
+  - plugins:
+      - name: priority
+      - name: gang
+      - name: conformance
+      - name: rescheduling       ## Rescheduling plugin
+        arguments:
+          interval: 5m           ## Optional. Frequency at which the strategies are called. Default is 5m.
+          metricsPeriod: 5m      ## Optional. The duration of metrics to consider. Default is 5m.
+          strategies:            ## Required. Strategies to execute in order.
+            - name: offlineOnly
+            - name: lowPriorityFirst
+            - name: lowNodeUtilization
+              params:
+                thresholds:
+                  "cpu": 20      ## Threshold below which a node is considered under-utilized
+                  "memory": 20
+                  "pods": 20
+                targetThresholds:
+                  "cpu": 50      ## Target utilization to reach for balance
+                  "memory": 50
+                  "pods": 50
+          queueSelector:         ## Optional. Select workloads in specified queues as potential evictees. All queues by default.
+            - default
+            - test-queue
+          labelSelector:         ## Optional. Select workloads with specified labels as potential evictees. All labels by default.
+            business: offline
+            team: test
+  - plugins:
+      - name: overcommit
+      - name: drf
+      - name: predicates
+      - name: proportion
+      - name: nodeorder
+      - name: binpack
+```
+
+> **Note:** Rescheduling decisions are based on metrics collected from Prometheus. Make sure metrics collection is configured correctly, because the plugin evaluates real node resource utilization rather than requested resource amounts.