Code cleanups and support for tile resource and related telemetry originated node labels.
Co-authored-by: Tuomas Katila <[email protected]>
Co-authored-by: Ukri Niemimuukko <[email protected]>
gpu-aware-scheduling/README.md
21 additions & 10 deletions
@@ -1,7 +1,7 @@
# GPU Aware Scheduling

GPU Aware Scheduling (GAS) allows using GPU resources, such as the amount of memory, for scheduling decisions in Kubernetes. It is used to optimize scheduling decisions when the pod resource requirements include the use of several GPUs, or fragments of GPUs, on a node, instead of the traditional mapping of one whole GPU to a pod.

GPU Aware Scheduling is deployed in a single pod on a Kubernetes cluster.

**This software is a pre-production alpha version and should not be deployed to production servers.**
@@ -22,8 +22,9 @@ GAS tries to be agnostic about resource types. It doesn't try to have an underst

GAS heavily utilizes annotations. After making a filtering decision on a pod, GAS annotates the pod with a precise timestamp in an annotation named "gas-ts". The timestamp can then be used to figure out the time order of GAS-made scheduling decisions, for example during the GPU plugin resource allocation phase, if the GPU plugin wants to know the order in which GPU-resource-consuming pods were deployed on the node. Another annotation which GAS adds is "gas-container-cards". It holds the names of the cards selected for the containers. Containers are separated by "|", and card names are separated by ",". Thus a two-container pod in which both containers use two GPUs could get the annotation "card0,card1|card2,card3". These annotations are then consumed by the Intel GPU device plugin.

+Along with the "gas-container-cards" annotation there can be a "gas-container-tiles" annotation. This annotation is created when a container requests tile resources (gpu.intel.com/tiles). The gtX marking for tiles follows the sysfs entries under /sys/class/drm/cardX/gt/, where "cardX" can be any card in the system. The "gas-container-tiles" annotation marks the card+tile combinations assigned to each container. For example, a two-container pod's annotation could be "card0:gt0+gt1|card0:gt2+gt3", where each container gets two tiles from the same GPU. The tile annotation is then converted to corresponding environment variables by the GPU plugin.
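For illustration, a two-container pod that GAS has scheduled could end up carrying annotations along these lines; the pod and container names and the timestamp value are hypothetical, while the annotation names and value formats follow the description above:

```
apiVersion: v1
kind: Pod
metadata:
  name: gpu-demo-pod                                   # hypothetical name
  annotations:
    gas-ts: "1668678593212899420"                      # precise timestamp added by GAS (illustrative value)
    gas-container-cards: "card0|card1"                 # one card per container, containers separated by "|"
    gas-container-tiles: "card0:gt0+gt1|card1:gt0+gt1" # two tiles per container on the selected cards
spec:
  containers:
  - name: worker-1
    image: busybox
  - name: worker-2
    image: busybox
```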

GAS also expects labels to be in place for the nodes, in order to keep track of the cluster's GPU resource status. Nodes with GPUs shall be labeled with the label name "gpu.intel.com/cards", with a value of the form "card0.card1.card2.card3"..., where the card names match the Intel GPUs currently found under the /sys/class/drm folder and the dot serves as the separator. GAS expects all GPUs of the same node to be homogeneous in their resource capacity, and it calculates the GPU extended resource capacity as evenly distributed across the GPUs listed by that label.
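As a sketch of the expected labeling, a node with two Intel GPUs visible as card0 and card1 under /sys/class/drm could carry a label like the one below; the node name is hypothetical, and in a real cluster the label is typically produced by the NFD and GPU plugin setup referenced in the next section rather than added by hand:

```
apiVersion: v1
kind: Node
metadata:
  name: gpu-node-1                      # hypothetical node name
  labels:
    gpu.intel.com/cards: card0.card1    # dot-separated card names matching /sys/class/drm
```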

## Usage with NFD and the GPU-plugin

A worked example for GAS is available [here](docs/usage.md)
@@ -37,17 +38,18 @@ You should follow extender configuration instructions from the
use GPU Aware Scheduling configurations, which can be found in the [deploy/extender-configuration](deploy/extender-configuration) folder.

#### Deploy GAS

-GPU Aware Scheduling uses go modules. It requires Go 1.16 with modules enabled in order to build. GAS has been tested with Kubernetes 1.22.
+GPU Aware Scheduling uses go modules. It requires Go 1.17 with modules enabled in order to build. GAS has been tested with Kubernetes 1.22.

A yaml file for GAS is contained in the deploy folder along with its service and RBAC roles and permissions.

-**Note:** If run without the unsafe flag a secret called extender-secret will need to be created with the cert and key for the TLS endpoint.
-GAS will not deploy if there is no secret available with the given deployment file.
+A secret called extender-secret will need to be created with the cert and key for the TLS endpoint. GAS will not deploy if there is no secret available with the given deployment file.
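One way to create that secret is sketched below, assuming a standard Kubernetes TLS secret; check the deployment yaml in the deploy folder for the exact namespace and key names it expects:

```
apiVersion: v1
kind: Secret
metadata:
  name: extender-secret                 # the name the provided deployment looks for
type: kubernetes.io/tls
data:
  tls.crt: <base64-encoded certificate> # placeholder
  tls.key: <base64-encoded private key> # placeholder
```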
@@ ... @@
+|balancedResource| string | enable named resource balancing between GPUs | --balancedResource | ""

+#### Balanced resource (optional)

+GAS can be configured to balance named resources so that the resource requests are distributed as evenly as possible between the GPUs. For example, if the balanced resource is set to "tiles" and the containers request one tile each, the first container could get a tile from "card0", the second from "card1", the third again from "card0", and so on.
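As a hypothetical sketch of enabling this, the GAS deployment's pod template could pass the flag from the table above; the container name is illustrative and only the args entry matters:

```
      containers:
      - name: gas-extender                 # illustrative container name
        image: intel/gpu-extender
        args:
        - --balancedResource=tiles         # distribute gpu.intel.com/tiles requests evenly across GPUs
```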

## Adding the resource to make a deployment use GAS Scheduler Extender

For example, in a deployment file:

```
-apiVersion: extensions/v1beta1
+apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app
```
@@ -93,7 +98,7 @@ spec:
```
  replicas: 1
  selector:
    matchLabels:
      app: demo
  template:
    metadata:
      labels:
```
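The hunk above ends before the container spec. For illustration, a container in such a deployment could request GPU resources roughly as follows so that the deployment is handled by GAS; gpu.intel.com/tiles comes from the tile description above, while gpu.intel.com/i915 is assumed to be the GPU resource name exposed by the Intel GPU device plugin:

```
    spec:
      containers:
      - name: demo
        image: demo-app:latest             # hypothetical image
        resources:
          limits:
            gpu.intel.com/i915: 1          # assumed Intel GPU plugin resource name
            gpu.intel.com/tiles: 2         # tile resource described earlier
```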
@@ -123,6 +128,12 @@ GAS Scheduler Extender is set up to use in-cluster config in order to access the
Additionally, GAS Scheduler Extender listens on a TLS endpoint, which requires a cert and a key to be supplied.
These are passed to the executable using command line flags. In the provided deployment these certs are added in a Kubernetes secret which is mounted in the pod and passed as flags to the executable from there.
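A rough sketch of that wiring is below; the volume name, mount path, and flag names are placeholders, and the authoritative values are in the deployment yaml under the deploy folder:

```
      containers:
      - name: gas-extender
        image: intel/gpu-extender
        args:
        - --cert=/certs/tls.crt            # placeholder flag name, see the deploy folder
        - --key=/certs/tls.key             # placeholder flag name
        volumeMounts:
        - name: certs
          mountPath: /certs                # placeholder mount path
      volumes:
      - name: certs
        secret:
          secretName: extender-secret      # the secret described above
```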
+## License
+
+[Apache License, Version 2.0](./LICENSE). All of the source code required to build GPU Aware Scheduling is available under open source
+licenses. The source code files identify the external Go modules used. The binary is distributed as a container image on
+[DockerHub](https://hub.docker.com/r/intel/gpu-extender). The container image contains license texts under the folder `/licenses`.

## Communication and contribution

Report a bug by [filing a new issue](https://github.com/intel/platform-aware-scheduling/issues).