Commit 6dae487

feat: Add USE_GCLOUD_STORAGE_RSYNC=1 to Cloud Batch Jobs (#5017)
### Description

This PR enables the `USE_GCLOUD_STORAGE_RSYNC=1` environment variable for all containers running as part of a Cloud Batch job. The change is intended to improve the performance and reliability of GCS operations within these jobs by enabling `gcloud storage rsync`.

#### Context

An investigation into the Cloud Batch infrastructure revealed that the environment variables for containers are not set via `user-data` scripts or instance templates. Instead, they are hardcoded in the `docker run` options string that the ClusterFuzz application builds when creating a Batch job. The `user-data` field in `batch.yaml` files is ignored, and the GCE instance templates defined in Terraform are used only for persistent bots, not for ephemeral Batch VMs.

#### Changes

This PR makes a single, targeted change to inject the environment variable in the place where it takes effect:

- **Modified `src/clusterfuzz/_internal/google_cloud_utils/batch.py`**:
  - The `-e USE_GCLOUD_STORAGE_RSYNC=1` flag has been added to the `runnable.container.options` string within the `_get_task_spec` function.
1 parent 1bf69bb commit 6dae487

1 file changed: +1 -0 lines changed
src/clusterfuzz/_internal/google_cloud_utils/batch.py

Lines changed: 1 addition & 0 deletions
@@ -133,6 +133,7 @@ def _get_task_spec(batch_workload_spec):
 '-e HOST_UID=1337 -P --privileged --cap-add=all '
 f'-e CLUSTERFUZZ_RELEASE={clusterfuzz_release} '
 '--name=clusterfuzz -e UNTRUSTED_WORKER=False -e UWORKER=True '
+'-e USE_GCLOUD_STORAGE_RSYNC=1 '
 '-e UWORKER_INPUT_DOWNLOAD_URL')
 runnable.container.volumes = ['/var/scratch0:/mnt/scratch0']
 task_spec = batch.TaskSpec()
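
To make the effect of the added flag concrete, here is a minimal sketch that parses an options string like the one in the diff and lists the environment variables it would hand to the container. It is not part of the commit: `parse_env_flags` is a hypothetical helper, the `clusterfuzz_release` value is a placeholder, and the string reproduces only the fragment visible in the hunk (the real string may begin earlier in `_get_task_spec`). The sketch relies on the standard `docker run` behaviour that `-e NAME=value` sets a variable explicitly, while a bare `-e NAME` forwards the value from the host environment.

```python
import shlex


def parse_env_flags(options):
    """Map each `-e`/`--env` flag in a docker-run options string to its value.

    A value of None means the flag had no `=value` part, in which case
    Docker forwards the variable from the host environment.
    Hypothetical helper for illustration; not part of ClusterFuzz.
    """
    tokens = shlex.split(options)
    env = {}
    i = 0
    while i < len(tokens):
        if tokens[i] in ('-e', '--env') and i + 1 < len(tokens):
            name, sep, value = tokens[i + 1].partition('=')
            env[name] = value if sep else None
            i += 2  # skip the flag and its argument
        else:
            i += 1  # unrelated option such as --privileged
    return env


# Placeholder value; the real one is computed inside _get_task_spec.
clusterfuzz_release = 'prod'

# Only the fragment of the options string shown in the diff above.
options = (
    '-e HOST_UID=1337 -P --privileged --cap-add=all '
    f'-e CLUSTERFUZZ_RELEASE={clusterfuzz_release} '
    '--name=clusterfuzz -e UNTRUSTED_WORKER=False -e UWORKER=True '
    '-e USE_GCLOUD_STORAGE_RSYNC=1 '
    '-e UWORKER_INPUT_DOWNLOAD_URL')

env = parse_env_flags(options)
for name, value in env.items():
    print(name, '=', value if value is not None else '<inherited from host>')
```

A check along these lines could also serve as a unit test guarding against the flag being dropped from the options string in a later refactor.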
