Skip to content

Conversation

@gflarity
Copy link
Contributor

@gflarity gflarity commented Nov 3, 2025

What type of PR is this?

Tests for gang scheduling as per the 'Grove Multi-Node QA Test Plan'.

What this PR does / why we need it:

Implements e2e tests for gang scheduling test scheduling.

Which issue(s) this PR fixes:

Special notes for your reviewer:

Does this PR introduce a API change?

NONE

Additional documentation e.g., enhancement proposals, usage docs, etc.:


@gflarity gflarity force-pushed the gflarity/e2e_tests_gang_scheduling branch from b0f7495 to 6ca3861 Compare November 3, 2025 19:17
Copy link

@shayasoolin shayasoolin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed the infra code (shared_cluster.go and setup.go) changes, and the first new test in gang_scheduling_test.go.
I stopped there since my main comment is that I strongly suggest to invest time on having as many utility test functions as possible, and then reuse them for the test cases themselves.
We'll not just gain reusability and make it easier to maintain, but the test cases themselves will get much clearer as they will show as high-level function calls that look ideally as the comment above the test case: a list of steps.
Since this requires a non-trivial refactoring, I prefer to review the test logic itself after it's done. Thanks!

@gflarity gflarity force-pushed the gflarity/e2e_tests_gang_scheduling branch from 94db608 to dfc20b3 Compare November 4, 2025 21:43
@gflarity
Copy link
Contributor Author

gflarity commented Nov 4, 2025

Thanks @shayasoolin. I abstracted as much as I could into utility functions. PTAL.

shayasoolin
shayasoolin previously approved these changes Nov 5, 2025
@gflarity
Copy link
Contributor Author

gflarity commented Nov 6, 2025

@shayasoolin @sanjaychatterjee updated to incorporate last suggestions. PTAL.

shayasoolin
shayasoolin previously approved these changes Nov 6, 2025
@gflarity gflarity force-pushed the gflarity/e2e_tests_gang_scheduling branch from ac591b6 to 52e7b92 Compare November 6, 2025 15:08
julienmancuso
julienmancuso previously approved these changes Nov 18, 2025
Copy link
Contributor

@julienmancuso julienmancuso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 nits, otherwise lgtm. Great job

@gflarity gflarity force-pushed the gflarity/e2e_tests_gang_scheduling branch from 049b9a6 to 4cf27c8 Compare November 19, 2025 20:03
@gflarity gflarity merged commit 0617444 into ai-dynamo:main Nov 20, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants