Skip to content

refactor: consolidate e2e tests and add unit tests#435

Open
k82cn wants to merge 8 commits intoxflops:mainfrom
k82cn:flm_test_0506
Open

refactor: consolidate e2e tests and add unit tests#435
k82cn wants to merge 8 commits intoxflops:mainfrom
k82cn:flm_test_0506

Conversation

@k82cn
Copy link
Copy Markdown
Contributor

@k82cn k82cn commented May 6, 2026

Summary

  • Consolidate e2e Python tests from 14 to 7 files for better maintainability
  • Add unit tests for session_manager storage layer, executor_manager, object_cache, and SDK

E2E Test Consolidation

New File Merged From
test_session.py test_session_management.py + test_open_session.py + test_batch.py
test_cache.py test_cache.py + test_cache_lru.py
test_runner.py test_runner.py + test_flmrun.py + test_get_data.py + test_drf.py
test_failure_recovery.py test_failure_recovery.py + test_shim_selection.py
test_core.py Kept as-is
test_agent.py Kept as-is
test_application.py Kept as-is

New Unit Tests

  • session_manager/src/storage/*_tests.rs - Storage layer unit tests
  • executor_manager/src/executor.rs - Executor unit tests
  • executor_manager/src/manager.rs - Manager unit tests
  • object_cache/src/cache.rs - Cache unit tests
  • sdk/python/tests/test_*.py - Python SDK tests
  • sdk/rust/src/client/mod.rs - Rust SDK tests

k82cn added 2 commits May 6, 2026 21:21
- Consolidate e2e Python tests from 14 to 7 files:
  - test_session.py: merged session_management, open_session, batch tests
  - test_cache.py: merged cache and cache_lru tests
  - test_runner.py: merged runner, flmrun, get_data, drf tests
  - test_failure_recovery.py: merged failure_recovery and shim_selection tests

- Add unit tests for session_manager storage layer
- Add unit tests for executor_manager
- Add unit tests for object_cache
- Add unit tests for SDK (Python and Rust)
Fix clippy field_reassign_with_default error in application_tests.rs
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces several new end-to-end tests for the Flame system, covering LRU cache eviction policies, task failure recovery, session management, and shim selection logic. The review feedback highlights the need for more robust test configurations, such as using environment variable overrides for cache limits instead of relying on defaults, replacing hardcoded sleep intervals with polling mechanisms to reduce flakiness, and replacing magic numbers with constants for better maintainability.

Comment thread e2e/tests/test_cache.py
class TestLRUEviction:
"""Test suite for LRU eviction policy behavior."""

def test_basic_eviction_on_memory_limit(self):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The test test_basic_eviction_on_memory_limit does not explicitly configure the cache memory limit, making it dependent on the environment's default configuration. Please use monkeypatch.setenv("FLAME_CACHE_MAX_MEMORY", "1M") to ensure the test consistently triggers eviction.

    def test_basic_eviction_on_memory_limit(self, monkeypatch):
        monkeypatch.setenv("FLAME_CACHE_MAX_MEMORY", "1M")

refreshed_task = session.get_task(task.id)

# Check events for error information
error_events = [e for e in refreshed_task.events if e.code == TaskState.FAILED or e.code == 3]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The magic number 3 is used to check for task failure. Please define a constant for this error code to improve maintainability.

Comment thread e2e/tests/test_session.py
task1 = session.create_task(serialize_request(request1))
task1_id = task1.id

time.sleep(3)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Hardcoded time.sleep(3) can lead to flaky tests. Please replace this with a polling mechanism that checks for the expected state change.

Suggested change
time.sleep(3)
# Wait for task to be pending
for _ in range(10):
if session.get_task(task1_id).state == TaskState.PENDING:
break
time.sleep(0.5)

@codecov
Copy link
Copy Markdown

codecov Bot commented May 6, 2026

k82cn added 6 commits May 6, 2026 21:49
The TestFlmrunApplication tests require /opt/e2e which only exists
in Docker E2E environment.
- Fix Rust SDK serde_message::deserialize to handle null values for
  common_data field by deserializing to Option<String>
- Export ResourceRequirement in flamepy package for Python E2E tests
…ility

The fairshare plugin's is_available check requires ssn.slots == exec.slots,
but the SDK sets slots=0 when resreq is provided. This causes sessions with
explicit resreq to never match any executor.

Skip this test until the fairshare plugin is updated to handle resreq-based
allocation.
- Remove fairshare scheduler plugin (slots-based allocation)
- Replace with DRF (Dominant Resource Fairness) for resreq-based allocation
- Update create_executor to use session's resreq directly when specified
- Update all config files to use drf instead of fairshare
- Update DEFAULT_POLICIES to [priority, drf, gang]
- Update tests to reflect new behavior (all executors available by default)
- Re-enable test_session_with_resreq test

This simplifies the scheduling model by using resource requirements directly
instead of the abstract slots concept. DRF provides fair multi-resource
allocation based on actual CPU/memory/GPU requirements.
- Regenerate Python protobuf files for types.proto changes (removed slots, resreq now required)
- Fix ruff lint: import sorting in agent/client.py
- Fix ruff lint: add noqa comments for mock method names matching protobuf interface
- Fix ruff format: 3 test files reformatted
- Remove backend.proto from Python SDK (internal use only)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant