Hello, I successfully installed flash-rl following your instructions and passed the test to verify correctness.
I then ran the script you provided directly:
bash recipe/flash_rl/gsm8k_qwen0_5b_int8.sh flash-int8-TIS-2 2
The process hangs after the Ray instance is started:
2025-08-22 10:54:24,273 - numexpr.utils - INFO - Note: detected 384 virtual cores but NumExpr set to maximum of 64, check "NUMEXPR_MAX_THREADS" environment variable.
2025-08-22 10:54:24,273 - numexpr.utils - INFO - Note: NumExpr detected 384 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 16.
2025-08-22 10:54:24,273 - numexpr.utils - INFO - NumExpr defaulting to 16 threads.
DEBUG 08-22 10:54:24 [init.py:28] No plugins for group vllm.platform_plugins found.
DEBUG 08-22 10:54:24 [init.py:34] Checking if TPU platform is available.
DEBUG 08-22 10:54:24 [init.py:44] TPU platform is not available because: No module named 'libtpu'
DEBUG 08-22 10:54:24 [init.py:52] Checking if CUDA platform is available.
DEBUG 08-22 10:54:24 [init.py:72] Confirmed CUDA platform is available.
DEBUG 08-22 10:54:24 [init.py:100] Checking if ROCm platform is available.
DEBUG 08-22 10:54:24 [init.py:114] ROCm platform is not available because: No module named 'amdsmi'
DEBUG 08-22 10:54:24 [init.py:122] Checking if HPU platform is available.
DEBUG 08-22 10:54:24 [init.py:129] HPU platform is not available because habana_frameworks is not found.
DEBUG 08-22 10:54:24 [init.py:140] Checking if XPU platform is available.
DEBUG 08-22 10:54:24 [init.py:150] XPU platform is not available because: No module named 'intel_extension_for_pytorch'
DEBUG 08-22 10:54:24 [init.py:158] Checking if CPU platform is available.
DEBUG 08-22 10:54:24 [init.py:180] Checking if Neuron platform is available.
DEBUG 08-22 10:54:24 [init.py:187] Neuron platform is not available because: No module named 'transformers_neuronx'
DEBUG 08-22 10:54:24 [init.py:52] Checking if CUDA platform is available.
DEBUG 08-22 10:54:24 [init.py:72] Confirmed CUDA platform is available.
INFO 08-22 10:54:24 [init.py:239] Automatically detected platform cuda.
2025-08-22 10:54:26,762 - flash_rl.vllm_patch - DEBUG - Successfully patched the process_weights_after_loading function of vllm
2025-08-22 10:54:26,762 - flash_rl.vllm_patch - DEBUG - Successfully patched kv_cache process_weights_after_loading
2025-08-22 10:54:26,762 - flash_rl - DEBUG - Patching vllm process_weights_after_loading... status: True
2025-08-22 10:54:26,762 - flash_rl.vllm_patch - DEBUG - Successfully patched vllm
2025-08-22 10:54:26,762 - flash_rl - DEBUG - Patching the vllm LLM to enable flash_rl quantization... status: True
2025-08-22 10:54:27,353 - hydra.core.utils - DEBUG - Setting JobRuntime:name=UNKNOWN_NAME
2025-08-22 10:54:27,353 - hydra.core.utils - DEBUG - Setting JobRuntime:name=main_ppo
2025-08-22 10:54:34,306 INFO worker.py:1832 -- Started a local Ray instance. View the dashboard at http://127.0.0.1:8265
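As an aside, the NumExpr notice at the top of the log is likely unrelated to the hang; it can be silenced by capping the thread count explicitly before launching the script (64 here matches the limit NumExpr reports on this 384-core machine, so adjust as needed):

```shell
# Cap NumExpr's thread pool so it stops warning about the 384 detected
# cores; 64 is the maximum NumExpr will use on this machine anyway.
export NUMEXPR_MAX_THREADS=64
```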
The experiment is run on 8 GPUs on a single node, with vllm==0.8.4.
Do you have any idea why the process keeps hanging?
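For reference, a minimal way to see where the driver is stuck, run from a second terminal (py-spy is an assumption here, installable with pip install py-spy; "main_ppo" matches the Hydra job name shown in the log above):

```shell
# Two quick checks on a hung Ray job, run from a second terminal.
ray status || true                    # does the cluster see all 8 GPUs?
pid=$(pgrep -f main_ppo | head -n 1)  # find the hung driver, if any
if [ -n "$pid" ]; then
  py-spy dump --pid "$pid"            # print its current Python stack
else
  echo "no main_ppo process found"
fi
```

The py-spy stack dump usually shows immediately whether the driver is blocked waiting on Ray actor creation (often a resource-allocation mismatch) or somewhere else.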