Releases
v0.7.0
Release SuperBench v0.7.0
Compare
Sorry, something went wrong.
No results found
SuperBench v0.7.0 Release Notes
SuperBench Improvements
Support non-zero return code when "sb deploy" or "sb run" fails in Ansible.
Support log flushing to the result file during runtime.
Update version to include revision hash and date.
Support "pattern" in mpi mode to run tasks in parallel.
Support topo-aware, all-pair, and K-batch pattern in mpi mode.
Fix Transformers version to avoid Tensorrt failure.
Add CUDA11.8 Docker image for NVIDIA arch90 GPUs.
Support "sb deploy" without pulling image.
Micro-benchmark Improvements
Support list of custom config string in cudnn-functions and cublas-functions.
Support correctness check in cublas-functions.
Support GEMM-FLOPS for NVIDIA arch90 GPUs.
Support cuBLASLt FP16 and FP8 GEMM.
Add wait time option to resolve mem-bw unstable issue.
Fix bug for incorrect datatype judgement in cublas-function source code.
Model Benchmark Improvements
Support FP8 in BERT model training.
Distributed Benchmark Improvements
Support pair-wise pattern in IB validation benchmark.
Support topo-aware, pair-wise, and K-batch pattern in nccl-bw benchmark.
You can’t perform that action at this time.