Release SuperBench v0.7.0

abuccts released this 20 Jan 05:04

d76e4e1

SuperBench v0.7.0 Release Notes

SuperBench Improvements

Return a non-zero exit code when "sb deploy" or "sb run" fails in Ansible.
Support log flushing to the result file during runtime.
Update version to include revision hash and date.
Support "pattern" in mpi mode to run tasks in parallel.
Support topo-aware, all-pair, and K-batch patterns in mpi mode.
Pin the Transformers version to avoid TensorRT failure.
Add CUDA 11.8 Docker image for NVIDIA arch90 GPUs.
Support "sb deploy" without pulling image.
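
With "sb deploy" and "sb run" now returning a non-zero code on failure, a wrapper script can stop at the first failing step. The following is a minimal sketch of that idea and not part of SuperBench; `run_step` and `run_pipeline` are hypothetical helper names, and the commands are passed in by the caller rather than assumed here.

```python
import subprocess
import sys


def run_step(cmd):
    """Run one command (a list of arguments) and return its exit code."""
    completed = subprocess.run(cmd, check=False)
    return completed.returncode


def run_pipeline(steps):
    """Run commands in order; stop and return the first non-zero exit code.

    Returns 0 only if every step succeeded.
    """
    for cmd in steps:
        code = run_step(cmd)
        if code != 0:
            return code
    return 0


if __name__ == "__main__":
    # Placeholder steps: replace with the actual "sb deploy" and "sb run"
    # invocations and whatever arguments your setup requires.
    placeholder_steps = [
        [sys.executable, "-c", "print('deploy step')"],
        [sys.executable, "-c", "print('run step')"],
    ]
    sys.exit(run_pipeline(placeholder_steps))
```

Propagating the code through `sys.exit` lets a CI job or scheduler mark the whole run as failed without parsing logs.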

Micro-benchmark Improvements

Support a list of custom config strings in cudnn-functions and cublas-functions.
Support correctness check in cublas-functions.
Support GEMM-FLOPS for NVIDIA arch90 GPUs.
Support cuBLASLt FP16 and FP8 GEMM.
Add a wait-time option to resolve unstable mem-bw results.
Fix incorrect datatype check in the cublas-functions source code.

Model Benchmark Improvements

Support FP8 in BERT model training.

Distributed Benchmark Improvements

Support pair-wise pattern in IB validation benchmark.
Support topo-aware, pair-wise, and K-batch patterns in nccl-bw benchmark.
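
To illustrate what a pair-wise pattern means in general, the sketch below schedules every pair of nodes exactly once, grouped into rounds whose pairs are disjoint so they can run in parallel. It uses the standard round-robin (circle) method and is an illustration of the concept only, not SuperBench's implementation; `pair_wise_rounds` is a hypothetical name.

```python
def pair_wise_rounds(nodes):
    """Group all unordered node pairs into rounds of disjoint pairs.

    Uses the round-robin circle method: fix the first item and rotate
    the rest. With an odd node count, one node sits out each round.
    """
    items = list(nodes)
    if len(items) % 2:
        items.append(None)  # placeholder: its partner idles this round
    n = len(items)
    rounds = []
    for _ in range(n - 1):
        # Pair the i-th item from the front with the i-th from the back.
        pairs = [(items[i], items[n - 1 - i]) for i in range(n // 2)]
        rounds.append([p for p in pairs if None not in p])
        # Rotate everything except the first item by one position.
        items = [items[0], items[-1]] + items[1:-1]
    return rounds


if __name__ == "__main__":
    for index, pairs in enumerate(pair_wise_rounds(["node0", "node1", "node2", "node3"])):
        print(index, pairs)
```

Because no node appears twice within a round, the pairs in a round do not compete for the same endpoint, which is what makes running them concurrently meaningful for bandwidth validation.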
