Commit 562df9b (1 parent: aef0c7c)

spell fix


1 file changed: +1 −1 lines changed

1 file changed

+1
-1
lines changed

torchao_float8/README.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -30,7 +30,7 @@ As a first, and possibly only step, we use the GPT-Fast benchmark provided by To
 
 ## Torch Memory Profile
 
-> **TLDR**: The dequantization of weights if FP8WeightsOnly config is not fused with GEMV computations. This leads to spike in GPU VRAM usage.
+> **TLDR**: The dequantization of weights in FP8WeightsOnly config is not fused with GEMV computations. This leads to spike in GPU VRAM usage.
 
 The 0-th inference iteration is profiled using a CUDA memory snapshot. The snapshots are available at the following paths: `llama_benchmark/Meta-Llama-3.1-8B_None_torch_memory_profiler.pickle`, `llama_benchmark/Meta-Llama-3.1-8B_float8dq-tensor_torch_memory_profiler.pickle`, `llama_benchmark/Meta-Llama-3.1-8B_float8wo_torch_memory_profiler.pickle`.
 
```
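The VRAM spike described in the TLDR can be understood by replaying allocation events: when dequantization is not fused with the GEMV, a full higher-precision copy of the weights is materialized before the matmul, so peak usage briefly exceeds the steady state. The sketch below is a minimal, hypothetical illustration — it uses made-up `(action, size)` event pairs and example sizes, not the actual PyTorch memory-snapshot schema or the real model's numbers.

```python
def peak_vram(events):
    """Replay a stream of ("alloc"/"free", size_bytes) events and
    return the peak cumulative memory usage in bytes."""
    current = peak = 0
    for action, size in events:
        if action == "alloc":
            current += size
        elif action == "free":
            current -= size
        peak = max(peak, current)
    return peak

# Hypothetical dequantize-then-GEMV pattern: the dequantized copy of the
# weights is fully materialized before the matmul runs, then freed.
events = [
    ("alloc", 8 * 2**30),   # FP8 weights resident (8 GiB, illustrative)
    ("alloc", 16 * 2**30),  # dequantized FP16 copy (16 GiB) -- the spike
    ("alloc", 64 * 2**20),  # activations (64 MiB)
    ("free", 16 * 2**30),   # dequantized copy released after the GEMV
]
print(peak_vram(events) / 2**30)  # → 24.0625 (GiB at peak)
```

A fused kernel would dequantize tiles of the weight on the fly inside the GEMV, so the 16 GiB intermediate in this toy trace would never be allocated and the peak would stay near the FP8 footprint.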
