HQQ worked well with Qwen3 until I got to the 32B model. The table below shows the results. Did you happen to run it before?
| Model | FP16 PPL | HQQ PPL | Increase | Pattern |
|---|---|---|---|---|
| 0.6B | 26.16 | 32.22 | +6.06 | |
| 8B | 12.19 | 12.51 | +0.32 | |
| 14B | 10.78 | 11.09 | +0.31 | |
| 32B | 9.31 | 11.32 | +2.01 | |
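The absolute PPL increases in the table understate how anomalous the 32B result is; a quick computation of the relative degradation (using only the values from the table above) shows the 32B model regresses by roughly the same proportion as the tiny 0.6B model, while 8B and 14B stay under 3%:

```python
# Relative HQQ-vs-FP16 PPL degradation, computed from the table above.
results = {
    "0.6B": (26.16, 32.22),
    "8B": (12.19, 12.51),
    "14B": (10.78, 11.09),
    "32B": (9.31, 11.32),
}

for model, (fp16_ppl, hqq_ppl) in results.items():
    rel_increase = (hqq_ppl / fp16_ppl - 1) * 100
    print(f"{model}: +{rel_increase:.1f}%")
```

This prints roughly +23.2%, +2.6%, +2.9%, and +21.6% respectively, which suggests the 32B checkpoint is an outlier rather than a smooth size-dependent trend.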
Here is the CLI command used to measure PPL:
```shell
CUDA_VISIBLE_DEVICES=2,3 VLLM_WORKER_MULTIPROC_METHOD=spawn python -m lm_eval --num_fewshot 0 --seed 42 --model vllm --model_args pretrained=/models/Qwen3-32B-hqq/,dtype=half,tensor_parallel_size=2,enforce_eager=True,gpu_memory_utilization=0.8 --tasks wikitext --batch_size 2
```