Sehyo/Qwen3.5-122B-A10B-NVFP4
Model Documentation
Qwen3.5-122B-A10B-NVFP4
This is a quantized version of Qwen/Qwen3.5-122B-A10B using the NVFP4 quantization scheme.
Serving this model requires a nightly build of vLLM.
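As a rough sketch of what loading the checkpoint might look like with vLLM's offline Python API, assuming a nightly vLLM build is installed and enough GPU memory is available. The `generate` helper, the `tensor_parallel_size` default, and the sampling settings are illustrative assumptions, not values from this model card.

```python
MODEL_ID = "Sehyo/Qwen3.5-122B-A10B-NVFP4"


def generate(prompt, tensor_parallel_size=1, max_tokens=128):
    """Hypothetical helper: load the model with vLLM and complete one prompt."""
    # Imported lazily so this file can be read or imported without vLLM installed.
    from vllm import LLM, SamplingParams

    # tensor_parallel_size is an assumption; set it to the number of GPUs you shard across.
    llm = LLM(model=MODEL_ID, tensor_parallel_size=tensor_parallel_size)
    params = SamplingParams(temperature=0.7, max_tokens=max_tokens)

    # llm.generate returns one RequestOutput per prompt; take the first completion's text.
    outputs = llm.generate([prompt], params)
    return outputs[0].outputs[0].text
```

Calling `generate("Explain NVFP4 in one sentence.")` would download the weights on first use, so it is best tried on a machine already set up for the model.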
Changelog
Calibration
(train_sft split)
(chat split)
moe_calibrate_all_experts=True
Creation
This model was created using vLLM's LLM Compressor, with Qwen3.5 MoE support added via PR #2383. The PR adds a custom CalibrationQwen3MoeSparseMoeBlock that routes calibration data to all experts during quantization, so that every expert receives calibration data for accurate NVFP4 quantization.
Files & Weights
| Filename | Size |
|---|---|
| extra_weights.safetensors | 4.70 GB |
| model-00001-of-00002.safetensors | 46.58 GB |
| model-00002-of-00002.safetensors | 24.61 GB |
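The all-expert calibration described under Creation can be illustrated with a toy example. This is a minimal pure-Python sketch of the general idea, not the code from the PR: `ToyExpert`, `moe_forward`, and `run_calibration` are hypothetical names, and `tokens_seen` stands in for a real calibration observer. With ordinary top-k routing, rarely selected experts may see few or no calibration tokens. With all-expert calibration, every expert is run on every token to collect statistics, while the block's output is still built from the routed experts only.

```python
def top_k_indices(scores, k):
    """Indices of the k largest router scores."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


class ToyExpert:
    """Stand-in for an expert MLP; counts how many tokens it has processed."""

    def __init__(self, scale):
        self.scale = scale
        self.tokens_seen = 0  # proxy for activation statistics gathered during calibration

    def __call__(self, token):
        self.tokens_seen += 1
        return [self.scale * v for v in token]


def moe_forward(token, router_scores, experts, k, calibrate_all_experts):
    chosen = top_k_indices(router_scores, k)
    total = sum(router_scores[i] for i in chosen)
    out = [0.0] * len(token)
    for i, expert in enumerate(experts):
        if i in chosen:
            # Routed expert: contributes to the output with its normalized weight.
            weight = router_scores[i] / total
            out = [o + weight * v for o, v in zip(out, expert(token))]
        elif calibrate_all_experts:
            # Unrouted expert: run only so it observes the token; the result is discarded.
            expert(token)
    return out


def run_calibration(calibrate_all_experts):
    experts = [ToyExpert(scale) for scale in (1.0, 2.0, 3.0, 4.0)]
    tokens = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
    router_scores = [0.6, 0.3, 0.08, 0.02]  # fixed scores that always favor experts 0 and 1
    outputs = [moe_forward(t, router_scores, experts, 2, calibrate_all_experts) for t in tokens]
    return outputs, [e.tokens_seen for e in experts]
```

In this toy setup, experts 2 and 3 process no tokens under plain top-2 routing but process all three tokens when `calibrate_all_experts` is true, and the block outputs are identical in both modes.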