gemma-4-12B-coder-fable5-composer2.5-v1 — MLX 4.5 BPW

Mixed-precision MLX quantization of yuxinlu1/gemma-4-12B-coder-fable5-composer2.5-v1, produced with MLX Smart Quantize (MSQ), my own sensitivity-based mixed-precision quantization method for Apple Silicon. MSQ measures the normalized mean squared error (NMSE) that quantization introduces in each layer and assigns bit widths automatically, combining architecture knowledge with the measured data.
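
To make the sensitivity measurement concrete, here is a minimal sketch of a per-layer NMSE at several bit widths. It is not the MSQ implementation: the min/max affine quantizer, the group size of 64, the weight-space (rather than output-space) error, and the names `quantize_dequantize` and `layer_nmse` are all assumptions chosen for illustration, and the "layer" is random data.

```python
import random

def quantize_dequantize(weights, bits):
    """Round-to-nearest affine (min/max) quantization of one group, then dequantize."""
    lo, hi = min(weights), max(weights)
    levels = (1 << bits) - 1
    if hi == lo:  # constant group: nothing to quantize
        return list(weights)
    scale = (hi - lo) / levels
    return [lo + round((w - lo) / scale) * scale for w in weights]

def nmse(original, approx):
    """Normalized mean squared error: ||w - w_hat||^2 / ||w||^2."""
    num = sum((a - b) ** 2 for a, b in zip(original, approx))
    den = sum(a * a for a in original)
    return num / den

def layer_nmse(weights, bits, group_size=64):
    """NMSE of a flat weight list after group-wise quantization (hypothetical helper)."""
    approx = []
    for i in range(0, len(weights), group_size):
        approx.extend(quantize_dequantize(weights[i:i + group_size], bits))
    return nmse(weights, approx)

# Synthetic stand-in for one layer's weights (assumption: Gaussian values).
random.seed(0)
layer = [random.gauss(0.0, 1.0) for _ in range(1024)]
errors = {bits: layer_nmse(layer, bits) for bits in (3, 4, 6, 8)}
```

A layer whose NMSE stays high at low bit widths is a natural candidate for more bits; how MSQ weighs that measurement against architecture knowledge is not specified here.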

Details

  • Type: Vision (VLM)
  • Average: 4.5 bits per weight
  • Method: MLX Smart Quantize (MSQ)
  • AWQ scaling: applied to 96 groups
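
The average above is a parameter-weighted mean of the per-layer bit widths. As a rough sketch of how a mixed-precision assignment can reach a 4.5-bit target, the example below upgrades the most sensitive layers first while the average stays within budget. The layer names, parameter counts, sensitivity scores, and the greedy rule are hypothetical and are not taken from MSQ; per-group scale and bias overhead is ignored.

```python
def average_bits(param_counts, bits):
    """Parameter-weighted mean bit width across layers."""
    total = sum(param_counts.values())
    return sum(param_counts[name] * bits[name] for name in param_counts) / total

def allocate_bits(param_counts, sensitivity, target_avg, low=4, high=8):
    """Greedy sketch: start every layer at `low` bits, then raise the most
    sensitive layers to `high` bits while the average stays within the target."""
    bits = {name: low for name in param_counts}
    for name in sorted(sensitivity, key=sensitivity.get, reverse=True):
        trial = {**bits, name: high}
        if average_bits(param_counts, trial) <= target_avg:
            bits = trial
    return bits

# Hypothetical layers: parameter counts (in millions) and NMSE-like scores.
param_counts = {"embed": 100, "attn": 125, "mlp": 775}
sensitivity = {"embed": 0.02, "attn": 0.05, "mlp": 0.01}

bits = allocate_bits(param_counts, sensitivity, target_avg=4.5)
avg = average_bits(param_counts, bits)
```

With these made-up numbers only the most sensitive layer is raised to 8 bits, which brings the average to exactly the 4.5-bit target.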
