FLUX.2-dev — INT8 tensorwise (+ConvRot), stock-ComfyUI native

Offline INT8 quantization of the black-forest-labs/FLUX.2-dev transformer, produced with Comfy-Org/comfy-quants (export-model-int8-tensorwise). Loads natively in stock ComfyUI >= v0.27.0 (QUANT_ALGOS["int8_tensorwise"], SM >= 7.5 / Turing+) — no custom node required.
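
The SM >= 7.5 requirement above can be checked before downloading the weights. The following is a minimal sketch, assuming PyTorch is installed; `meets_sm_requirement` and `MIN_SM` are hypothetical names introduced here for illustration, and the check covers only the GPU side, not the ComfyUI version.

```python
MIN_SM = (7, 5)  # Turing, per the requirement stated above


def meets_sm_requirement(capability):
    """True if a (major, minor) compute capability is at least SM 7.5."""
    return tuple(capability) >= MIN_SM


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None
    if torch is not None and torch.cuda.is_available():
        # get_device_capability returns a (major, minor) tuple for the device.
        cap = torch.cuda.get_device_capability(0)
        print(cap, meets_sm_requirement(cap))
    else:
        print("No CUDA device visible to PyTorch; nothing to check.")
```

Tuple comparison handles the major/minor ordering, so (8, 0) passes and (7, 0) does not.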

Transformer-only (for models/diffusion_models/, load via UNETLoader); pair with the usual FLUX.2 text encoder (Mistral) and VAE.
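
To confirm that a downloaded file really is the quantized transformer before wiring it into a workflow, the safetensors header can be read without loading any tensor data. This is a rough sketch using only the standard library; the helper names are hypothetical, and it only counts tensor dtypes (it does not look for the comfy_quant marker, since how that marker is stored is not described here).

```python
import json
import struct
from collections import Counter


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without loading tensors."""
    with open(path, "rb") as f:
        # The file starts with a little-endian u64 giving the header length.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))


def dtype_counts(header):
    """Count tensors per dtype string, skipping the optional __metadata__ entry."""
    return Counter(
        info["dtype"] for name, info in header.items() if name != "__metadata__"
    )
```

For a checkpoint like this one, the counts would be expected to show I8 entries for the quantized weights, F32 entries for the scales and BF16 entries for the layers left unquantized.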

Quantization contract

  • 160 quantized Linears: 64 in the 8 double blocks (8 each: img and txt attn qkv/proj plus the two gated-MLP Linears per stream) and 96 in the 48 single blocks (fused linear1/linear2). Global modulation, io projections and the final layer are kept in bf16. All 160 are ConvRot-rotated (regular Hadamard, group 256).
  • Per layer: int8 weight + float32 [out,1] scale + comfy_quant marker {"format": "int8_tensorwise", "convrot": true, "convrot_groupsize": 256} (byte-exact to stock ComfyUI's save path).
  • Quant math bit-faithful to comfy-kitchen >= 0.2.15.
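
The contract above (group-wise Hadamard rotation, int8 weight, float32 [out,1] scale) can be illustrated with a small NumPy sketch. It is not the comfy-quants or comfy-kitchen implementation: the Hadamard construction (Kronecker power of a 4x4 regular Hadamard matrix), the round-to-nearest rounding, the [-127, 127] clip range and the choice of one scale per output row (picked to match the stated [out,1] shape) are all assumptions, and every function name is hypothetical.

```python
import numpy as np


def regular_hadamard(n):
    """Kronecker power of a 4x4 regular Hadamard matrix; n must be a power of 4."""
    h4 = np.array([[1, 1, 1, -1],
                   [1, 1, -1, 1],
                   [1, -1, 1, 1],
                   [-1, 1, 1, 1]], dtype=np.float64)
    h = np.ones((1, 1))
    while h.shape[0] < n:
        h = np.kron(h, h4)
    if h.shape[0] != n:
        raise ValueError("n must be a power of 4")
    return h


def rotate_groups(w, group):
    """Right-multiply each block of `group` input columns by an orthonormal Hadamard."""
    out_f, in_f = w.shape
    r = regular_hadamard(group) / np.sqrt(group)  # symmetric, so r @ r == identity
    return (w.reshape(out_f, in_f // group, group) @ r).reshape(out_f, in_f)


def quantize_int8(w):
    """Symmetric int8 quantization with one float32 scale per output row ([out, 1])."""
    amax = np.abs(w).max(axis=1, keepdims=True)
    scale = (np.maximum(amax, 1e-12) / 127.0).astype(np.float32)
    q = np.clip(np.rint(w / scale), -127, 127).astype(np.int8)
    return q, scale


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((8, 512))   # toy [out, in] weight
    x = rng.standard_normal((3, 512))   # toy activations
    w_rot, x_rot = rotate_groups(w, 256), rotate_groups(x, 256)
    q, scale = quantize_int8(w_rot)
    y_ref = x @ w.T
    y_q = x_rot @ (q.astype(np.float32) * scale).T
    print("max abs error vs unquantized:", np.abs(y_q - y_ref).max())
```

Because the rotation is orthonormal, rotating both the weight and the activation leaves the product unchanged before quantization; the rotation only spreads outliers across each group of 256 inputs so that the int8 grid is used more evenly.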

Measured (RTX PRO 6000 Blackwell, 1024², 20 steps, torch 2.10.0+cu130)

Image PSNR vs bf16: 32.6 dB, compared with 20.1 dB for FP8 E4M3 and 17.6 dB for NVFP4 on this trajectory-sensitive 32B distilled model. INT8+ConvRot's higher weight fidelity (SQNR ≈ 41 dB vs 31.5 dB for FP8) translates directly into image quality here.

Disk: 33.1 GB (vs 64 GB bf16).
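
Since dB figures are logarithmic, it can help to convert the gaps above into linear error ratios. The sketch below uses only the standard definitions (a 10 dB gap is a 10x power ratio; PSNR = 10·log10(peak² / MSE)); the function names are hypothetical and a peak value of 1.0 is assumed for illustration.

```python
import math


def db_gap_to_power_ratio(db_hi, db_lo):
    """Factor by which error power shrinks when an SNR-type metric rises from db_lo to db_hi."""
    return 10 ** ((db_hi - db_lo) / 10)


def psnr(mse, peak=1.0):
    """PSNR in dB for a given mean squared error and peak signal value."""
    return 10 * math.log10(peak ** 2 / mse)


def mse_from_psnr(psnr_db, peak=1.0):
    """Invert the PSNR definition to recover the mean squared error."""
    return peak ** 2 / 10 ** (psnr_db / 10)


if __name__ == "__main__":
    # Weight SQNR: 41 dB (INT8+ConvRot) vs 31.5 dB (FP8), a 9.5 dB gap.
    print(round(db_gap_to_power_ratio(41.0, 31.5), 2))
    # Image PSNR: 32.6 dB (INT8+ConvRot) vs 20.1 dB (FP8), a 12.5 dB gap.
    print(round(db_gap_to_power_ratio(32.6, 20.1), 2))
```

Under these definitions the 9.5 dB SQNR gap corresponds to roughly 8.9x less weight quantization noise power, and the 12.5 dB PSNR gap to roughly 17.8x lower image MSE.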

License

Inherits the FLUX.2-dev Non-Commercial License from the base model.
