Krea-2-Turbo · MLX · 4-bit (mflux)

A 4-bit quantized MLX conversion of krea/Krea-2-Turbo, saved with mflux for fast local text-to-image generation on Apple Silicon.

Krea 2 is a single-stream MMDiT text-to-image model built on the Qwen-Image stack: it reuses the Qwen-Image VAE and conditions on a 12-layer hidden-state tap from a Qwen3-VL-4B text encoder. The Turbo variant is distilled and produces high-quality images in 8 steps.

Details


| Property | Value |
| --- | --- |
| Base model | `krea/Krea-2-Turbo` |
| Format | MLX `safetensors` (sharded) |
| Quantization | 4-bit |
| Saved with | mflux `0.18.0` |
| Pipeline | Text-to-image |
| Hardware | Apple Silicon (Metal / MLX) |

This is a ready-to-run quantized snapshot, so it loads without re-quantizing at runtime. It contains the transformer, the Qwen3-VL-4B text encoder, the tokenizer, and the Qwen-Image VAE. At 4-bit it is the smallest snapshot (~7 GB), trading some fidelity for a lower memory footprint than the 8-bit build.
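
As a rough illustration of why the 4-bit build is smaller than the 8-bit one, the sketch below estimates on-disk weight size under group-wise affine quantization. It assumes a group size of 64 with a 16-bit scale and a 16-bit bias stored per group (MLX's default quantization layout); the settings mflux actually used for this snapshot may differ, and some tensors may be left unquantized. The parameter count is a hypothetical round number, not a figure for this model.

```python
def effective_bits_per_weight(bits=4, group_size=64, scale_bits=16, bias_bits=16):
    """Bits stored per weight, including the per-group scale and bias overhead."""
    return bits + (scale_bits + bias_bits) / group_size


def quantized_size_gb(num_params, bits=4, group_size=64):
    """Approximate size in GB (10^9 bytes) of num_params quantized weights."""
    total_bits = num_params * effective_bits_per_weight(bits, group_size)
    return total_bits / 8 / 1e9


# Hypothetical parameter count, chosen only to make the arithmetic easy to follow.
example_params = 10_000_000_000

size_q4 = quantized_size_gb(example_params, bits=4)
size_q8 = quantized_size_gb(example_params, bits=8)
print(f"4-bit: {size_q4:.3f} GB, 8-bit: {size_q8:.3f} GB")
```

Under these assumptions each weight costs 4.5 bits at 4-bit and 8.5 bits at 8-bit, so the 4-bit snapshot is a little over half the size of the 8-bit one rather than exactly half.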

Usage

Install mflux:

pip install mflux

Generate from the local model directory:

mflux-generate-krea2 \
  --model /path/to/krea2-q4 \
  --prompt "a photograph of a red fox sitting in a sunlit forest clearing, sharp focus, bokeh" \
  --width 1024 \
  --height 1024 \
  --seed 42 \
  --steps 8

Turbo defaults are 8 steps, guidance 1.0 (CFG off), and the `er_sde` sampler. The plain flow-matching Euler sampler, which matches the official diffusers `FlowMatchEulerDiscreteScheduler`, is available via `--scheduler euler`.
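
For readers unfamiliar with the plain Euler option, here is a minimal sketch of a flow-matching Euler loop. It is not the mflux or diffusers implementation: `velocity_fn` is a hypothetical stand-in for the transformer, the linear sigma schedule is an assumption (real schedulers commonly apply a timestep shift), and the toy velocity is chosen so the result can be checked by hand.

```python
def linear_sigmas(num_steps):
    """num_steps + 1 noise levels, from 1.0 (pure noise) down to 0.0 (clean)."""
    return [1.0 - i / num_steps for i in range(num_steps + 1)]


def euler_sample(velocity_fn, x, sigmas):
    """Integrate dx/dsigma = v(x, sigma) with one Euler update per step."""
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        v = velocity_fn(x, sigma)
        x = [xi + (sigma_next - sigma) * vi for xi, vi in zip(x, v)]
    return x


# Toy problem: the "image" is three numbers, and the velocity is known exactly.
target = [0.5, -1.0, 2.0]
noise = [1.5, 0.25, -0.75]


def toy_velocity(x, sigma):
    # For the straight path x_sigma = (1 - sigma) * target + sigma * noise,
    # the velocity is the constant (noise - target).
    return [n - t for n, t in zip(noise, target)]


result = euler_sample(toy_velocity, list(noise), linear_sigmas(8))
print(result)
```

Because the toy velocity is constant along a straight path, eight Euler steps carry the sample from `noise` to `target`. A learned model only approximates this velocity, which is where the choice of sampler and step count starts to matter.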

Standard mflux CLI options are supported (`--metadata`, `--stepwise-image-output-dir`, multiple `--seed` values). Image conditioning (edit / reference) is not yet implemented.

Python API

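from mflux.models.krea2 import Krea2

# Load the pre-quantized snapshot from a local directory.
model = Krea2(model_path="/path/to/krea2-q4")

# Generate with the Turbo defaults: 8 steps, guidance 1.0 (CFG off).
image = model.generate_image(
    seed=42,
    prompt="a photograph of a red fox sitting in a sunlit forest clearing, sharp focus, bokeh",
    num_inference_steps=8,
    width=1024,
    height=1024,
    guidance=1.0,
)

# Write the result to disk.
image.save("krea2_fox.png")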
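
To render the same prompt with several seeds from Python, a small helper along these lines may be convenient. It is a sketch that relies only on the `generate_image(...)` and `save(...)` calls shown above; the helper name `generate_for_seeds`, the output directory, and the file-naming scheme are illustrative choices, not part of mflux.

```python
from pathlib import Path


def generate_for_seeds(model, prompt, seeds, out_dir="outputs"):
    """Generate one image per seed and return the list of written paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for seed in seeds:
        # Same Turbo settings as the single-image example.
        image = model.generate_image(
            seed=seed,
            prompt=prompt,
            num_inference_steps=8,
            width=1024,
            height=1024,
            guidance=1.0,
        )
        path = out / f"krea2_seed{seed}.png"
        image.save(str(path))
        paths.append(path)
    return paths


# Example call, assuming `model` was loaded as in the snippet above:
# generate_for_seeds(model, "a red fox in a sunlit forest clearing", [1, 2, 3])
```

Loading the model once and looping over seeds avoids reloading the weights for every image.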

Architecture

Transformer: 28-layer single-stream MMDiT — hidden 6144, GQA (48 query / 12 KV heads, head_dim 128), SwiGLU, 3-axis Flux-style RoPE [32, 48, 48], per-head QK-norm + sigmoid-gated attention, AdaLN-single 6-way modulation, and a txtfusion adapter that fuses the 12 text-encoder hidden states.
Text encoder: Qwen3-VL-4B, 12-layer tap [2, 5, …, 35] flattened layer-major; the chat-template prefix is stripped so only prompt tokens condition the DiT.
VAE: Qwen-Image VAE (Wan2.1 16-channel latent).
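
The attention numbers quoted for the transformer can be cross-checked with a little arithmetic. The sketch below encodes only the figures listed above; the class and property names are illustrative and do not mirror the mflux source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttentionConfig:
    hidden_size: int = 6144
    num_query_heads: int = 48
    num_kv_heads: int = 12
    head_dim: int = 128
    rope_axes: tuple = (32, 48, 48)  # per-axis rotary dimensions

    @property
    def q_width(self):
        """Total width of the query projection output."""
        return self.num_query_heads * self.head_dim

    @property
    def kv_width(self):
        """Total width of each of the key and value projection outputs."""
        return self.num_kv_heads * self.head_dim

    @property
    def queries_per_kv_head(self):
        """How many query heads share one key/value head under GQA."""
        return self.num_query_heads // self.num_kv_heads


cfg = AttentionConfig()
print(cfg.q_width, cfg.kv_width, cfg.queries_per_kv_head, sum(cfg.rope_axes))
```

The query width (48 × 128) equals the hidden size of 6144, each key/value head is shared by 4 query heads, and the three RoPE axes sum to the head dimension of 128, so the rotary embedding covers every channel of a head.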

License

This conversion inherits the license of the base model, krea/Krea-2-Turbo. Review and accept the original model's terms before use.

Acknowledgements

Krea for the original Krea-2-Turbo model
mflux for the MLX implementation and conversion tooling
MLX by Apple
