thank you for the model

#22
by jarkevithwlad - opened

Hello, thank you for the model. I recommend studying this model (DavidAU/Qwen3.6-40B-Claude-4.6-Opus-Deckard-Heretic-Uncensored-Thinking-NEO-CODE-Di-IMatrix-MAX-GGUF) to see whether you can adopt anything from it that suits you. In my opinion it is a significant leap in quality.

Hey, thank you for this. Really, I appreciate you taking the time to point me to it. I sat down and read DavidAU's
card properly, and there's a lot of genuine craft in there. 🙏

The way I read the stack: Heretic abliteration, an Unsloth finetune on his own Deckard/PDK sets, then the 27B→40B
passthrough expansion (64→96 layers), and finally a Claude-4.6-Opus distill pass to settle it back down, plus the
NEO/Di dual-imatrix MAX quants on top. And credit where it's due: a passthrough expansion normally perturbs the
residual stream and the model starts looping, so healing it with that post-expansion distill pass is exactly the
right move. He's also honest on the card about the rep-pen tweaks and low-quant looping, which I respect a lot.

Couple of things I'd gently note, more as context than criticism: the imatrix parts (NEO/Di/MAX) are fidelity-recovery
tricks (they claw back quantization loss but can't really push past the BF16 source), and the headline MMLU-Pro/GPQA
numbers look like they're carried over from Qwen's base 27B rather than measured on the merge itself.

For me the honest blocker is just my setup: it's GGUF-only and ~80GB in BF16, so on a single 32GB card I can't heal or
QLoRA on top of it. It's also tuned for rich uncensored prose, which is a different lane from the agentic-coding work
I'm chasing. So I'll probably admire this one from a distance for now 😄. But the domain-matched dual-imatrix idea is
genuinely clever and something I'd love to play with myself down the line. Full credit to David. Thanks again for
thinking of me with this!
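
A rough sketch of the arithmetic behind the ~80GB figure, assuming a round 40B parameters (taken from the "40B" in the model name) and counting weights only. The helper name `weight_gb` is just for illustration, and the lower bit widths are nominal examples rather than the actual GGUF quant sizes:

```python
def weight_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal gigabytes (weights only)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 40e9  # assumption: round 40B parameters, from the model name
VRAM_GB = 32     # the single 32GB card mentioned above

for label, bits in [("BF16", 16), ("8-bit", 8), ("4-bit", 4)]:
    gb = weight_gb(N_PARAMS, bits)
    print(f"{label}: ~{gb:.0f} GB of weights, under {VRAM_GB} GB: {gb < VRAM_GB}")
```

This ignores activations, KV cache, and optimizer or adapter state, so weights landing under 32GB at a nominal 4 bits does not by itself mean finetuning on top is practical.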

#yuxinlu1

I have another question for you, if you don't mind: have you thought about a Qwen 3.6 9B model? It doesn't exist, but as I understand it, it can be made. If I understood correctly, the 3.5 9B was made from the 3.5 27B by simply cutting off the excess and retraining the model. Maybe the same can be done with the 3.6 27B?
There are not enough new models for mobile devices of the quality that a Qwen 3.6 9B could provide.
