Commit e8a1d38 by yuxinlu1 · verified · 1 Parent(s): f09d355

Announcements: add pinned-discussion pointer (sampler/tool-parsing fixes); trim Q2_K note

Files changed (1)
  1. README.md +9 -5
README.md CHANGED
@@ -34,11 +34,15 @@ technical tasks. The clearest signal is **tau2-bench `telecom`**, an agentic too
 
  ## 🚀 Announcements
 
- **📦 Q2_K is held back this release (please read).** The **Q2_K imatrix build is finished** - but when I
- **stress-tested** it on long, complex generations (think a full interactive web page), it **still gave me real
- headaches**: it can wobble or fall apart on the harder stuff. **Q3_K_M holds up *far* better**, so for now I'm
- **not shipping Q2_K** with this release - it'll only come back once I've ironed out *every* issue. For the smallest
- reliable option grab **Q3_K_M**; **Q4_K_M** is the recommended sweet spot. Huge thanks to everyone who flagged it. 🙏
+ **📌 Hitting a problem? Please check my pinned discussion first.** **~99% of issues are a client/sampler config, not
+ the weights** - and they have a quick fix there. For example: garbled or **repeating `0000…`** output almost always
+ means **no repetition penalty** (set `rep_pen 1.1`, `temp 1.0`); and leaked `<|tool_call>` / `<|channel>` tokens mean
+ your front-end isn't parsing Gemma 4's **native tool format** (use llama.cpp `--jinja`). If your question isn't covered,
+ **don't hesitate to open a discussion** - I read them and reply as fast as I can. 💬
+
+ **📦 No Q2_K this release.** I finished a Q2_K (imatrix) build, but it didn't hold up under real stress-testing, so I'm
+ holding it back - **I only ship a quant once I'm confident it's genuinely good.** Smallest reliable option is
+ **Q3_K_M**; **Q4_K_M** is the recommended sweet spot. 🙏
 
  **🔮 v3 is already on the way.** Honestly? Even *I* didn't expect the post-training jump to be **this** large - so I'm
  pushing further. v3 keeps the **coding + agentic** focus and aims higher still. Stay tuned! 🎉
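
The two symptoms named in the new announcement can be checked mechanically before opening a discussion. The following is only a minimal sketch, not part of the commit: the `diagnose` helper, the `LEAKED_TOKENS` tuple, and the 16-zero threshold are hypothetical choices for illustration, and the hint strings simply restate the fixes quoted in the README text.

```python
import re

# Hypothetical helper (not from the repository): maps the two symptoms
# described in the announcement to the fixes it suggests.
LEAKED_TOKENS = ("<|tool_call>", "<|channel>")

# Assumed threshold: a run of 16 or more zeros counts as "repeating 0000..." output.
ZERO_RUN = re.compile(r"0{16,}")


def diagnose(output: str) -> list[str]:
    """Return the announcement's suggested fixes for symptoms found in `output`."""
    hints = []
    if ZERO_RUN.search(output):
        # Symptom: garbled or repeating zeros, attributed to a missing repetition penalty.
        hints.append("set rep_pen 1.1 and temp 1.0")
    if any(token in output for token in LEAKED_TOKENS):
        # Symptom: raw tool/channel tokens leaking into the visible text.
        hints.append("enable native tool parsing (llama.cpp --jinja)")
    return hints
```

Calling `diagnose` on a captured model response returns an empty list when neither symptom is present; anything else points back to the pinned discussion's sampler or template settings.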