TinyGuide

A tiny highly experimental 0.6B coding agent coach that runs on your laptop. TinyGuide sits in Claude Code's PreToolUse hook before every Edit, Write, or Bash, it reads the session state and decides whether to say something. Most of the time it stays quiet. When the agent is about to edit a file it never read, re-run a command that already failed, or finalize without testing, it drops one short sentence of guidance into the agent's context.

Fine tuned from Qwen3-0.6B. Small enough to run locally on every tool call.

This repo holds the fp16 fused weights (bf16, ~1.1 GB) and the Claude Code hook that wires them in.

Run it

Serve with mlx_lm, then point the hook at it (infer.py POSTs to http://127.0.0.1:8080):

pip install mlx-lm
mlx_lm.server --model <this-repo-or-local-path> --port 8080

Direct generation:

from mlx_lm import load, generate
m, tok = load("<this-repo-or-local-path>")
print(generate(m, tok, prompt, max_tokens=24, temp=0.0))

Wire into Claude Code

Files in claude-code/:

File Role
hook_pretooluse.py the PreToolUse hook: rebuilds state from the transcript, asks the server, prints advisory context
infer.py client to the local mlx_lm.server; strips <think>, validates the hint
format_prompt.py shared train/inference prompt builder
build_from_raw.py transcript parser + state machine (iter_pairs, _advance)
clean_output.py KNOWN_HINTS + output validation
settings.snippet.json the PreToolUse config to merge into ~/.claude/settings.json

Merge settings.snippet.json into ~/.claude/settings.json, fixing the two absolute paths to your venv python and this folder. The hook fires on Edit|Write|Bash.

Runtime cooldown

Enforced at runtime in the hook via /tmp/tinyguide_cooldown.json:

  • MIN_CALLS_BETWEEN = 3 tool calls between hints
  • MAX_PER_SESSION = 5 cap per session

You can tune these in hook_pretooluse.py pacing is a policy.

Downloads last month
14
Safetensors
Model size
0.6B params
Tensor type
BF16
·
MLX
Hardware compatibility
Log In to add your hardware

Quantized

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for bfuzzy1/TinyGuide

Finetuned
Qwen/Qwen3-0.6B
Finetuned
(1019)
this model
Quantizations
1 model