first

#1
by jackyzhangjun - opened

Waiting online to test multilingual support, agent tool calling, etc.

Hey @jackyzhangjun, thanks for waiting! 🙏 Honest answer on both:

🌍 Multilingual: I'm genuinely not sure this round. I didn't get any Fable 5 Chinese data this time, and the only
non-English data I had was a few German examples that weren't good enough, so I left them out of v2. Please go ahead and
try it, but I can't promise it will hold up across languages. If you do test it, I'd really love to hear what
you find.

🛠️ Agent tool-calling: this one I'm confident about. The improvement is clear and measured, not just a vibe.
I ran it on tau2-bench (telecom) and v2 scores ~55% vs ~15%, so multi-step tool use and
agentic work should feel noticeably stronger. Give it a spin and let me know! 🚀
