ToneBridge: Fine-Tuning a Small Mandarin Coach for Context-Aware Corrections

Community Article

Published June 14, 2026

Upvote

ZHAO JINGZI

ZhaoJingzi

build-small-hackathon

Adrien CHARMASSON

Alphaplasti

build-small-hackathon

ToneBridge is a context-aware Mandarin correction coach built for the Hugging Face Build Small Hackath

The goal is simple: when a Chinese learner writes a short Mandarin sentence, ToneBridge helps correct it without turning it into a completely different sentence. Instead of acting like a translator, it behaves more like a sentence coach: it preserves the learner’s intention, adapts the sentence to the chosen context, explains the change in English, adds pinyin, and reads the corrected sentence aloud.

The project started from a very specific language-learning problem. For Mandarin learners, a sentence can be grammatically understandable but still feel unnatural, too formal, too casual, badly ordered, or not adapted to the social situation. General-purpose chatbots can help, but they often rewrite too much, require repeated prompting, and do not create a fast practice loop.

ToneBridge was built to make that loop smaller, faster, and more focused.

What We Built

ToneBridge takes three main inputs:

A short Mandarin sentence written by the learner.

A communication context, such as friendly informal, work formal, polite stranger, or WeChat informal.

The learner’s need: correction, explanation, and pronunciation support.

The app returns:

a corrected Mandarin sentence with Pinyin;

a simple English explanation;

a practical learning tip;

audio playback;

feedback buttons for rating the correction.

The core correction model is a fine-tuned MiniCPM4.1-8B model:

[Alphaplasti/ToneBridge-MiniCPM4.1-8B](https://huggingface.co/Alphaplasti/ToneBridge-MiniCPM4.1-8B)

The model stays under the 32B hackathon limit and focuses on one narrow behavior: context-aware Mandarin sentence correction.

ToneBridge is not designed to replace a teacher, a translator, or a full language-learning course. It is designed for one concrete moment: when a learner asks, “Would this sentence actually sound right in this situation?”

Product Constraint: Correct Less, Preserve More

One of the most important product decisions was not to make the model “more creative.”

For language learning, a fluent rewrite is not always a better answer. If the model rewrites the whole sentence, the learner may get a polished result but lose the connection with their original attempt. That is not ideal for practice.

So we defined the correction behavior around three constraints:

preserve the learner’s meaning;
make the smallest useful correction;
adapt tone and word choice to the selected context.

This product constraint shaped the model work, the dataset design, and the final UI.

First Prototype: General Chinese-Capable Models

The first prototype used general Chinese-capable models for correction and TTS.

The initial loop was:

Mandarin input
→ context selection
→ model correction
→ pinyin
→ English explanation
→ audio playback

This proved that the product idea worked inside a Hugging Face Space. The model could catch grammar mistakes, the app could show pinyin and explanations, and the learner could hear the corrected sentence.

But after testing, we saw three limits:

Grammar correction worked better than context correction.
The model sometimes rewrote too aggressively.
TTS latency could break the learning flow.

The problem was not whether a general model could correct Mandarin. The problem was whether it could consistently behave like a conservative, context-aware Mandarin coach.

That led us to test OpenBMB models and then fine-tune our own correction model.

Model Choice: Why MiniCPM4.1-8B

We tested OpenBMB models during the hackathon because they matched the “small but capable” spirit of the challenge.

MiniCPM4.1-8B was a good candidate for ToneBridge because:

it is small enough for the hackathon constraint;
it has strong multilingual and Chinese capabilities;
it is compact enough to make task-specific fine-tuning meaningful;
it allowed us to focus on a narrow correction behavior instead of using a larger general-purpose model.

In early tests, MiniCPM4.1-8B behaved similarly to the previous general model: useful, but not yet specific enough. It could correct many sentences, but it did not always understand the difference between grammar correction and context-aware coaching.

That was exactly the gap fine-tuning could address.

Fine-Tuning Strategy

We did not fine-tune the model in one step. We used two focused fine-tuning phases.

Phase 1 — Sentence Construction

The first fine-tuning phase focused on sentence construction mistakes.

The target errors included:

unnatural word order;
missing or misplaced sentence components;
awkward but understandable learner sentences;
corrections that should preserve the original meaning.

The first dataset started with 200 seed examples of awkward Mandarin sentences and corrected versions. These examples were reviewed by a native Chinese speaker to check:

whether the correction was natural;
whether the learner’s meaning was preserved;
whether the example represented a realistic learner mistake.

After validation, the seed set was expanded into a larger dataset of 20,000 examples.

This first fine-tuned model improved structure correction, but it still did not fully solve contextual appropriateness.

Phase 2 — Context-Aware Correction

The second fine-tuning phase focused on context.

This dataset was designed around cases where the sentence was not only grammatically wrong, but socially or situationally inappropriate.

Each example included:

{
  "original\_sentence": "...",
  "context": "...",
  "corrected\_sentence": "...",
  "reason": "..."
}

The target behavior was more precise:

correct the sentence according to context;
avoid over-polishing;
preserve the learner’s intention;
explain why the correction fits better;
distinguish grammar mistakes from tone/register/context mistakes.

After review, we generated a second dataset of 5,000 examples and trained another version of the model.

This second fine-tune changed the product quality significantly. The model became more conservative, more useful for context mistakes, and better aligned with the intended coaching behavior.

Dataset Design

https://huggingface.co/datasets/Alphaplasti/tonebridge-metrics

The dataset was not treated as a generic “Chinese correction” dataset. It was designed around product behavior.

The main fields were:

Field	Purpose
`original\_sentence`	The learner’s input, including realistic mistakes
`context`	The communication situation
`corrected\_sentence`	The minimal useful correction
`reason`	The explanation behind the correction
`error\_type`	The type of mistake: structure, word order, tone, context, etc.

The most important part was not only generating examples, but checking whether the examples expressed the behavior we wanted the model to learn.

A synthetic dataset can scale quickly, but it can also repeat mistakes quickly. For this reason, we used human review on seed examples and random validation before training.

Product Architecture

The final app separates the correction model from the surrounding learning experience.

User sentence + context
        ↓
Fine-tuned MiniCPM4.1-8B correction model
        ↓
Structured correction output
        ↓
Pinyin + English explanation + learning tip
        ↓
Audio playback
        ↓
User feedback + metrics logging

The main components are:

Component	Choice
Space UI	Custom `gr.Server` frontend
Correction model	Fine-tuned MiniCPM4.1-8B
Model ID	`Alphaplasti/ToneBridge-MiniCPM4.1-8B`
Training infra	Modal
TTS	Edge TTS for speed and reliability
Feedback	User rating buttons
Metrics	Generation time, context, error type, model ID, feedback

The custom gr.Server interface allowed us to move beyond a default demo layout. ToneBridge needed to feel like a small learning product, not only a model playground. The UI was designed around a fast practice loop: input, correction, explanation, listen, retry.

Why We Replaced the TTS

We tested OpenBMB voice generation and the quality was strong. But for this product, the main constraint was not only voice quality. It was interaction speed.

In a sentence practice tool, audio is part of the feedback loop. If the learner has to wait too long to hear the corrected sentence, the rhythm of practice breaks.

So we moved TTS outside the core model path and used Edge TTS for faster playback. This allowed the GPU budget and model focus to remain on the correction task.

The key product decision was:

Use the fine-tuned model where task-specific language behavior matters.
Use a faster TTS layer where interaction latency matters more.

Feedback and Metrics

ToneBridge also logs feedback and generation metadata.

The app records:

selected context;
error type;
model ID;
generation time;
user rating;
correction output.

This creates a lightweight improvement loop. Instead of treating model output as the end of the product, ToneBridge treats user feedback as part of the system.

In future iterations, this feedback can help identify:

which contexts are weakest;
which types of mistakes need more data;
where latency affects the user experience;
which corrections users find helpful or unhelpful.

What Changed After Fine-Tuning

Before fine-tuning, the model could correct Mandarin sentences, but it behaved like a general assistant.

After fine-tuning, it behaved more like the product we wanted:

less aggressive rewriting;
better preservation of learner intent;
clearer correction reasons;
stronger handling of context and tone;
more useful phrase construction corrections.

The biggest improvement was not simply “better Chinese.” It was better alignment with the product behavior.

For ToneBridge, the model needed to understand that the best correction is often not the most elegant sentence. It is the smallest sentence that helps the learner say what they meant, in the right context.

What We Learned

The main lesson from ToneBridge is that small models become much more useful when the task is clearly defined.

A general model can help with language correction, but a fine-tuned small model can better match a specific learning flow:

short sentence
→ selected context
→ minimal correction
→ explanation
→ audio
→ feedback

Fine-tuning was not only a technical step. It was a product design step. The dataset had to describe the behavior we wanted. The examples had to encode the difference between grammar correction, context correction, and tone adaptation.

We also learned that an AI app is more than the model. Latency, UI, feedback, and explanation quality all shape whether users actually want to practice.

What Is Next

The next direction for ToneBridge is to improve the feedback loop and expand the correction dataset.

Possible next steps include:

collecting more real learner mistakes;
separating beginner, intermediate, and advanced correction modes;
improving context categories;
adding more structured evaluation examples;
using feedback ratings to guide future dataset updates;
connecting the correction flow to a playful companion mode, such as Reachy Mini, to make practice more interactive.

ToneBridge was built for a small problem, but that small problem turned out to be deep: helping a learner move from “this sentence is understandable” to “this sentence sounds right here.”

That is the bridge we wanted to build.

All Relevant Links

Live Space: ToneBridge on Hugging Face Spaces Fine-tuned model: Alphaplasti/ToneBridge-MiniCPM4.1-8B Dataset: Alphaplasti/tonebridge-metrics Demo video: ToneBridge Demo on YouTube X / Twitter post: ToneBridge announcement on X LinkedIn: https://www.linkedin.com/posts/jingzizhao_tonebridge-a-hugging-face-space-by-build-small-hackathon-share-7471909868619206656-rpF2/?utm_source=share&utm_medium=member_desktop&rcm=ACoAAEhuC3wBuxIz2RG3eXfMGjg0-IcyvdK-wsk Hackathon Diary #1: Build Small Hackathon Diary #1 Hackathon Diary #2: Build Small Hackathon Diary #2

Models mentioned in this article 1

Datasets mentioned in this article 1

Spaces mentioned in this article 1

Signal Garden: A Game Engine That Keeps Mutating

June 16, 2026

Noteworthy

June 15, 2026

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote