Field Notes: I Built a Radio That Picks Up Lost Universes
You keep turning. 103.0: a weather report for Jupiter. 100.8: a commercial for renting clouds, "lightly used, mild precipitation included." 89.0: a chef serving a dish she swears is named after you, even though she's never met you and this is the first time you've tuned in.
Come back tomorrow and the chef has moved. The station you found at 98.7 is broadcasting from 88.6 now. The universe drifts on the dial, but the voices in it stay exactly who they were.
Somewhere around 104.7 the warmth disappears. A flat, mechanical voice starts reading numbers in Morse. It doesn't want to talk to you, and it tells you so. But if you listen closely, it hands you the way past it twice over, without ever realizing it's doing it.
Get past it and the dial does something it shouldn't be able to do. It grows. New frequencies appear above 108, places that weren't there five minutes ago, and what's broadcasting from them finally explains why a cat's chess match, a Jovian weather report, and a confused chef all felt like they were happening in the same place. They were. Every station you found was a room in the same house, one universe among many, and that universe is quietly losing power right now, tonight, while you listen.
That's Lost Frequency Radio. Everything you just read was written by a model with one billion parameters.
Turn the dial yourself: Live demo · 2-minute video · The fine-tuned model · The dataset
What I found out about 1B models
I went into this expecting to fight the model the whole way. A billion parameters felt like a hard ceiling, and I figured the post-mortem would be me apologizing for the rough edges. That's not how it went.
This model holds a dozen distinct characters and keeps them straight across a session. It writes in Spanish and English without bleeding one into the other. It does deadpan comedy, it nails the cadence of a 1950s sports broadcast, it plays a cipher operator who is withholding information and leaks the way past it anyway. It improvises surreal premises and keeps them internally consistent for the length of a transmission. I can run all of it with the wifi off.
The capability was already sitting in there. When I gave it no direction, it gave me the average of everything it had ever read, and that's the version that feels dull. The moment I pointed it at one specific thing and trained it properly, the ceiling jumped far past where I'd set my expectations and I loved it.
I gave myself one rule going in: nothing leaves the machine. No cloud APIs, no hosted inference, no keys. Every word runs locally through the llama.cpp runtime, on a model small enough that "the model in front of you" is just literally true. You could unplug the ethernet cable mid-broadcast and nothing would change. I wanted to see how far that could go, and it went further than I thought it would.
How I built it
Teaching it to stop narrating and start performing.
I started from openbmb/MiniCPM5-1B. Out of the box it's an assistant, and assistants describe things. Ask for "a 1950s radio broadcast" and you get a paragraph about a broadcast, stage directions, a friendly "Sure, here's a transmission for you!" The capability to write the broadcast is there. It's just buried under the reflex to explain instead of do.
So I trained that reflex out. I wrote close to 800 short transmissions, by hand and by script, in Spanish and English, each tagged with the markers the frontend reads: [JINGLE], [INTERFERENCIA], [CORTE COMERCIAL], [FIN DE TRANSMISION]. I gave the stations a recurring cast (Don Aurelio and his chess cats, Doña Carmenza and the Jupiter weather, a chef who has completely lost the plot) and let the pieces recombine night after night. This is where small models are genuinely fun to work with: 800 examples is nothing to a frontier model, but to this one it's enough to reshape how it talks. One person, a weekend, a real change in behavior. You don't get that kind of direct grip on something with a hundred billion parameters.
The detail I'm proudest of is what I left out of the prompt. There's no line in there saying "write only the script, 60 to 90 words, stay in character." At this size, instructions like that come back out on air. Ask for 60 to 90 words and the model broadcasts the literal string "[60-90 words]," because it's matching the shape of your prompt instead of following it. So I never wrote the rule. I taught the format with examples until it was the only thing the model knew how to produce. Working with a small model forces you to understand what it's actually doing token by token, and you end up with something more deliberate than you'd have built if you could afford to be sloppy.
I also wanted to know where the ceiling really was, so I pushed it: two languages on one billion parameters. Spanish and English, side by side, and it kept them apart cleanly. That was the part I most expected to fail, and it just worked.
Training was LoRA, rank 16, on an RTX 4050 with 6 GB of VRAM, three epochs in bf16 with gradient checkpointing so it fit. Then I merged the adapter, exported to GGUF, and quantized to Q4_K_M, which is what lets it generate at a comfortable pace on CPU with no GPU in the loop at inference. The fine-tune is on the Hub at build-small-hackathon/MiniCPM5-1B-lost-frequency-radio-GGUF, and the app pulls it down on first run, with the base model as a fallback. The whole training run is something you could repeat on a laptop in an evening, which is most of the reason I think more people should be doing this.
Same frequency, same station, every time.
If two people tune to 96.0 MHz, they should both land on the same announcer. If they get two unrelated scenes, it stops being a place and turns into a slot machine. So nothing is random. Every station is seeded from its frequency: frequency becomes a seed becomes a station, the same way for everyone. That's also why the dial can drift between sessions while the station stays itself: the mapping moves, the seed doesn't.
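Here is a minimal sketch of that idea in Python, not the app's actual code. The names seed_for and station_at, the HOSTS and TOPICS lists, and the way drift is modeled (a dial offset that resolves to a fixed "home" frequency) are all assumptions made for illustration. It uses hashlib rather than Python's built-in hash(), because string hashes are salted per process and would not give every listener the same seed.

```python
import hashlib
import random

# Hypothetical building blocks; the real cast lives in the fine-tuned model.
HOSTS = ["Don Aurelio", "Doña Carmenza", "the chef"]
TOPICS = ["chess cats", "Jupiter weather", "a tart of echoes"]


def seed_for(home_mhz: float) -> int:
    """Derive a stable integer seed from a station's home frequency."""
    key = f"{home_mhz:.1f}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big")


def station_at(dial_mhz: float, drift_mhz: float = 0.0) -> dict:
    """Resolve a dial position to a station.

    The drift only changes which home frequency a dial position maps to.
    The seed is always computed from the home frequency, so the station
    keeps its identity wherever it shows up on the dial.
    """
    home = round(dial_mhz - drift_mhz, 1)
    rng = random.Random(seed_for(home))
    return {"home": home, "host": rng.choice(HOSTS), "topic": rng.choice(TOPICS)}
```

With this shape, station_at(96.0) returns the same station for every listener, and a station found at 98.7 one day can turn up at 88.6 the next with drift_mhz=-10.1 while keeping the same host and topic.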
Two ways to listen.
A 1B model on a plain CPU takes a moment to write a broadcast, and a blank screen reads as broken to a first-time listener. So the radio has a MODE switch on the front panel. In FAST mode, broadcasts arrive instantly from a pool the model wrote ahead of time, several editions per station, so different listeners still hear slightly different wordings. Flip it to LIVE and you watch the model write a fresh one for you, token by token, with a little "capturing signal" bar while it thinks. Either way it is the same fine-tuned model running locally, and the number-station operator, the part that answers whatever you type, is always live. I wanted people to get the instant, magical version first, and to be able to lift the hood and watch the AI actually working whenever they wanted.
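The switch can be pictured as one function with two paths. This is a sketch under assumptions, not the radio's real code: POOL, its placeholder editions, the broadcast function, and the fallback to live generation when a station has no pooled editions are all hypothetical.

```python
import random
from typing import Callable, Iterable, Iterator, Optional

# Hypothetical pool: station id -> editions the model wrote ahead of time.
# The strings are placeholders, not real transmissions.
POOL = {
    "96.0": ["placeholder edition A", "placeholder edition B"],
}


def broadcast(
    station_id: str,
    mode: str,
    live_generator: Callable[[str], Iterable[str]],
    rng: Optional[random.Random] = None,
) -> Iterator[str]:
    """Yield a broadcast as one or more text chunks.

    FAST: pick one pre-written edition and yield it whole.
    LIVE (or an empty pool): stream tokens from the model as they arrive.
    """
    editions = POOL.get(station_id, [])
    if mode == "FAST" and editions:
        picker = rng or random.Random()
        yield picker.choice(editions)
        return
    yield from live_generator(station_id)
```

Both paths return an iterator of text chunks, so the frontend can consume FAST and LIVE broadcasts the same way.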
Getting out of Gradio's shadow.
The hackathon wants a Space, and a Space usually means the default Gradio look. I wanted a radio with a wooden bezel and a CRT that breathes, not sliders and a chat box. gr.Server is the way out: it hands you a real FastAPI app instead of a prebuilt interface. I mounted my own static files, served a hand-built index.html, and streamed every broadcast over SSE to a frontend I wrote from scratch, with a tuning dial that stays exactly where you leave it, a live oscilloscope drawn off a Web Audio analyser, actual static, Morse code, and a different synthesized voice for each station. From the outside there's no Gradio in sight. Underneath it's doing all the serving and streaming.
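For the streaming half, this is roughly what SSE framing looks like on the wire. It is a sketch with assumed names: the sse_events helper, the "token" and "done" event names, and the {"text": ...} payload are illustrative, not what the app actually sends. JSON-encoding each token keeps a raw newline inside a token from breaking the frame.

```python
import json
from typing import Iterable, Iterator


def sse_events(tokens: Iterable[str], event: str = "token") -> Iterator[str]:
    """Wrap each token in a Server-Sent Events frame.

    A frame is a set of "field: value" lines ended by a blank line.
    """
    for token in tokens:
        payload = json.dumps({"text": token})
        yield f"event: {event}\ndata: {payload}\n\n"
    # Assumed end-of-broadcast marker so the frontend knows to stop listening.
    yield "event: done\ndata: {}\n\n"
```

In a FastAPI app, a generator like this could be handed to StreamingResponse with media_type="text/event-stream", and the browser would read it with EventSource.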
What I learned
The bug that taught me the most was about holding on too long.
Early on, spinning the dial fast past a few stations would lock the whole radio up. Static, forever. The cause was almost embarrassing. Every tune starts a streaming generation that holds the model. Spin away before it finishes and that generation keeps running in the background, and every new tune queues up behind a broadcast you'll never hear.
The fix was a generation counter. Every tune bumps it, and every in-flight stream checks on each token whether it's still the current one. The moment it isn't, it stops itself. There's no cancel call; the stream just notices it's been abandoned and quits. Small fix, but it's the whole difference between a radio that feels alive when you spin it and one that breaks.
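A generation counter like that can be sketched in a few lines. The Tuner class and its method names are hypothetical, and the token source stands in for the model's streaming output. The lock is an assumption that tunes and streams may run on different threads.

```python
import threading
from typing import Iterable, Iterator


class Tuner:
    """Tracks which tune is current so abandoned streams can stop themselves."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._generation = 0

    def tune(self) -> int:
        """Bump the counter and return the new generation id."""
        with self._lock:
            self._generation += 1
            return self._generation

    def is_current(self, generation: int) -> bool:
        with self._lock:
            return generation == self._generation

    def stream(self, tokens: Iterable[str]) -> Iterator[str]:
        """Start a new tune and stream tokens until a newer tune replaces it."""
        my_generation = self.tune()  # bumped right away, not on first read

        def run() -> Iterator[str]:
            for token in tokens:
                if not self.is_current(my_generation):
                    return  # abandoned: quit without any cancel call
                yield token

        return run()
```

The check runs before each token is handed on, so an abandoned stream stops at the next token and stops pulling from the model.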
The win condition should never live inside the model.
The number station at 104.7 is the one place the model isn't deciding anything that counts. It performs resistance, an operator that doesn't want to talk to you, but whether you've cracked the cipher is a plain deterministic check in Python. The model acts, the code judges. I'd make that split anywhere winning is real: let the model be expressive where it's safe to be wrong, and keep the actual gate out of its hands. Any model can be talked into letting you win if you let it hold the gate, and a small one especially.
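The split looks something like this in Python. Everything named here is an assumption for illustration: normalize, cipher_solved, handle_turn, and the expected answer passed in by the caller. The real cipher and its solution are not shown.

```python
import unicodedata
from typing import Callable


def normalize(answer: str) -> str:
    """Fold case, Unicode form, and whitespace so formatting can't decide the result."""
    text = unicodedata.normalize("NFKC", answer)
    return " ".join(text.casefold().split())


def cipher_solved(player_input: str, expected: str) -> bool:
    """The gate: a plain deterministic comparison, no model involved."""
    return normalize(player_input) == normalize(expected)


def handle_turn(
    player_input: str,
    expected: str,
    model_reply: Callable[[str], str],
) -> dict:
    """The model supplies the performance; the code alone decides 'unlocked'."""
    unlocked = cipher_solved(player_input, expected)
    reply = model_reply(player_input)  # free to bluff, stall, or be wrong
    return {"reply": reply, "unlocked": unlocked}
```

Even if the operator is talked into saying the door is open, unlocked stays False until the input matches.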
The fastest way to find a flaw is to hand it to a friend.
I'm proud of the engineering, but I didn't find half of these problems on my own. I sat people down, handed them the dial, and watched. The tuning that spun too freely, the wait that needed a loading bar, an ending that quietly broke, even small things like the static and the controls: almost every real improvement this week came from someone playing it and being kind enough to tell me what felt off. None of it was visible from the inside. If the radio feels good to use, a lot of the credit belongs to the people who tested it.
I underestimated this model, and that was the whole lesson.
The hard part of this build was direction, not capability. Once the model knew what I wanted, a billion parameters turned out to be plenty to carry a dozen voices, two languages, a running joke, and a cipher game. I came in bracing for limits I never actually hit. The thing I'll carry into the next project is that I was wrong about where the ceiling was, and I'd rather find that out by pushing a small model too far than by reaching for a bigger one out of habit.
Turn the dial. Somewhere out there a cat is about to lose at chess, a chef is making a tart of echoes, and their universe is waiting for someone to notice it's still broadcasting.
Try it: Live demo · Video · Model · Dataset
Mariana Sinisterra
GitHub: Mariana-Codebase | LinkedIn: marianasinisterra | marianacodebase.com | X: @MarianaCodebase