Domux (Domux-Gemma-4-E2B-it) is a fine-tuned language model built on Gemma-4-E2B-it. It turns natural-language smart-home commands into structured, pipe-delimited slots. Training combines supervised fine-tuning (SFT) with reinforcement learning via Group Relative Policy Optimization (GRPO) and custom reward functions.
📦 Code, training scripts, evaluation suite, full benchmark report and dataset live in the GitHub repository: github.com/iflytek/domux.
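As a rough illustration of the GRPO step mentioned above, the sketch below scores a group of sampled completions for one prompt and normalizes the rewards within that group. The `format_reward` function is hypothetical: it only checks for the 7-field layout described in the Output Format section and is not taken from the Domux training code. The actual reward functions live in the GitHub repository and may differ. Using the population standard deviation and the `eps` term are also choices made for this sketch.

```python
from statistics import mean, pstdev
from typing import List


def format_reward(completion: str) -> float:
    """Hypothetical reward: 1.0 if every non-empty line has 7 fields, else 0.0."""
    lines = [ln for ln in completion.splitlines() if ln.strip()]
    ok = bool(lines) and all(len(ln.split("|")) == 7 for ln in lines)
    return 1.0 if ok else 0.0


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize rewards within one group of completions for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; eps avoids division by zero
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four made-up completions for the same prompt.
group = [
    "turnOn|Light|*|*|*|Living Room|*",
    "turnOn|Light|Living Room",
    "turnOn|Light|*|*|*|Living Room|*",
    "Sure! Turning on the light.",
]
rewards = [format_reward(c) for c in group]
advantages = group_relative_advantages(rewards)
print(rewards)  # [1.0, 0.0, 1.0, 0.0]
print([round(a, 2) for a in advantages])
```

Well-formed completions end up with a positive advantage and malformed ones with a negative advantage, which is the signal the policy update uses.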
✨ Key Features
- Fast response — Optimized for low-latency inference on edge devices and servers.
- Structured slot output — Parses free-form commands into a fixed 7-field pipe-delimited schema.
- High accuracy — 98.37% result accuracy with 100% format compliance, outperforming much larger models.
- Lightweight base — Built on the compact Gemma-4-E2B-it, suitable for on-device and edge deployment.
- Multi-action support — Handles compound commands that map to multiple slot lines.
- Generalizes across devices — Handles arbitrary device names within each category, not a fixed whitelist.
🎬 Output Format
The model outputs pipe-delimited slots with 7 fields. Use * for unspecified or don't-care fields.
action|device|attribute|value|unit|room|floor
Basic Examples
| Input | Output |
|---|---|
| Turn on the living room light | `turnOn\|Light\|*\|*\|*\|Living Room\|*` |
| Set bedroom AC to 22 degrees | `set\|AC\|temperature\|22\|Celsius\|Bedroom\|*` |
| Close the curtains 20 percent | `adjustDown\|Curtain\|openness\|20\|Percent\|*\|*` |
Complex Multi-Attribute Command
Input:
Turn on the Master Light in the Master Bedroom on the Second Floor,
set brightness to 80%, color temperature to 4000K, color to Blue, and mode to Reading.
Output:
turnOn|Light|*|*|*|Master Bedroom|Second Floor
set|Light|brightness|80|Percent|Master Bedroom|Second Floor
set|Light|colorTemperature|4000|Kelvin|Master Bedroom|Second Floor
set|Light|color|Blue|*|Master Bedroom|Second Floor
set|Light|mode|Reading|*|Master Bedroom|Second Floor
Full specification: Output Format Documentation.
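A minimal parsing sketch for the format above, assuming one slot per line and `*` meaning "unspecified". The `Slot` dataclass and the function names are illustrative and not part of the Domux repository; the linked specification remains the authority on edge cases.

```python
from dataclasses import dataclass
from typing import List, Optional

FIELDS = ("action", "device", "attribute", "value", "unit", "room", "floor")


@dataclass(frozen=True)
class Slot:
    action: Optional[str]
    device: Optional[str]
    attribute: Optional[str]
    value: Optional[str]
    unit: Optional[str]
    room: Optional[str]
    floor: Optional[str]


def parse_slot_line(line: str) -> Slot:
    """Parse one pipe-delimited line; '*' becomes None."""
    parts = [p.strip() for p in line.split("|")]
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}: {line!r}")
    return Slot(*[None if p == "*" else p for p in parts])


def parse_slots(text: str) -> List[Slot]:
    """Parse a full model response, one slot per non-empty line."""
    return [parse_slot_line(ln) for ln in text.splitlines() if ln.strip()]


example = """turnOn|Light|*|*|*|Master Bedroom|Second Floor
set|Light|brightness|80|Percent|Master Bedroom|Second Floor"""
slots = parse_slots(example)
print(slots[1].attribute, slots[1].value, slots[1].unit)  # brightness 80 Percent
```

Values are kept as strings because the `value` field can hold either numbers (`80`) or names (`Blue`, `Reading`).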
🏠 Supported Control Capabilities
Domux does not rely on a fixed device whitelist — it handles diverse device names through semantic understanding.
| Device Type | Naming Examples | Controllable Attributes | Value Range |
|---|---|---|---|
| Light | Light, Strip Light, Spot Light, Desk Lamp | brightness / color / colorTemperature / mode | 0–100% / Blue, Red, Green… / 3000–6500 K / Reading, Romance, Soft… |
| AC | AC, AC 1 | temperature / mode / windSpeed | 16–30 °C / Cool, Heat, Dry, Fan, Auto / Low, Medium, High |
| Curtain / Blind | Curtain, Blind, Sheer Curtain | position | 0–100% |
| Scene Mode | Romantic Mode, Party Mode, Sleeping Mode | — | — |
Actions: turnOn, turnOff, set, adjustUp, adjustDown, activate, deactivate, pause.
Spatial context: rooms (Living Room, Bedroom, Kitchen, Majlis, Prayer Room…) and floors (Ground Floor, Upstairs, Downstairs…), including numbered variants.
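Downstream code may want to sanity-check a slot line before acting on it. The sketch below checks the action name against the list above and, for `set` commands, checks numeric values against the ranges in the table. It is an illustration only: the lookup is keyed on the device strings that appear in the examples (`Light`, `AC`, `Curtain`), which is an assumption, since the model can emit other device names.

```python
from typing import List

ACTIONS = {"turnOn", "turnOff", "set", "adjustUp", "adjustDown",
           "activate", "deactivate", "pause"}

# (device, attribute) -> inclusive numeric range, copied from the table above.
NUMERIC_RANGES = {
    ("Light", "brightness"): (0, 100),
    ("Light", "colorTemperature"): (3000, 6500),
    ("AC", "temperature"): (16, 30),
    ("Curtain", "position"): (0, 100),
    # The basic examples use "openness" for curtains; accepted here as well.
    ("Curtain", "openness"): (0, 100),
}


def validate(line: str) -> List[str]:
    """Return a list of problems found in one slot line (empty if none)."""
    parts = line.split("|")
    if len(parts) != 7:
        return [f"expected 7 fields, got {len(parts)}"]
    action, device, attribute, value = parts[:4]
    problems = []
    if action not in ACTIONS:
        problems.append(f"unknown action {action!r}")
    bounds = NUMERIC_RANGES.get((device, attribute))
    if action == "set" and bounds is not None:
        try:
            number = float(value)
        except ValueError:
            problems.append(f"non-numeric value {value!r} for {attribute}")
        else:
            low, high = bounds
            if not low <= number <= high:
                problems.append(f"{attribute}={value} outside {low}-{high}")
    return problems


print(validate("set|AC|temperature|22|Celsius|Bedroom|*"))  # []
print(validate("set|AC|temperature|45|Celsius|Bedroom|*"))  # ['temperature=45 outside 16-30']
```

Range checks are limited to `set` because `adjustUp` and `adjustDown` carry a relative amount rather than a target value.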
📊 Benchmark
Domux was evaluated on a test set of 4,057 samples spanning four dimensions (single intent, multi-intent, omitted attributes, non-standard naming) and benchmarked against 11 mainstream models, including the Qwen3.5 series (2B-27B), the Gemma 4 series, and leading closed-source APIs (DeepSeek-V4, Claude Haiku 4.5, Gemini 3.5 Flash).
Result accuracy reaches 98.37% with 100% format compliance.
📄 Full technical report and benchmark charts: GitHub repository.
The test set and evaluation script are open-sourced under eval/ so you can reproduce the results or evaluate your own model.
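For orientation, here is one way the two headline metrics could be computed. The definitions are assumptions made for this sketch: "format compliance" is taken to mean that every output line has exactly 7 pipe-delimited fields, and "result accuracy" is taken to mean an exact match with the reference after trimming whitespace. The evaluation script under eval/ is the authoritative implementation and may define them differently.

```python
from typing import List, Tuple


def is_well_formed(output: str) -> bool:
    """True if the output has at least one line and every line has 7 fields."""
    lines = [ln for ln in output.splitlines() if ln.strip()]
    return bool(lines) and all(len(ln.split("|")) == 7 for ln in lines)


def normalize(output: str) -> Tuple[str, ...]:
    """Strip whitespace and drop empty lines so formatting noise is ignored."""
    return tuple(ln.strip() for ln in output.splitlines() if ln.strip())


def score(predictions: List[str], references: List[str]) -> Tuple[float, float]:
    """Return (format_compliance, result_accuracy) as fractions in [0, 1]."""
    if len(predictions) != len(references) or not predictions:
        raise ValueError("need equally sized, non-empty lists")
    n = len(predictions)
    well_formed = sum(is_well_formed(p) for p in predictions)
    exact = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return well_formed / n, exact / n


# Made-up predictions: the second one is missing the floor field.
preds = [
    "turnOn|Light|*|*|*|Living Room|*",
    "set|AC|temperature|22|Celsius|Bedroom",
]
refs = [
    "turnOn|Light|*|*|*|Living Room|*",
    "set|AC|temperature|22|Celsius|Bedroom|*",
]
print(score(preds, refs))  # (0.5, 0.5)
```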
🚀 Quick Start
Hardware
The model runs in BF16 precision and requires at least 20 GB of VRAM for single-GPU deployment.
Download
# Hugging Face
git lfs install
git clone https://huggingface.co/iFlytekOpenSource/Domux
Inference with vLLM
# Shell: install vLLM
pip install "vllm==0.22.0"
# Python: offline inference
from vllm import LLM, SamplingParams
llm = LLM(model="iFlytekOpenSource/Domux", dtype="bfloat16")
sampling = SamplingParams(temperature=0.0, max_tokens=256)
prompt = "Turn on the Master Light in the Master Bedroom on the Second Floor, set brightness to 80%, color temperature to 4000K, color to Blue, and mode to Reading."
output = llm.chat([{"role": "user", "content": prompt}], sampling)
print(output[0].outputs[0].text)
# Output:
# turnOn|Light|*|*|*|Master Bedroom|Second Floor
# set|Light|brightness|80|Percent|Master Bedroom|Second Floor
# set|Light|colorTemperature|4000|Kelvin|Master Bedroom|Second Floor
# set|Light|color|Blue|*|Master Bedroom|Second Floor
# set|Light|mode|Reading|*|Master Bedroom|Second Floor
Serve as an OpenAI-compatible API
# vLLM
python -m vllm.entrypoints.openai.api_server \
--model iFlytekOpenSource/Domux \
--served-model-name domux \
--host 0.0.0.0 --port 8000 \
--dtype bfloat16 --max-model-len 2048 --gpu-memory-utilization 0.9
# SGLang
pip install "sglang[all]==0.5.12"
python -m sglang.launch_server \
--model-path iFlytekOpenSource/Domux \
--host 0.0.0.0 --port 8000 \
--dtype bfloat16 --context-length 2048
# Python client
from openai import OpenAI
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
response = client.chat.completions.create(
model="domux",
messages=[{"role": "user", "content": "Turn on the bedroom light and set brightness to 60%"}],
temperature=0.0,
)
print(response.choices[0].message.content)
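Once the response text is available, an application typically needs to turn the slot lines into per-device commands. The sketch below merges lines that address the same device, room, and floor into one dictionary. The function name, the `power` key, and the sample `content` string are illustrative assumptions; the string stands in for what the server might return for the request above and is not recorded model output.

```python
from collections import defaultdict
from typing import Dict, Tuple

Target = Tuple[str, str, str]  # (device, room, floor)


def group_by_target(content: str) -> Dict[Target, Dict[str, str]]:
    """Merge slot lines that address the same device/room/floor into one dict."""
    grouped: Dict[Target, Dict[str, str]] = defaultdict(dict)
    for line in content.splitlines():
        if not line.strip():
            continue
        parts = [p.strip() for p in line.split("|")]
        if len(parts) != 7:
            raise ValueError(f"malformed slot line: {line!r}")
        action, device, attribute, value, unit, room, floor = parts
        state = grouped[(device, room, floor)]
        if action in ("turnOn", "turnOff"):
            state["power"] = "on" if action == "turnOn" else "off"
        elif action == "set":
            state[attribute] = value if unit == "*" else f"{value} {unit}"
        else:
            # Other actions are kept verbatim; applying them is device-specific.
            state[f"{action}:{attribute}"] = value
    return dict(grouped)


# Stand-in for response.choices[0].message.content (assumed, not real output).
content = (
    "turnOn|Light|*|*|*|Bedroom|*\n"
    "set|Light|brightness|60|Percent|Bedroom|*"
)
result = group_by_target(content)
print(result)
# {('Light', 'Bedroom', '*'): {'power': 'on', 'brightness': '60 Percent'}}
```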
📄 License
The model weights are a derivative of Google's Gemma and are made available under, and your use of them is governed by, the Gemma Terms of Use and the Gemma Prohibited Use Policy. "Gemma" is a trademark of Google LLC.
The accompanying source code (training scripts, reward plugins, evaluation tooling) in the GitHub repository is licensed under Apache-2.0.
🙏 Acknowledgments
- Base model: Gemma
- Training framework: ModelScope-Swift
- Experiment tracking: SwanLab
Citation
@misc{domux2026,
title = {Domux: A Lightweight Low-Latency Command Understanding Model for Smart-Home Control},
author = {iFLYTEK CO., LTD.},
year = {2026},
url = {https://github.com/iflytek/domux}
}