Instructions to use iFlytekOpenSource/Domux with libraries, inference providers, notebooks, and local apps.

Libraries

How to use iFlytekOpenSource/Domux with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="iFlytekOpenSource/Domux")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("iFlytekOpenSource/Domux")
model = AutoModelForMultimodalLM.from_pretrained("iFlytekOpenSource/Domux")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use iFlytekOpenSource/Domux with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "iFlytekOpenSource/Domux"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "iFlytekOpenSource/Domux",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/iFlytekOpenSource/Domux

SGLang

How to use iFlytekOpenSource/Domux with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "iFlytekOpenSource/Domux" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "iFlytekOpenSource/Domux",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "iFlytekOpenSource/Domux" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "iFlytekOpenSource/Domux",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use iFlytekOpenSource/Domux with Docker Model Runner:
```
docker model run hf.co/iFlytekOpenSource/Domux
```

Access Domux on Hugging Face

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

Domux is a derivative of Google's Gemma model and its weights are governed by the Gemma Terms of Use. To access the weights, you must review and agree to the Gemma Terms of Use and Prohibited Use Policy.

Domux

A lightweight, low-latency command understanding model for smart-home control.

Domux (Domux-Gemma-4-E2B-it) is a fine-tuned language model built on Gemma-4-E2B-it. It turns natural-language smart-home commands into structured, pipe-delimited slots. Training combines supervised fine-tuning (SFT) with reinforcement learning via Group Relative Policy Optimization (GRPO) and custom reward functions.
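
To make the GRPO step more concrete, here is a minimal sketch of a format-plus-content reward and of the group-relative advantage normalization that GRPO applies to a group of completions sampled for the same prompt. The function names `slot_reward` and `group_relative_advantages`, the 0.2/0.8 reward weights, and the line-matching rule are assumptions made for illustration; the reward functions actually used to train Domux live in the GitHub repository and may differ.

```python
from statistics import mean, pstdev

NUM_FIELDS = 7  # action|device|attribute|value|unit|room|floor


def slot_reward(completion: str, reference: str) -> float:
    """Hypothetical reward: 0.2 for a valid format plus up to 0.8 for matching lines."""
    lines = [ln.strip() for ln in completion.strip().splitlines() if ln.strip()]
    if not lines or any(len(ln.split("|")) != NUM_FIELDS for ln in lines):
        return 0.0  # malformed output earns nothing
    ref_lines = [ln.strip() for ln in reference.strip().splitlines() if ln.strip()]
    matched = len(set(lines) & set(ref_lines))
    denom = max(len(set(lines)), len(set(ref_lines)))
    return 0.2 + 0.8 * matched / denom


def group_relative_advantages(rewards: list, eps: float = 1e-6) -> list:
    """Standardize each reward against the mean and spread of its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


reference = "turnOn|Light|*|*|*|Living Room|*"
group = [
    "turnOn|Light|*|*|*|Living Room|*",  # correct
    "turnOn|Light|*|*|*|Bedroom|*",      # well-formed, wrong room
    "turn on the light",                 # malformed
]
rewards = [slot_reward(c, reference) for c in group]
advantages = group_relative_advantages(rewards)
print(rewards, advantages)
```

Because advantages are computed relative to the group, the correct completion is pushed up and the malformed one is pushed down without a separate value model.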

📦 Code, training scripts, the evaluation suite, the full benchmark report, and the dataset live in the GitHub repository: github.com/iflytek/domux.

✨ Key Features

Fast response — Optimized for low-latency inference on edge devices and servers.
Structured slot output — Parses free-form commands into a fixed 7-field pipe-delimited schema.
High accuracy — 98.37% result accuracy with 100% format compliance, outperforming much larger models.
Lightweight base — Built on the compact Gemma-4-E2B-it, suitable for on-device and edge deployment.
Multi-action support — Handles compound commands that map to multiple slot lines.
Generalizes across devices — Handles arbitrary device names within each category, not a fixed whitelist.

🎬 Output Format

The model outputs pipe-delimited slots with 7 fields. Use * for unspecified or don't-care fields.

action|device|attribute|value|unit|room|floor

Basic Examples

Input	Output
Turn on the living room light	`turnOn|Light|*|*|*|Living Room|*`
Set bedroom AC to 22 degrees	`set|AC|temperature|22|Celsius|Bedroom|*`
Close the curtains 20 percent	`adjustDown|Curtain|openness|20|Percent|*|*`

Complex Multi-Attribute Command

Input:

Turn on the Master Light in the Master Bedroom on the Second Floor,
set brightness to 80%, color temperature to 4000K, color to Blue, and mode to Reading.

Output:

turnOn|Light|*|*|*|Master Bedroom|Second Floor
set|Light|brightness|80|Percent|Master Bedroom|Second Floor
set|Light|colorTemperature|4000|Kelvin|Master Bedroom|Second Floor
set|Light|color|Blue|*|Master Bedroom|Second Floor
set|Light|mode|Reading|*|Master Bedroom|Second Floor

Full specification: Output Format Documentation.
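
On the consuming side, the schema above can be parsed with a few lines of Python. The sketch below is one possible approach, not part of the Domux codebase: the `Slot` dataclass and the `parse_slots` helper are hypothetical names, and mapping `*` to `None` is a design choice made here for convenience.

```python
from dataclasses import dataclass
from typing import List, Optional

FIELDS = ("action", "device", "attribute", "value", "unit", "room", "floor")


@dataclass(frozen=True)
class Slot:
    action: Optional[str]
    device: Optional[str]
    attribute: Optional[str]
    value: Optional[str]
    unit: Optional[str]
    room: Optional[str]
    floor: Optional[str]


def parse_slots(text: str) -> List[Slot]:
    """Parse model output into Slot records, one per non-empty line."""
    slots = []
    for raw in text.strip().splitlines():
        line = raw.strip()
        if not line:
            continue
        parts = [p.strip() for p in line.split("|")]
        if len(parts) != len(FIELDS):
            raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}: {line!r}")
        # '*' marks an unspecified field; represent it as None.
        slots.append(Slot(*(None if p == "*" else p for p in parts)))
    return slots


output = """turnOn|Light|*|*|*|Master Bedroom|Second Floor
set|Light|brightness|80|Percent|Master Bedroom|Second Floor
set|Light|colorTemperature|4000|Kelvin|Master Bedroom|Second Floor"""
slots = parse_slots(output)
print(slots[1].attribute, slots[1].value, slots[1].unit)
```

Raising on a wrong field count, rather than skipping the line, makes format violations visible to the caller instead of silently dropping a command.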

🏠 Supported Control Capabilities

Domux does not rely on a fixed device whitelist — it handles diverse device names through semantic understanding.

Device Type	Naming Examples	Controllable Attributes	Value Range
Light	Light, Strip Light, Spot Light, Desk Lamp	`brightness` / `color` / `colorTemperature` / `mode`	0–100% / Blue, Red, Green… / 3000–6500 K / Reading, Romance, Soft…
AC	AC, AC 1	`temperature` / `mode` / `windSpeed`	16–30 °C / Cool, Heat, Dry, Fan, Auto / Low, Medium, High
Curtain / Blind	Curtain, Blind, Sheer Curtain	`position`	0–100%
Scene Mode	Romantic Mode, Party Mode, Sleeping Mode	—	—

Actions: turnOn, turnOff, set, adjustUp, adjustDown, activate, deactivate, pause.

Spatial context: rooms (Living Room, Bedroom, Kitchen, Majlis, Prayer Room…) and floors (Ground Floor, Upstairs, Downstairs…), including numbered variants.
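
A downstream controller may want to sanity-check slot lines before acting on them. The sketch below checks the action name and the numeric ranges listed in the table above. The helper name `validate_slot_line` is hypothetical, and range-checking only `set` commands is an assumption, since `adjustUp` and `adjustDown` appear to carry relative amounts.

```python
ACTIONS = {
    "turnOn", "turnOff", "set", "adjustUp", "adjustDown",
    "activate", "deactivate", "pause",
}

# Inclusive numeric ranges copied from the capability table.
NUMERIC_RANGES = {
    "brightness": (0, 100),            # percent
    "colorTemperature": (3000, 6500),  # kelvin
    "temperature": (16, 30),           # degrees Celsius
    "position": (0, 100),              # percent
}


def validate_slot_line(line: str) -> list:
    """Return a list of problems found in one slot line (empty means OK)."""
    parts = line.strip().split("|")
    if len(parts) != 7:
        return [f"expected 7 fields, got {len(parts)}"]
    action, _device, attribute, value, _unit, _room, _floor = parts
    problems = []
    if action not in ACTIONS:
        problems.append(f"unknown action: {action}")
    # Assumption: only absolute `set` values are range-checked.
    if action == "set" and attribute in NUMERIC_RANGES:
        low, high = NUMERIC_RANGES[attribute]
        try:
            number = float(value)
        except ValueError:
            problems.append(f"{attribute} value is not numeric: {value}")
        else:
            if not low <= number <= high:
                problems.append(f"{attribute}={value} outside {low}-{high}")
    return problems


print(validate_slot_line("set|AC|temperature|22|Celsius|Bedroom|*"))
print(validate_slot_line("set|AC|temperature|45|Celsius|Bedroom|*"))
```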

📊 Benchmark

Domux was evaluated on a test set of 4,057 samples spanning four dimensions (single intent, multi-intent, omitted attributes, non-standard naming) and benchmarked against 11 mainstream models, including the Qwen3.5 series (2B-27B), the Gemma 4 series, and leading closed-source APIs (DeepSeek-V4, Claude Haiku 4.5, Gemini 3.5 Flash).

Result accuracy reaches 98.37% with 100% format compliance.

📄 Full technical report and benchmark charts: GitHub repository.

The test set and evaluation script are open-sourced under eval/ so you can reproduce the results or evaluate your own model.
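
For readers who want a feel for the two headline metrics before opening the repository, here is a rough sketch of how format compliance and result accuracy could be computed. The definitions used here (every line has 7 fields; order-sensitive exact match of all lines) and the function names are assumptions; the evaluation script under eval/ is the authoritative definition.

```python
def normalize(output: str) -> list:
    """Split output into stripped, non-empty slot lines."""
    return [ln.strip() for ln in output.strip().splitlines() if ln.strip()]


def is_well_formed(output: str) -> bool:
    """Assumed format rule: at least one line, each with exactly 7 fields."""
    lines = normalize(output)
    return bool(lines) and all(len(ln.split("|")) == 7 for ln in lines)


def score(predictions: list, references: list) -> dict:
    """Compute assumed format-compliance and exact-match accuracy rates."""
    if not predictions or len(predictions) != len(references):
        raise ValueError("predictions and references must be non-empty and aligned")
    n = len(predictions)
    well_formed = sum(is_well_formed(p) for p in predictions)
    exact = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return {"format_compliance": well_formed / n, "result_accuracy": exact / n}


predictions = [
    "turnOn|Light|*|*|*|Living Room|*",
    "set|AC|temperature|22|Celsius|Bedroom|*",
    "turn on the light",
]
references = [
    "turnOn|Light|*|*|*|Living Room|*",
    "set|AC|temperature|24|Celsius|Bedroom|*",
    "turnOn|Light|*|*|*|*|*",
]
metrics = score(predictions, references)
print(metrics)
```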

🚀 Quick Start

Hardware

The model runs in BF16 precision and needs at least 20 GB of VRAM for single-GPU deployment.
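
As a rough sanity check on that figure, the memory taken by the weights alone can be estimated from the parameter count. The sketch below uses the 5B parameter count from this page's model-size listing and 2 bytes per BF16 parameter; attributing the rest of the budget to KV cache, activations, and runtime overhead is an assumption, not a measured breakdown.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed to hold the weights alone, in GiB (BF16 = 2 bytes each)."""
    return num_params * bytes_per_param / 1024**3


weights_gib = weight_memory_gib(5e9)
print(f"weights: {weights_gib:.1f} GiB")  # about 9.3 GiB
```

The weights therefore account for roughly half of the stated requirement, which leaves headroom for the serving engine's cache and buffers.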

Download

# Hugging Face
git lfs install
git clone https://huggingface.co/iFlytekOpenSource/Domux

Inference with vLLM

# Install vLLM (shell)
pip install "vllm==0.22.0"

# Run offline inference (Python)
from vllm import LLM, SamplingParams

# Load the weights in BF16; temperature 0.0 selects greedy decoding
llm = LLM(model="iFlytekOpenSource/Domux", dtype="bfloat16")
sampling = SamplingParams(temperature=0.0, max_tokens=256)

prompt = "Turn on the Master Light in the Master Bedroom on the Second Floor, set brightness to 80%, color temperature to 4000K, color to Blue, and mode to Reading."
# llm.chat applies the model's chat template to the message list
outputs = llm.chat([{"role": "user", "content": prompt}], sampling)
print(outputs[0].outputs[0].text)

# Output:
# turnOn|Light|*|*|*|Master Bedroom|Second Floor
# set|Light|brightness|80|Percent|Master Bedroom|Second Floor
# set|Light|colorTemperature|4000|Kelvin|Master Bedroom|Second Floor
# set|Light|color|Blue|*|Master Bedroom|Second Floor
# set|Light|mode|Reading|*|Master Bedroom|Second Floor

Serve as an OpenAI-compatible API

# Option 1: vLLM
python -m vllm.entrypoints.openai.api_server \
  --model iFlytekOpenSource/Domux \
  --served-model-name domux \
  --host 0.0.0.0 --port 8000 \
  --dtype bfloat16 --max-model-len 2048 --gpu-memory-utilization 0.9

# Option 2: SGLang (both options bind port 8000, so start only one)
pip install "sglang[all]==0.5.12"
python -m sglang.launch_server \
  --model-path iFlytekOpenSource/Domux \
  --served-model-name domux \
  --host 0.0.0.0 --port 8000 \
  --dtype bfloat16 --context-length 2048

# Query the running server with the OpenAI Python client
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
response = client.chat.completions.create(
    model="domux",
    messages=[{"role": "user", "content": "Turn on the bedroom light and set brightness to 60%"}],
    temperature=0.0,
)
print(response.choices[0].message.content)

📄 License

The model weights are a derivative of Google's Gemma and are made available under, and your use of them is governed by, the Gemma Terms of Use and the Gemma Prohibited Use Policy. "Gemma" is a trademark of Google LLC.

The accompanying source code (training scripts, reward plugins, evaluation tooling) in the GitHub repository is licensed under Apache-2.0.

🙏 Acknowledgments

Base model: Gemma
Training framework: ModelScope-Swift
Experiment tracking: SwanLab

Citation

@misc{domux2026,
  title  = {Domux: A Lightweight Low-Latency Command Understanding Model for Smart-Home Control},
  author = {iFLYTEK CO., LTD.},
  year   = {2026},
  url    = {https://github.com/iflytek/domux}
}

Safetensors
Model size: 5B params
Tensor type: BF16

Model tree for iFlytekOpenSource/Domux
Base model: google/gemma-4-E2B
Finetuned: google/gemma-4-E2B-it
Finetuned (252): this model