Chat Templates
Overview
Chat templates control how conversations are formatted for model input. Using the correct template is essential for proper model behavior.
Available Templates
| Template | Model Type | Think Tags |
|---|---|---|
| `qwen3` | Base Qwen3 | Optional |
| `qwen3-thinking` | Thinking variants | Required |
| `qwen3-instruct` | Instruct variants | No |
Setting Chat Templates
```python
from unsloth.chat_templates import get_chat_template

# Apply the chat template to the tokenizer
tokenizer = get_chat_template(tokenizer, chat_template="qwen3-thinking")
```
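To confirm the template took effect, you can inspect the tokenizer's `chat_template` attribute (standard Hugging Face behavior) and render a throwaway conversation; this is a quick sanity check, not part of the required setup:

```python
# Inspect the Jinja string that get_chat_template installed
print(tokenizer.chat_template[:300])

# Render a throwaway conversation to see the tokens the template emits
demo = [{"role": "user", "content": "Hi"}]
print(tokenizer.apply_chat_template(demo, tokenize=False, add_generation_prompt=True))
```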
Template Formats

qwen3-thinking

For models that output explicit reasoning in `<think>` tags.
Input format:
```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
What is 15% of 240?<|im_end|>
<|im_start|>assistant
<think>
To find 15% of 240, I need to multiply 240 by 0.15.
240 × 0.15 = 36
</think>

The answer is 36.<|im_end|>
```

Use with: Qwen3-4B-Thinking-2507, Qwen3-30B-A3B-Thinking-2507, etc.
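A common post-processing step with thinking models is separating the reasoning from the final answer. Below is a minimal sketch using Python's `re` module; `raw_output` is a hypothetical stand-in for your decoded generation:

```python
import re

# Stand-in for a decoded generation from a thinking model
raw_output = "<think>\nTo find 15% of 240, multiply 240 by 0.15.\n</think>\nThe answer is 36."

# Pull out the reasoning between <think>...</think>, if present
match = re.search(r"<think>(.*?)</think>", raw_output, flags=re.DOTALL)
reasoning = match.group(1).strip() if match else ""

# Strip the think block to leave only the user-facing answer
answer = re.sub(r"<think>.*?</think>", "", raw_output, count=1, flags=re.DOTALL).strip()

print("Reasoning:", reasoning)
print("Answer:", answer)  # -> Answer: The answer is 36.
```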
qwen3-instruct

For models that provide direct answers without explicit reasoning.
Input format:
```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
What is 15% of 240?<|im_end|>
<|im_start|>assistant
The answer is 36.<|im_end|>
```

Use with: Qwen3-4B-Instruct-2507, etc.
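To see exactly what a template produces for your own messages, render a conversation without tokenizing; this uses the standard `apply_chat_template` API and should reproduce the format above:

```python
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is 15% of 240?"},
]

# tokenize=False returns the raw prompt string instead of token IDs;
# add_generation_prompt=True appends the <|im_start|>assistant prefix
print(tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
```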
qwen3

Base template compatible with both thinking and non-thinking outputs.
Use with: Qwen3-4B, Qwen3-8B, etc.
Matching Template to Model
| Model Name Contains | Template |
|---|---|
| Thinking | `qwen3-thinking` |
| Instruct | `qwen3-instruct` |
| Neither | `qwen3` |
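If you select models programmatically, the table above reduces to a few string checks. `pick_template` below is a hypothetical helper, not an Unsloth API:

```python
def pick_template(model_name: str) -> str:
    """Map a Qwen3 model name to a chat template per the table above."""
    if "Thinking" in model_name:
        return "qwen3-thinking"
    if "Instruct" in model_name:
        return "qwen3-instruct"
    return "qwen3"

assert pick_template("Qwen3-4B-Thinking-2507") == "qwen3-thinking"
assert pick_template("Qwen3-4B-Instruct-2507") == "qwen3-instruct"
assert pick_template("Qwen3-8B") == "qwen3"
```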
Formatting Training Data
```python
def formatting_prompts_func(examples):
    convos = examples["messages"]
    texts = [
        tokenizer.apply_chat_template(
            convo, tokenize=False, add_generation_prompt=False
        )
        for convo in convos
    ]
    return {"text": texts}

train_dataset = raw_dataset.map(formatting_prompts_func, batched=True)
```
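`formatting_prompts_func` assumes each row carries a `messages` column of role/content dicts. Here is a minimal sketch of a compatible dataset, built only for illustration:

```python
from datasets import Dataset

# One conversation in the role/content schema the function expects
raw_dataset = Dataset.from_list([
    {
        "messages": [
            {"role": "user", "content": "What is 15% of 240?"},
            {"role": "assistant", "content": "The answer is 36."},
        ]
    }
])

train_dataset = raw_dataset.map(formatting_prompts_func, batched=True)
print(train_dataset[0]["text"])  # Spot-check the fully formatted string
```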
Inference with Templates

```python
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain quantum entanglement."},
]

# Format for model input
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,  # Add the assistant prefix
    return_tensors="pt",
)

# Generate
outputs = model.generate(inputs, max_new_tokens=1024)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
```
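Note that decoding `outputs[0]` returns the prompt plus the completion. To keep only the newly generated text, slice off the prompt tokens first; a sketch reusing the `inputs` tensor from above:

```python
# outputs[0] holds prompt tokens followed by generated tokens;
# skip the prompt portion so only the model's reply is decoded
new_tokens = outputs[0][inputs.shape[-1]:]
response = tokenizer.decode(new_tokens, skip_special_tokens=True)
```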
Custom Templates

You can also define your own chat template as a Jinja string:
```python
custom_template = """{% for message in messages %}<|{{ message.role }}|>{{ message.content }}<|end|>{% endfor %}<|assistant|>"""

tokenizer.chat_template = custom_template
```
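Once assigned, the custom template renders through the same API as the built-in ones; a quick check of its output:

```python
messages = [{"role": "user", "content": "Hello"}]
print(tokenizer.apply_chat_template(messages, tokenize=False))
# -> <|user|>Hello<|end|><|assistant|>
```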
Troubleshooting

Model outputs gibberish

Check that your chat template matches the model type; Thinking models need the `qwen3-thinking` template.
Missing `<think>` tags in output

- Verify you're using a Thinking model variant
- Check that the training data includes `<think>` tags (a quick check follows this list)
- Ensure the `qwen3-thinking` template is applied
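For the training-data check, a quick scan over the formatted text (this assumes the `train_dataset` built in Formatting Training Data):

```python
# Count how many formatted examples actually contain think tags
with_tags = sum("<think>" in t for t in train_dataset["text"])
print(f"{with_tags}/{len(train_dataset)} examples contain <think> tags")
```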
Double special tokens

If you see repeated `<|im_start|>` tokens or similar:

- Pass `add_special_tokens=False` when manually tokenizing (see the sketch below)
- Ensure the template isn't applied twice
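A sketch of the first fix: when the template has already wrapped the text in special tokens, stop the tokenizer from adding its own on top:

```python
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# The template already inserted <|im_start|>/<|im_end|>, so don't let
# the tokenizer prepend special tokens a second time
inputs = tokenizer(prompt, add_special_tokens=False, return_tensors="pt")
```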