Welcome

About us

TeichAI is focused on one mission: making frontier AI capabilities accessible through open-source model distillation.

We take the reasoning capabilities of proprietary models like GPT-5, Claude 4.5 Opus, Gemini 3 Pro Preview, and DeepSeek v3.2, and transfer that knowledge to smaller, open-source models that you can run locally on your own hardware.

What is Model Distillation?

Knowledge distillation is a technique where a smaller “student” model learns to replicate the behavior of a larger “teacher” model. In our case:

  • Teacher Models: Frontier proprietary models (GPT-5, Claude 4.5, Gemini 3, etc.)
  • Student Models: Open-source base models (Qwen3, Nemotron, etc.)
  • Knowledge Transfer: We query the teacher models to generate high-quality reasoning traces, then fine-tune the student models on these outputs
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ DISTILLATION PIPELINE β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ β”‚
β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚ β”‚ Teacher β”‚ Query β”‚ Dataset β”‚ β”‚
β”‚ β”‚ (GPT-5, β”‚ ──────► β”‚ (Reasoning β”‚ β”‚
β”‚ β”‚ Claude) β”‚ β”‚ Traces) β”‚ β”‚
β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚ β”‚ β”‚
β”‚ β–Ό β”‚
β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚ β”‚ Student β”‚ SFT β”‚ Distilled β”‚ β”‚
β”‚ β”‚ (Qwen3, β”‚ ◄────── β”‚ Model β”‚ β”‚
β”‚ β”‚ Nemotron) β”‚ β”‚ β”‚ β”‚
β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚ β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Our Approach

High-Quality Datasets

We generate reasoning datasets by querying frontier models with high reasoning effort settings, capturing their step-by-step thought processes.

Efficient Training

Using Unsloth and LoRA, we fine-tune models efficiently on consumer GPUs (as little as 16 GB of VRAM for a 4B-parameter model).
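LoRA's savings come from replacing each dense weight update with a pair of low-rank matrices. A back-of-the-envelope calculation (the layer size and rank below are illustrative, not taken from a specific model):

```python
# For one d_out x d_in projection, full fine-tuning updates every entry
# of the weight matrix, while LoRA trains only the low-rank factors
# B (d_out x r) and A (r x d_in).
d_in, d_out, r = 4096, 4096, 16

full_update = d_out * d_in           # dense update: 16,777,216 params
lora_update = d_out * r + r * d_in   # low-rank adapter: 131,072 params

print(full_update // lora_update)    # 128x fewer trainable params per layer
```

Fewer trainable parameters means smaller optimizer state and gradients, which is most of what makes fine-tuning fit in consumer VRAM.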

Multiple Formats

Every model is released in both HuggingFace Transformers format (16-bit) and GGUF quantizations for local deployment.
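The memory difference between the two formats is easy to estimate. Assuming roughly 2 bytes per weight for the 16-bit checkpoint and about 4.5 bits per weight for a mid-range 4-bit GGUF quant (the exact figure varies by quantization type):

```python
params = 4_000_000_000  # a 4B-parameter model

fp16_gb = params * 2 / 1e9      # 16-bit checkpoint: ~8.0 GB
q4_gb = params * 4.5 / 8 / 1e9  # ~4.5 bits/weight 4-bit quant: ~2.25 GB

print(round(fp16_gb, 2), round(q4_gb, 2))
```

These are weight sizes only; runtime memory also includes the KV cache and activations, so treat the numbers as lower bounds.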

Full Transparency

All datasets, training scripts, and model weights are publicly available. You can replicate everything we do.