High-Quality Datasets
We generate reasoning datasets by querying frontier models with high reasoning effort settings, capturing their step-by-step thought processes.
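The capture step described above can be sketched as follows. `query_teacher` is a hypothetical stand-in for the real provider API call (which would set the model's reasoning-effort option); the stubbed response only illustrates the shape of one dataset record.

```python
import json

# Sketch of capturing a reasoning trace as one dataset record.
# `query_teacher` is a hypothetical stand-in for a real API call to a
# frontier model with a high reasoning-effort setting.
def query_teacher(prompt: str) -> dict:
    # Stubbed response; a real implementation would call the provider's API.
    return {
        "reasoning": "Step 1: ... Step 2: ...",
        "answer": "42",
    }

def make_record(prompt: str) -> dict:
    response = query_teacher(prompt)
    # Store the chain of thought alongside the final answer so the
    # student model can be trained to reproduce both.
    return {
        "prompt": prompt,
        "reasoning": response["reasoning"],
        "answer": response["answer"],
    }

record = make_record("What is 6 * 7?")
print(json.dumps(record, indent=2))
```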
TeichAI is focused on one mission: making frontier AI capabilities accessible through open-source model distillation.
We take the reasoning capabilities of proprietary models like GPT-5, Claude 4.5 Opus, Gemini 3 Pro Preview, and DeepSeek v3.2, and transfer that knowledge to smaller, open-source models that you can run locally on your own hardware.
Knowledge distillation is a technique where a smaller "student" model learns to replicate the behavior of a larger "teacher" model. In our case:
┌────────────────── DISTILLATION PIPELINE ──────────────────┐
│                                                           │
│   ┌──────────────┐    Query     ┌──────────────┐          │
│   │   Teacher    │ ──────────►  │   Dataset    │          │
│   │  (GPT-5,     │              │  (Reasoning  │          │
│   │   Claude)    │              │   Traces)    │          │
│   └──────────────┘              └──────┬───────┘          │
│                                        │                  │
│                                        ▼                  │
│   ┌──────────────┐     SFT      ┌──────────────┐          │
│   │   Student    │ ──────────►  │  Distilled   │          │
│   │  (Qwen3,     │              │    Model     │          │
│   │   Nemotron)  │              │              │          │
│   └──────────────┘              └──────────────┘          │
│                                                           │
└───────────────────────────────────────────────────────────┘
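The SFT step in the pipeline trains the student to assign high probability to the teacher's tokens. A minimal sketch of that objective, with toy logits standing in for real model outputs (no real model or tokenizer is involved here):

```python
import math

# Minimal sketch of the SFT objective behind distillation: the student
# is trained with cross-entropy to assign high probability to the
# teacher's tokens. The logits below are toy values for illustration.
def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def sft_loss(student_logits, teacher_token_ids):
    # Average negative log-likelihood of the teacher's tokens under
    # the student's predicted next-token distributions.
    total = 0.0
    for logits, target in zip(student_logits, teacher_token_ids):
        probs = softmax(logits)
        total += -math.log(probs[target])
    return total / len(teacher_token_ids)

# Two positions, vocabulary of 4 tokens; the teacher emitted token 2, then 0.
logits = [[0.1, 0.2, 2.0, -1.0], [1.5, 0.0, 0.3, 0.2]]
loss = sft_loss(logits, [2, 0])
print(round(loss, 4))
```

Minimizing this loss over many teacher traces is what transfers the teacher's step-by-step behavior into the student's weights.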
Efficient Training
Using Unsloth and LoRA, we can fine-tune models efficiently on consumer GPUs (as little as 16GB VRAM for 4B models).
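A back-of-the-envelope sketch of why LoRA makes this feasible on consumer GPUs: instead of updating a full weight matrix, training touches only two low-rank factors. The hidden size and rank below are illustrative values, not taken from any specific model config.

```python
# LoRA replaces the update to a full d_out x d_in weight matrix with
# two trainable low-rank factors: B (d_out x r) and A (r x d_in).
# Dimensions and rank here are illustrative.
def lora_trainable_params(d_out: int, d_in: int, r: int) -> int:
    return d_out * r + r * d_in

d_out = d_in = 4096   # a typical transformer hidden size
r = 16                # a common LoRA rank

full = d_out * d_in
lora = lora_trainable_params(d_out, d_in, r)
print(f"full: {full:,}  lora: {lora:,}  ratio: {full // lora}x")
# → full: 16,777,216  lora: 131,072  ratio: 128x
```

A 128x reduction in trainable parameters per matrix (plus keeping the base weights frozen, often in 4-bit) is what brings fine-tuning within reach of a 16GB card.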
Multiple Formats
Every model is released in both HuggingFace Transformers format (16-bit) and GGUF quantizations for local deployment.
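The idea behind the GGUF quantizations can be sketched with a generic symmetric int4 scheme: store weights as small integers plus a per-block scale, trading a little precision for a large reduction in size. This is an illustration of the principle, not the exact GGUF block format.

```python
# Generic symmetric int4 quantization sketch (illustrative, not the
# exact GGUF format): each block of weights becomes small integers
# plus one floating-point scale.
def quantize_block(weights, bits=4):
    qmax = 2 ** (bits - 1) - 1           # 7 for int4
    scale = max(abs(w) for w in weights) / qmax or 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_block(q, scale):
    return [v * scale for v in q]

w = [0.12, -0.5, 0.33, 0.07]
q, s = quantize_block(w)
restored = dequantize_block(q, s)
max_err = max(abs(a - b) for a, b in zip(w, restored))
print(q, round(max_err, 4))
```

Four bits per weight instead of sixteen is roughly a 4x smaller file, which is why the GGUF releases are the practical choice for local deployment.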
Full Transparency
All datasets, training scripts, and model weights are publicly available. You can replicate everything we do.