Tinker
Tinker is an API for efficient LoRA fine-tuning of large language models—you write simple Python scripts with your data and training logic, and Tinker handles distributed GPU training.
About Tinker
Tinker, from Thinking Machines, is a training API that lets researchers and developers focus on data and algorithms while it handles the complexity of distributed training. You write a simple loop that runs on your local machine, with your own data, environment, and loss function, and Tinker runs the computation efficiently across GPU clusters. Changing models is a single string change in your code.
- Clean abstraction, full control — Tinker shields you from distributed training complexity while preserving control over your training loop, loss functions, and algorithmic details. It's not a black box—it's a powerful abstraction.
- API-driven training primitives — Use forward_backward(), optim_step(), sample(), and save_state() to control training loops programmatically from simple Python scripts.
- Large model support — Fine-tune models from the Llama (1B–70B), Qwen (4B–235B, including MoE), DeepSeek-V3.1, GPT-OSS, and Kimi-K2 series. Vision-language model (VLM) support for image understanding with Qwen3-VL models.
- LoRA fine-tuning — Uses parameter-efficient LoRA adaptation, which matches full fine-tuning performance for many use cases while requiring less compute.
- Fault-tolerant distributed training — Hardware failures are handled transparently; training runs reliably on distributed GPU infrastructure.
- Model export — Download trained weights to use with your inference provider of choice.
To get started, read the Tinker Cookbook, run the simple Python examples, and adapt the provided recipes for supervised learning or RL workflows to your dataset.
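For orientation, here is a minimal sketch of what such a loop can look like. It assumes a ServiceClient / create_lora_training_client entry point and AdamParams-style optimizer settings in the spirit of the Cookbook examples; the prepare_batch helper and exact argument names are illustrative assumptions, while forward_backward(), optim_step(), and save_state() are the primitives described above.

```python
import tinker  # assumed import name for the Tinker client library

# Your data stays on your machine; only the heavy computation runs remotely.
my_dataset = [
    {"prompt": "What is LoRA?",
     "completion": "A parameter-efficient fine-tuning method."},
]

def prepare_batch(rows):
    """Hypothetical helper: tokenize rows and build Tinker training examples
    (the Cookbook's supervised-learning recipes show the real version)."""
    raise NotImplementedError

# Connect to the service and create a LoRA training client.
# Switching base models is a single string change here.
service_client = tinker.ServiceClient()
training_client = service_client.create_lora_training_client(
    base_model="meta-llama/Llama-3.2-1B",
)

for epoch in range(3):
    batch = prepare_batch(my_dataset)
    # The forward and backward passes execute on Tinker's GPU clusters.
    training_client.forward_backward(batch, loss_fn="cross_entropy")
    # Apply the optimizer update to the LoRA adapter weights.
    training_client.optim_step(tinker.AdamParams(learning_rate=1e-4))

# Checkpoint the run; trained weights can later be downloaded for inference elsewhere.
training_client.save_state(name="my-first-run")
```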

Pricing
Free Plan Available
New users currently receive $150 in promotional credits to get started with Tinker (valid for 1 year)
- $150 promotional credit upon signup
- Full API access to all training primitives
- Access to all supported models
- Credits valid for 1 year from activation
- Usage-based pricing after credits expire
Pay-As-You-Go
Usage-based pricing per million tokens. Rates vary by model and operation type (prefill, sample, train). Training rates range from $0.09/M tokens (Llama-3.2-1B) to $3.38/M tokens (DeepSeek-V3.1); a worked cost example follows the list below. Storage is billed at $0.031/GB per month.
- Pay only for tokens processed (prefill, sample, train operations)
- Llama models: $0.09 - $3.16 per million tokens (training)
- Qwen models: $0.22 - $3.07 per million tokens (training)
- DeepSeek-V3.1: $3.38 per million tokens (training)
- GPT-OSS models: $0.36 - $0.52 per million tokens (training)
- Kimi-K2-Thinking: $2.93 per million tokens (training)
- Storage: $0.031 per GB/month (free during beta)
- No minimum commitment or monthly fees
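For a rough sense of scale, here is an illustrative back-of-the-envelope calculation using the listed training rates; the rates come from the list above, and real bills also depend on the prefill and sample mix.

```python
# Back-of-the-envelope training-cost estimate from the listed per-million-token rates.
TRAIN_RATE_PER_M_TOKENS = {
    "Llama-3.2-1B": 0.09,    # $/M training tokens
    "DeepSeek-V3.1": 3.38,   # $/M training tokens
}

def training_cost_usd(model: str, train_tokens: int) -> float:
    """Dollars to process `train_tokens` training tokens on `model`."""
    return TRAIN_RATE_PER_M_TOKENS[model] * train_tokens / 1_000_000

print(training_cost_usd("Llama-3.2-1B", 50_000_000))   # -> 4.5
print(training_cost_usd("DeepSeek-V3.1", 50_000_000))  # -> 169.0
```

So 50M training tokens costs about $4.50 on Llama-3.2-1B and about $169 on DeepSeek-V3.1.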
Enterprise
Custom pricing and capacity planning for organizations with large-scale training needs. Contact sales for dedicated support and guaranteed capacity.
- Custom pricing based on volume and usage patterns
- Dedicated support and capacity planning
- Priority access to GPU clusters
- Guaranteed uptime and SLA
- Volume discounts available
Capabilities
Key Features
- LoRA fine-tuning (parameter-efficient; matches full fine-tuning performance for many use cases)
- Distributed, fault-tolerant training for large models (Llama 70B, Qwen 235B)
- Vision-language model (VLM) support for image understanding tasks
- API primitives: forward_backward(), optim_step(), sample(), save_state()
- Download trained model weights for external inference
- Supports supervised learning and RL workflows (RLHF, DPO); see the RL sketch after this list
- Usage-based pricing starting at $0.09 per million tokens
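To make the RL side concrete, here is a rough sketch of how the primitives compose for a policy-gradient-style update, continuing from the training_client created in the earlier sketch: completions are sampled remotely, rewards are computed locally by your own function, and a reward-weighted update goes back through forward_backward(). The sample() signature, the loss_fn name, and the build_rl_batch helper are assumptions for illustration; the Cookbook's RL recipes show the exact form.

```python
# Rough RL-style loop sketch (continues from the training_client above;
# helper names and call signatures are assumptions, not the exact Tinker API).
prompts = ["Summarize this paragraph: ...", "Solve: 17 * 24 = ?"]

def my_reward_fn(prompt: str, completion: str) -> float:
    """Your local reward function: a rubric, a verifier, or a reward model."""
    return 1.0 if len(completion) < 200 else 0.0   # toy example: prefer short answers

def build_rl_batch(prompts, completions, rewards):
    """Hypothetical helper: pack sampled completions and rewards into training examples."""
    raise NotImplementedError

for rl_step in range(10):
    # 1) Sample completions from the current policy on Tinker's infrastructure.
    completions = training_client.sample(prompts)             # signature assumed

    # 2) Score them locally with your own environment and reward logic.
    rewards = [my_reward_fn(p, c) for p, c in zip(prompts, completions)]

    # 3) Push a reward-weighted gradient step to update the LoRA weights.
    batch = build_rl_batch(prompts, completions, rewards)
    training_client.forward_backward(batch, loss_fn="importance_sampling")  # loss name assumed
    training_client.optim_step(tinker.AdamParams(learning_rate=1e-5))
```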