Unsloth
Fine-tune and train LLMs up to 30x faster with 90% less memory usage through optimized GPU kernels and handwritten math derivations.
At a Glance
Pricing
Free, open-source standard version of Unsloth
Engagement
Available On
About Unsloth
Unsloth is an open-source library that dramatically accelerates LLM fine-tuning and training by manually deriving compute-heavy math steps and handwriting GPU kernels. It enables users to train custom models in 24 hours instead of 30 days, achieving up to 30x faster performance than Flash Attention 2 (FA2) while using 90% less memory. The tool supports a wide range of NVIDIA GPUs from Tesla T4 to H100, with portability to AMD and Intel GPUs.
- Massive Speed Improvements - Achieve 2x faster training on a single GPU with the free version, scaling up to 30x faster on multi-GPU systems compared to Flash Attention 2.
- Significant Memory Reduction - Use up to 90% less VRAM than traditional methods, enabling training of larger models on existing hardware without upgrades.
- Broad Model Support - Compatible with popular models including Mistral, Gemma, and Llama 1, 2, and 3, with support for 4-bit and 16-bit LoRA fine-tuning.
- 500K Context Fine-tuning - Train models with context lengths of up to 500K tokens for long-context use cases.
- FP8 Reinforcement Learning - FP8 RL training, including GRPO, for efficient reinforcement learning workflows.
- Docker Support - Official Docker images for easy deployment in containerized training environments.
- Multi-GPU and Multi-Node - The Enterprise tier scales to 32x GPUs with multi-node support for large-scale training operations.
- Accuracy Improvements - Enterprise users can achieve up to 30% accuracy improvements alongside the speed gains.
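To make the memory-reduction claim concrete, here is a back-of-envelope sketch of why 4-bit LoRA fine-tuning needs so much less VRAM than a full 16-bit fine-tune. The byte counts and the ~1% adapter fraction are illustrative assumptions, not Unsloth measurements, and activations are ignored:

```python
# Rough VRAM arithmetic for fine-tuning a model with n_params_b billion
# parameters. All constants below are assumptions for illustration only.

def full_finetune_gib(n_params_b: float) -> float:
    # 16-bit weights (2 B) + 16-bit gradients (2 B)
    # + 32-bit Adam first and second moments (4 B + 4 B) per parameter
    bytes_per_param = 2 + 2 + 4 + 4
    return n_params_b * 1e9 * bytes_per_param / 2**30

def qlora_gib(n_params_b: float, adapter_frac: float = 0.01) -> float:
    # 4-bit frozen base weights (0.5 B per parameter); gradients and
    # optimizer state exist only for the small LoRA adapters (~1% assumed)
    base = n_params_b * 1e9 * 0.5
    adapters = n_params_b * adapter_frac * 1e9 * (2 + 2 + 4 + 4)
    return (base + adapters) / 2**30

full = full_finetune_gib(7)  # a 7B-parameter model
q = qlora_gib(7)
print(f"full 16-bit fine-tune: ~{full:.0f} GiB")
print(f"4-bit LoRA:            ~{q:.0f} GiB")
# Under these assumptions the reduction comes out to roughly 95%
print(f"reduction:             ~{100 * (1 - q / full):.0f}%")
```

The point of the sketch is that quantizing the frozen base weights to 4 bits and training only small adapters removes the dominant gradient and optimizer-state terms, which is the same mechanism behind the advertised 90% VRAM savings.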
To get started, users can access the free open-source version on GitHub and run it on Google Colab or Kaggle Notebooks. The library integrates seamlessly with existing ML workflows and requires no hardware changes to achieve performance improvements. Documentation is available at docs.unsloth.ai with comprehensive guides and model resources on Hugging Face.
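As a quick orientation, the typical entry point is Unsloth's `FastLanguageModel` class: load a 4-bit base model, then attach LoRA adapters with `get_peft_model`. The model name and hyperparameters below are illustrative choices, not recommendations, and the import is guarded since Unsloth requires a supported GPU:

```python
# Minimal fine-tuning setup sketch. The LoRA hyperparameters and model
# name are illustrative assumptions; consult docs.unsloth.ai for
# current recommendations.

lora_config = {
    "r": 16,           # LoRA rank: size of the low-rank adapter matrices
    "lora_alpha": 16,  # scaling factor applied to the adapter output
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
}

try:
    from unsloth import FastLanguageModel  # needs a CUDA-capable GPU

    # Load a 4-bit quantized base model (name assumed for illustration)
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="unsloth/llama-3-8b-bnb-4bit",
        max_seq_length=2048,
        load_in_4bit=True,
    )
    # Attach LoRA adapters; only these small matrices are trained
    model = FastLanguageModel.get_peft_model(model, **lora_config)
except ImportError:
    # Unsloth is not installed in this environment; on a GPU machine,
    # `pip install unsloth` first (or use the Colab/Kaggle notebooks).
    pass
```

From here, the returned `model` and `tokenizer` plug into a standard Hugging Face training loop, which is how the library integrates with existing ML workflows without hardware changes.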

Pricing
Free Plan Available
Free, open-source standard version of Unsloth
- Open-source
- Supports Mistral, Gemma
- Supports Llama 1, 2, 3
- Multi-GPU - coming soon
- Supports 4-bit and 16-bit LoRA
Unsloth Pro
2.5x faster training + 20% less VRAM
- 2.5x faster than FA2, scaling with the number of GPUs
- 20% less memory than OSS
- Enhanced MultiGPU support
- Supports up to 8 GPUs
- For any use case
- 80% VRAM reduction
- Supports 4-bit and 16-bit LoRA
Unsloth Enterprise
Unlock 30x faster training + multi-node support + 30% accuracy
- 32x faster than FA2, scaling with the number of GPUs
- Up to +30% accuracy
- 5x faster inference
- Supports full training
- All Pro plan features
- Multi-node support
- Customer support
- 90% VRAM reduction
- Multi-GPU and multi-node support
- Supports 4-bit and 16-bit LoRA
Capabilities
Key Features
- 30x faster training than Flash Attention 2
- 90% less memory usage
- Support for Mistral, Gemma, Llama 1, 2, 3
- 4-bit and 16-bit LoRA support
- 500K context length fine-tuning
- FP8 reinforcement learning (GRPO)
- Docker image support
- Multi-GPU support
- Multi-node support (Enterprise)
- TTS, BERT, and full fine-tuning (FFT) support
- 5x faster inference (Enterprise)
- Full training support (Enterprise)