Qwen3
Qwen3 is a family of open-weight large language models by Alibaba Cloud's Qwen team, featuring both dense and Mixture-of-Experts architectures with seamless thinking and non-thinking modes.
At a Glance
About Qwen3
Qwen3 is a series of open-weight large language models developed by the Qwen team at Alibaba Cloud, available in dense and Mixture-of-Experts (MoE) variants ranging from 0.6B to 235B parameters. The models support seamless switching between a thinking mode (for complex reasoning, math, and coding) and a non-thinking mode (for efficient general-purpose chat). Qwen3 supports 100+ languages and dialects and achieves state-of-the-art performance among open-weight models on reasoning, coding, and agent benchmarks. The latest Qwen3-2507 update extends long-context understanding to 256K tokens, extendable to 1 million tokens.
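In thinking mode, Qwen3 emits its reasoning inside a <think>...</think> block before the final answer, so downstream code typically separates the two. A minimal sketch of such a splitter (the helper name and return convention are our own, not part of any Qwen3 API):

```python
import re

def split_thinking(output: str) -> tuple[str, str]:
    """Separate a Qwen3 <think>...</think> reasoning block from the final answer.

    Returns (reasoning, answer). Reasoning is empty when the model ran in
    non-thinking mode and emitted no <think> block.
    """
    match = re.search(r"<think>(.*?)</think>", output, flags=re.DOTALL)
    if match is None:
        return "", output.strip()
    reasoning = match.group(1).strip()
    answer = output[match.end():].strip()
    return reasoning, answer
```

This keeps the chain of thought available for logging while showing users only the answer.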
- Dense and MoE model sizes: Available in 0.6B, 1.7B, 4B, 8B, 14B, 32B (dense) and 30B-A3B, 235B-A22B (MoE) to fit a wide range of hardware budgets.
- Thinking and non-thinking modes: Switch between deep reasoning mode and fast chat mode using the enable_thinking flag or /think and /no_think instructions in the prompt.
- Long-context support: Handles up to 256K tokens natively, extendable to 1 million tokens with updated Qwen3-2507 model variants.
- Multilingual capability: Supports 100+ languages and dialects with strong multilingual instruction following and translation.
- Agent and tool use: Integrates with Qwen-Agent for tool use and MCP support, enabling precise function calling in both thinking and non-thinking modes.
- Broad inference framework support: Run with Transformers, vLLM, SGLang, TensorRT-LLM, llama.cpp, Ollama, LM Studio, MLX LM, OpenVINO, ExecuTorch, and MNN for flexible local and cloud deployment.
- Finetuning support: Compatible with Axolotl, Unsloth, Swift, and LLaMA-Factory for SFT, DPO, and GRPO training workflows.
- Quantization: Supports GPTQ, AWQ, and GGUF quantization for efficient deployment on consumer hardware.
- Apache 2.0 license: All open-weight models are freely available for commercial and research use.
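The /think and /no_think soft switches above are plain text appended to a user turn, overriding the session default for that turn. A small helper illustrating the convention (the function name is our own):

```python
def with_mode(user_message: str, thinking: bool) -> str:
    """Append a Qwen3 soft switch to a user turn.

    "/think" requests thinking mode for this turn; "/no_think" disables it.
    The switch is part of the user message text itself, not an API parameter.
    """
    switch = "/think" if thinking else "/no_think"
    return f"{user_message} {switch}"
```

In a multi-turn chat, the most recent switch wins, so each turn can toggle the mode independently.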
Pricing
Open Source
All Qwen3 open-weight models are free to download and use under the Apache 2.0 license.
- All model sizes (0.6B to 235B)
- Dense and MoE architectures
- Thinking and non-thinking modes
- Apache 2.0 license
- Commercial use allowed
Capabilities
Key Features
- Dense and MoE model architectures
- Thinking and non-thinking mode switching
- 256K token long-context support (extendable to 1M)
- 100+ language and dialect support
- Agent and tool use with MCP support
- Supports vLLM, SGLang, TensorRT-LLM, llama.cpp, Ollama, LM Studio
- GPTQ, AWQ, and GGUF quantization
- Finetuning with Axolotl, Unsloth, Swift, LLaMA-Factory
- OpenAI-compatible API server
- Apache 2.0 open-weight license
