
QwenLM/Qwen3

Qwen3 is the flagship open-weight large language model series from Alibaba Cloud's Qwen team, offering one of the most comprehensive lineups in the open-source AI ecosystem. The repository serves as the central hub for a family of models spanning dense architectures (0.6B, 1.7B, 4B, 8B, 14B, and 32B parameters) and mixture-of-experts designs (30B-A3B and 235B-A22B), giving developers and researchers granular control over the compute-performance tradeoff for their specific deployment scenario.

What distinguishes Qwen3 from other open-weight model families is its hybrid thinking architecture. Every model in the series supports seamless switching between a step-by-step reasoning mode for complex logic, mathematics, and code generation, and a rapid non-thinking mode for straightforward queries. Users can configure thinking budgets to balance latency against reasoning depth, making the models adaptable to both real-time applications and offline batch processing.

Trained on approximately 36 trillion tokens, double the training corpus of Qwen2.5, the models demonstrate strong multilingual capabilities across 119 languages and dialects. Context windows range from 32K tokens on smaller models to 128K on larger variants, with experimental support extending to 1 million tokens in the Qwen3-2507 update released in August 2025. The series has evolved further with Qwen3.5, which introduced compact models from 0.8B to 9B parameters optimized for on-device deployment using a hybrid architecture that combines Gated Delta Networks with sparse MoE layers.

Qwen3 integrates natively with popular inference frameworks including vLLM, SGLang, TensorRT-LLM, llama.cpp, and Ollama, and ships with enhanced agentic capabilities for tool calling and MCP (Model Context Protocol) support. All models are released under the Apache 2.0 license and available on Hugging Face, ModelScope, and Kaggle.
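In thinking mode, Qwen3 emits its chain of thought between `<think>` and `</think>` tags before the user-facing answer, so client code typically separates the two. A minimal sketch of that post-processing step (the tag format matches Qwen3's documented output; the helper name is our own):

```python
import re

def split_thinking(text: str) -> tuple[str, str]:
    """Split a Qwen3 completion into (reasoning, answer).

    In thinking mode the model wraps its chain of thought in
    <think>...</think> before the final answer; in non-thinking
    mode the block is absent and reasoning comes back empty.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

reasoning, answer = split_thinking(
    "<think>2 + 2 equals 4.</think>\nThe answer is 4."
)
```

Production stacks such as vLLM offer built-in reasoning parsers for this, but the underlying split is the same tag-delimited structure shown here.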

Topics: models
Language: Python

Why It Matters

Qwen3 represents a pivotal shift in the open-weight model landscape because it delivers frontier-level performance without the proprietary restrictions that limit most competing models. The flagship 235B-A22B variant competes directly with DeepSeek-R1, OpenAI o1, and Gemini 2.5 Pro on coding and mathematics benchmarks, while the smaller Qwen3-4B reportedly matches the performance of Qwen2.5-72B-Instruct, a model with 18 times more parameters. This efficiency breakthrough means that serious AI applications can now run on consumer hardware rather than requiring expensive GPU clusters.

The hybrid thinking mode is particularly significant for the developer community. Rather than choosing between a fast but shallow model and a slow but thorough reasoning model, Qwen3 lets users dial reasoning depth on a per-request basis. Combined with native MCP support and strong tool-calling capabilities, this makes Qwen3 one of the most versatile foundations for building autonomous AI agents.

The rapid expansion to Qwen3.5 Small models further extends reach to edge devices, smartphones, and embedded systems, democratizing access to capable language models across the entire compute spectrum.
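Dialing reasoning depth per request amounts to capping how many reasoning tokens the model may spend before the closing `</think>` tag is forced. A toy token-level simulation of that control loop (the function and its token stream are illustrative stand-ins, not Qwen3's serving API):

```python
def apply_thinking_budget(stream, budget: int) -> list:
    """Toy model of a thinking budget: pass through at most `budget`
    reasoning tokens between <think> and </think>, force-closing the
    tag once the budget is spent and discarding surplus reasoning.

    `stream` is a list of string tokens standing in for a model's
    output; a real serving stack applies the same cutoff inside its
    decoding loop.
    """
    out, state, used = [], "answer", 0
    for tok in stream:
        if tok == "<think>":
            state = "think"
            out.append(tok)
        elif tok == "</think>":
            if state == "think":
                out.append(tok)
            # if already force-closed, the tag was emitted earlier
            state = "answer"
        elif state == "think":
            if used < budget:
                out.append(tok)
                used += 1
            else:
                out.append("</think>")  # budget spent: close reasoning
                state = "truncated"     # drop the rest of the thoughts
        elif state == "truncated":
            pass  # surplus reasoning token; discard
        else:
            out.append(tok)
    return out

capped = apply_thinking_budget(
    ["<think>", "a", "b", "c", "</think>", "answer"], budget=2
)
# reasoning is cut to two tokens; the answer passes through untouched
```

The real mechanism involves steering generation rather than filtering after the fact, but the observable effect, bounded reasoning followed by an uncut answer, is the same.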

Repository Stats

Stars
26.9k
Forks
1.9k
Last Commit
1/9/2026
