deepseek-ai/DeepSeek-V3
DeepSeek-V3 is a 671B-parameter Mixture-of-Experts language model that competes with GPT-4o and Claude Sonnet while remaining fully open-weight. Only 37B parameters are activated per token, so inference cost tracks the active count rather than the full 671B. Trained on 14.8 trillion tokens, with reasoning capability distilled from DeepSeek-R1, it leads open-source models on code, math, and multilingual benchmarks.
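A back-of-envelope sketch of why the active-parameter count, not the total, drives per-token inference cost. This assumes the common rule of thumb that a transformer forward pass costs roughly 2 FLOPs per parameter per token; it is an approximation for illustration, not a measurement of DeepSeek-V3.

```python
# Rule-of-thumb assumption: forward pass ~= 2 * params FLOPs per token.
TOTAL_PARAMS = 671e9   # all experts combined
ACTIVE_PARAMS = 37e9   # parameters actually routed per token

flops_dense = 2 * TOTAL_PARAMS   # hypothetical dense 671B model
flops_moe = 2 * ACTIVE_PARAMS    # MoE with 37B active parameters

print(f"per-token FLOPs, dense 671B:      {flops_dense:.2e}")
print(f"per-token FLOPs, MoE 37B active:  {flops_moe:.2e}")
print(f"compute ratio: {flops_dense / flops_moe:.1f}x")
```

Under this approximation, per-token compute is about 18x lower than a dense model of the same total size, which is the gap the second sentence above refers to.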
Why It Matters
For developers and researchers who can't or won't pay for closed-source API access, DeepSeek-V3 is the strongest option available. It reaches performance comparable to frontier proprietary models at a fraction of the training cost: the technical report puts the full run at about 2.79M H800 GPU-hours, reportedly an order of magnitude below estimates for comparable closed models. With 102K GitHub stars and serving support in SGLang, vLLM, LMDeploy, and TensorRT-LLM, it runs on standard GPU infrastructure. The 128K context window handles large codebases and long documents in a single pass. For teams building on open weights, this changes what's possible without enterprise API contracts.
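As a concrete starting point, a minimal serving sketch with vLLM, one of the frameworks listed above. The flags shown (tensor-parallel degree, context length) are illustrative assumptions; the right values depend on your GPU count and memory, and the full checkpoint requires a multi-GPU node.

```shell
# Serve the model behind an OpenAI-compatible HTTP endpoint.
# --tensor-parallel-size 8 assumes an 8-GPU node; adjust to your hardware.
vllm serve deepseek-ai/DeepSeek-V3 \
  --tensor-parallel-size 8 \
  --trust-remote-code \
  --max-model-len 131072

# Then query it like any OpenAI-compatible API:
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "deepseek-ai/DeepSeek-V3",
       "messages": [{"role": "user", "content": "Hello"}]}'
```

The OpenAI-compatible endpoint is what lets existing client code target the open weights with only a base-URL change.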