DeepSeek V3

DeepSeek's flagship MoE model

Chinese-developed mixture-of-experts (MoE) model with 671B total parameters, of which 37B are active per token. Scores 88.5% on MMLU at unusually low inference cost.
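
The headline parameter split reflects top-k expert routing: each token is dispatched to a small subset of experts, so only a fraction of the total weights participate in any one forward pass. The toy sketch below illustrates the mechanism only; all sizes are invented for readability and do not match DeepSeek V3's actual expert configuration.

```python
# Toy illustration of top-k MoE routing (not DeepSeek's actual code).
# Shows why "total parameters" and "active parameters" differ.
import numpy as np

rng = np.random.default_rng(0)

d_model, n_experts, top_k = 16, 8, 2          # invented sizes for readability
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts))

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route one token vector to its top-k experts and mix their outputs."""
    logits = x @ router                        # (n_experts,) routing scores
    top = np.argsort(logits)[-top_k:]          # indices of the k best experts
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()                   # softmax over selected experts only
    # Only top_k of the n_experts weight matrices are touched per token:
    # these are the "active" parameters.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(d_model)
print(moe_forward(token).shape)                # (16,)
```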

Provider: deepseek
Type: llm
Access: open_source
Params: 671B MoE (37B active)
Context: 128k
License: MIT
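
As a usage illustration, here is a minimal inference sketch assuming DeepSeek's OpenAI-compatible API. The base URL, model identifier, and placeholder key are assumptions to be checked against DeepSeek's current documentation.

```python
# Hedged sketch: calling DeepSeek V3 through an OpenAI-compatible endpoint.
# The base_url and model name are assumptions; verify against DeepSeek's docs.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",          # placeholder credential
    base_url="https://api.deepseek.com",      # assumed OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="deepseek-chat",                    # assumed identifier for V3
    messages=[{"role": "user", "content": "Summarize MoE routing in one sentence."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```

Because the weights are MIT-licensed, the same model can also be self-hosted behind any OpenAI-compatible serving stack rather than called through a vendor endpoint.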

Benchmarks (2)

Benchmark    Score    Reported by    Source
MMLU         88.5%    vendor         source ↗
MMLU-Pro     75.9%    vendor         source ↗

Why It Matters

Trained for a fraction of the cost of Western frontier models and released under the MIT license, it showed that frontier performance doesn't require frontier budgets.

Known Limitations

Potential data sourcing concerns. Availability is affected by the Chinese regulatory environment. V3.2 supersedes it on most tasks.

Released: 2024-12-26
Training cutoff: 2024-11