Creator: Alibaba
Released: 2025-07-25
Intelligence: 29.5 (Artificial Analysis Index)
Coding: 23.2 (Artificial Analysis Index)
Input price: $0.40 per 1M input tokens
Output price: $2.15 per 1M output tokens
Blended price: $0.84 per 1M tokens (3:1 blended)
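The blended figure can be reproduced from the two listed prices. The sketch below assumes that "3:1" means three parts input tokens to one part output tokens; the page does not spell this out, but the arithmetic under that assumption matches the listed $0.84. The function name `blended_price` and its parameters are illustrative only.

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: int = 3, output_parts: int = 1) -> float:
    """Weighted average price per 1M tokens for a given input:output mix.

    Assumption: the ratio is input parts to output parts (3:1 by default).
    """
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


# Prices per 1M tokens as listed above.
blended = blended_price(0.40, 2.15)
print(f"${blended:.2f} per 1M tokens")  # (3 * 0.40 + 2.15) / 4 = 0.8375
```

A workload with a different input:output mix would change the effective price, so passing other values for `input_parts` and `output_parts` gives a rough estimate for that mix.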
Speed: 64 tokens / sec
Benchmark breakdown
Independent evaluation scores, normalized to 0–100.
GPQA Diamond: 79
Humanity’s Last Exam: 15
MMLU-Pro: 84
SciCode: 42
LiveCodeBench: 79
MATH-500: 98
AIME 2025: 91
τ²-Bench (agentic): 53
Terminal-Bench Hard: 14
IFBench: 51