Creator: Alibaba
Released: 2025-07-31
Intelligence: 20.0 (Artificial Analysis Index)
Coding: 19.4 (Artificial Analysis Index)
Input price: $0.19 per 1M input tokens
Output price: $0.84 per 1M output tokens
Blended price: $0.35 per 1M tokens (3:1 blended)
Speed: 96 tokens/sec
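The blended price above can be reproduced from the input and output prices. The sketch below assumes that "3:1 blended" means a token-weighted average of three parts input to one part output; the card itself does not spell out the formula, and the function name `blended_price` and its parameters are illustrative only.

```python
def blended_price(input_per_m: float, output_per_m: float,
                  input_parts: int = 3, output_parts: int = 1) -> float:
    """Weighted average price per 1M tokens.

    Assumption: the ratio is input_parts : output_parts by token count.
    """
    total_parts = input_parts + output_parts
    return (input_parts * input_per_m + output_parts * output_per_m) / total_parts


# Prices taken from the card: $0.19 per 1M input tokens, $0.84 per 1M output tokens.
price = blended_price(0.19, 0.84)
print(f"${price:.2f} per 1M tokens")
```

Under that assumption, (3 × 0.19 + 1 × 0.84) / 4 = 0.3525, which rounds to the listed $0.35.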
Benchmark breakdown
Independent evaluation scores, normalized to 0–100.
GPQA Diamond: 52
Humanity’s Last Exam: 4
MMLU-Pro: 71
SciCode: 28
LiveCodeBench: 40
MATH-500: 89
AIME 2025: 29
τ²-Bench (agentic): 35
Terminal-Bench Hard: 15
IFBench: 33