Creator: Alibaba
Released: 2025-09-23
Intelligence: 27.6 (Artificial Analysis Index)
Coding: 20.9 (Artificial Analysis Index)
In $/1M: $0.84 (input tokens)
Out $/1M: $6.17 (output tokens)
Blended $/1M: $2.17 (3:1 blended)
Speed: 38 tokens / sec
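The blended price can be reproduced from the two per-token prices. The sketch below assumes that "3:1 blended" means a weighted average of three parts input to one part output; that reading is an assumption, although it does match the listed $2.17. The function names `blended_price` and `request_cost` are hypothetical helpers introduced for illustration only.

```python
# Prices taken from the card above, in USD per 1M tokens.
INPUT_PRICE = 0.84
OUTPUT_PRICE = 6.17


def blended_price(input_price, output_price, input_weight=3, output_weight=1):
    """Weighted average price per 1M tokens.

    Assumption: "3:1 blended" weights input tokens 3 and output tokens 1.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


def request_cost(input_tokens, output_tokens,
                 input_price=INPUT_PRICE, output_price=OUTPUT_PRICE):
    """Cost in USD of one request, given raw token counts."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    print(f"Blended $/1M: ${blended_price(INPUT_PRICE, OUTPUT_PRICE):.2f}")
    # Hypothetical workload: 3,000 input tokens and 1,000 output tokens.
    print(f"Example request: ${request_cost(3_000, 1_000):.6f}")
```

Because output tokens cost several times more than input tokens here, workloads that generate more output than the assumed 3:1 mix will pay an effective rate above the blended figure.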
Benchmark breakdown
Independent evaluation scores, normalized to 0–100.
GPQA Diamond: 77
Humanity’s Last Exam: 10
MMLU-Pro: 84
SciCode: 40
LiveCodeBench: 65
MATH-500: n/a
AIME 2025: 88
τ²-Bench (agentic): 54
Terminal-Bench Hard: 11
IFBench: 56