Creator: Alibaba
Released: 2024-09-19
Intelligence: 15.6 (Artificial Analysis Index)
Coding: 11.9 (Artificial Analysis Index)
Input price: $0.36 per 1M input tokens
Output price: $0.40 per 1M output tokens
Blended price: $0.37 per 1M tokens (3:1 blended)
Speed: 0 tokens/sec
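
The blended price above can be reproduced from the input and output prices. The sketch below assumes that "3:1 blended" means a weighted average of three parts input tokens to one part output tokens. The page does not state the formula, but the listed figures are consistent with that reading. The function name `blended_price` and its parameters are illustrative, not part of any published API.

```python
def blended_price(input_per_m: float, output_per_m: float,
                  input_weight: int = 3, output_weight: int = 1) -> float:
    """Weighted average price per 1M tokens.

    Assumption: "3:1 blended" weights input tokens 3 parts to
    output tokens 1 part; the source page does not define it.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_per_m + output_weight * output_per_m) / total_weight


# Prices listed above, in dollars per 1M tokens.
price = blended_price(0.36, 0.40)
print(f"${price:.2f} per 1M tokens")  # $0.37 per 1M tokens
```

Because input tokens carry three times the weight, the blended figure sits much closer to the input price than to the output price.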
Benchmark breakdown
Independent evaluation scores, normalized to 0–100.
GPQA Diamond: 49
Humanity’s Last Exam: 4
MMLU-Pro: 72
SciCode: 27
LiveCodeBench: 28
MATH-500: 86
AIME 2025: 14
τ²-Bench (agentic): 35
Terminal-Bench Hard: 5
IFBench: 37