MiniMax M2.5 is a frontier language model trained with reinforcement learning across hundreds of thousands of complex real-world environments, achieving state-of-the-art scores of 80.2% on SWE-Bench Verified, 51.3% on Multi-SWE-Bench, and 76.3% on BrowseComp. Building on the coding expertise of M2.1, it extends into general office productivity — generating and operating Word, Excel, and PowerPoint files, context-switching between diverse software environments, and collaborating across agent and human teams. It completes evaluations 37% faster than M2.1 while being cost-efficient enough to run continuously for $1 per hour.
API | Reasoning | Open Model | Modified MIT License
Knowledge Cutoff: Unknown
Context Memory (input → output): 197K tokens in / 66K tokens out
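The 197K-in / 66K-out context budget can be sanity-checked before a request with a rough token estimate. A minimal sketch, assuming the common ~4-characters-per-token heuristic (the model's real tokenizer will count differently, so treat this as a coarse pre-check only):

```python
# Rough context-budget check for a 197K-in / 66K-out model.
# CHARS_PER_TOKEN is a heuristic assumption, not the real tokenizer;
# use the model's actual tokenizer for exact counts.

MAX_INPUT_TOKENS = 197_000
MAX_OUTPUT_TOKENS = 66_000
CHARS_PER_TOKEN = 4  # assumed heuristic


def estimate_tokens(text: str) -> int:
    """Coarse token estimate from character count."""
    return max(1, len(text) // CHARS_PER_TOKEN)


def fits_context(prompt: str, requested_output_tokens: int) -> bool:
    """True if the prompt and requested completion fit the advertised limits."""
    return (estimate_tokens(prompt) <= MAX_INPUT_TOKENS
            and requested_output_tokens <= MAX_OUTPUT_TOKENS)


print(fits_context("hello " * 1000, 4_000))    # small prompt: fits
print(fits_context("x" * 1_000_000, 4_000))    # ~250K estimated tokens: too large
```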
AI Performance Evaluation
Arena Overall Score: 1397 ±5 (as of 2026-05-01)
Overall Rank: No. 112 (23,488 votes)
Arena by Ability
Hard Prompts: 1422 ±6 (No. 101)
Expert Knowledge: 1436 ±14 (No. 82)
Instruction Following: 1395 ±7 (No. 102)
Conversation Memory: 1406 ±9 (No. 105)
Creative: 1373 ±10 (No. 106)
Coding: 1454 ±8 (No. 97)
Math: 1407 ±15 (No. 96)
Arena by Occupation
Creative Writing: 1382 ±8 (No. 102)
Social Sciences: 1410 ±10 (No. 115)
Media: 1378 ±9 (No. 95)
Business: 1411 ±9 (No. 94)
Healthcare: 1407 ±15 (No. 123)
Legal: 1410 ±15 (No. 102)
Software: 1440 ±7 (No. 100)
Mathematics: 1409 ±17 (No. 106)
Source: Arena Intelligence
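Arena scores are Elo-style ratings, so a gap between two scores maps to an expected head-to-head win rate. A minimal sketch under the standard Elo assumption of a 400-point logistic scale (the leaderboard's actual Bradley-Terry fit may scale differently):

```python
# Expected win rate implied by an Elo-style rating gap,
# assuming the standard 400-point logistic scale.

def elo_win_probability(rating_a: float, rating_b: float) -> float:
    """Expected win rate of A over B under the standard Elo formula."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# E.g. the model's Coding ability score (1454) vs. its Creative score (1373):
# the 81-point gap corresponds to roughly a 61% expected win rate.
print(f"{elo_win_probability(1454, 1373):.3f}")
```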
Overall
AA Intelligence Index: 42% (↑3%)
LiveBench: 60% (↑0%)
Reasoning & Math
GPQA Diamond: 85% (↑3%)
HLE: 19% (↑2%)
LB Reasoning: 59% (↓10%)
LB Math: 77% (↑3%)
LB Data: 50% (↓4%)
Coding
AA Coding Index: 37% (↑1%)
LB Coding: 71% (↓2%)
LB Agentic: 52% (↑7%)
TAU2: 95% (↑15%)
TerminalBench: 35% (↑1%)
SciCode: 43% (↑1%)
Language & Instructions
IFBench: 72% (↑8%)
AA-LCR: 66% (↑4%)
LB Language: 55% (↓17%)
LB IF: 57% (↑6%)
Output Speed
Standard Mode: 77 tok/s (↑0)
Time to First Token: 27.42 s
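The $1/hour figure from the overview, combined with the 77 tok/s standard-mode decode rate, implies a rough cost per million output tokens. A back-of-the-envelope sketch, assuming sustained generation at full speed (real workloads include prefill and idle time, so effective cost per token will be higher):

```python
# Implied output-token cost from $1/hour continuous operation
# at the standard-mode decode speed of 77 tok/s.
# Assumes sustained generation at full speed, which is optimistic.

COST_PER_HOUR_USD = 1.00
TOKENS_PER_SECOND = 77

tokens_per_hour = TOKENS_PER_SECOND * 3600                      # 277,200 tokens
cost_per_million = COST_PER_HOUR_USD / tokens_per_hour * 1_000_000

print(f"{tokens_per_hour:,} tokens/hour")
print(f"${cost_per_million:.2f} per million output tokens")
```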