Trinity Large Thinking is an open-source reasoning model from Arcee AI, built on a 398B-parameter sparse Mixture-of-Experts architecture that activates approximately 13B parameters per token. Post-trained with extended chain-of-thought reasoning and agentic reinforcement learning, it achieves state-of-the-art results on agentic benchmarks including τ²-Bench (94.7%) and PinchBench (91.9%). Released under the Apache 2.0 license, it offers frontier-level tool use and multi-turn conversation, and it can be run fully locally or through a hosted API.
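To make the sparse-activation figures concrete, here is a minimal sketch of the arithmetic, using only the two parameter counts quoted above. The weight-memory estimate assumes 2 bytes per parameter (16-bit weights); that precision is an illustrative assumption, not a published deployment requirement, and the helper names are hypothetical.

```python
# Figures quoted in the description above; both are approximate.
TOTAL_PARAMS_B = 398   # total parameters, in billions
ACTIVE_PARAMS_B = 13   # parameters activated per token, in billions


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of all parameters that is used for a single token."""
    if total_b <= 0:
        raise ValueError("total_b must be positive")
    return active_b / total_b


def weight_memory_gb(total_b: float, bytes_per_param: float) -> float:
    """Rough weight footprint in decimal GB (billions of params x bytes each).

    Ignores KV cache, activations, and runtime overhead.
    """
    return total_b * bytes_per_param


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
# ASSUMPTION: 2 bytes per parameter (16-bit weights); quantized builds need less.
memory_gb = weight_memory_gb(TOTAL_PARAMS_B, bytes_per_param=2)

print(f"Active per token: {fraction:.1%}")         # about 3.3%
print(f"16-bit weights:   ~{memory_gb:.0f} GB")    # about 796 GB
```

Only about 3.3% of the weights take part in any one token, which lowers the compute per token. A local deployment still has to hold all 398B parameters, because different tokens route to different experts.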
Reasoning | Open Model | Apache 2.0
Knowledge Cutoff: 2024
Input → Output Format
Context Memory: 262K in / 262K out
AI Performance Evaluation
Arena Overall Score: 1375 ±6 (as of 2026-04-07)
Overall Rank: No. 119
Votes: 12,625
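As a reading aid for the score above, the sketch below converts a rating gap into an expected head-to-head win rate. It assumes the Arena score is an Elo-style rating on the conventional scale, where a 400-point gap corresponds to 10:1 odds. The page does not state its scoring formula, so treat this as an illustration only. The 1400-rated opponent is hypothetical.

```python
def expected_win_rate(rating_a: float, rating_b: float) -> float:
    """Expected probability that A is preferred over B.

    ASSUMPTION: ratings follow the conventional Elo logistic curve
    (base 10, scale 400). The page does not confirm this.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


TRINITY_OVERALL = 1375        # overall score listed above
HYPOTHETICAL_OPPONENT = 1400  # illustrative value, not taken from the page

p = expected_win_rate(TRINITY_OVERALL, HYPOTHETICAL_OPPONENT)
print(f"Expected win rate vs. a 1400-rated model: {p:.1%}")
```

Under this assumption a 25-point gap moves the expected win rate only a few points away from 50%, so the ±6 interval on the overall score is small relative to the gaps that separate clearly different models.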
Arena by Ability
Hard Prompts: 1400 ±7 (No. 115)
Expert Knowledge: 1414 ±20 (No. 92)
Instruction Following: 1372 ±10 (No. 112)
Conversation Memory: 1372 ±13 (No. 121)
Creative: 1357 ±14 (No. 104)
Coding: 1443 ±11 (No. 92)
Math: 1362 ±20 (No. 136)
Arena by Occupation
Creative Writing: 1358 ±11 (No. 115)
Social Sciences: 1402 ±14 (No. 110)
Media: 1355 ±13 (No. 100)
Business: 1385 ±13 (No. 107)
Healthcare: 1416 ±21 (No. 99)
Legal: 1401 ±21 (No. 98)
Software: 1425 ±9 (No. 104)
Mathematics: 1380 ±24 (No. 120)
Source: Arena Intelligence
Overall
AA Intelligence Index: 32% (↓7%)
LiveBench: 30% (↓30%)
Reasoning & Math
GPQA Diamond: 75% (↓7%)
HLE: 15% (↓3%)
LB Reasoning: 21% (↓48%)
LB Math: 45% (↓29%)
LB Data: 40% (↓13%)
Coding
AA Coding Index: 27% (↓9%)
LB Coding: 66% (↓7%)
LB Agentic: 3.3% (↓42%)
TAU2: 90% (↑10%)
TerminalBench: 23% (↓11%)
SciCode: 36% (↓6%)
Language & Instructions
IFBench: 56% (↓7%)
AA-LCR: 33% (↓29%)
Hallucination (HHEM): 6.9% (↓3%)
Factual (HHEM): 93% (↑3%)
LB Language: 42% (↓30%)
LB IF: 12% (↓39%)
Output Speed
Standard Mode: 118 tok/s (↑41)
First Output: 17.54 s