OpenAI

GPT OSS 120B

2025-08-05

GPT-OSS-120B is OpenAI's first open-weight language model since GPT-2, with 117 billion total parameters in a Mixture-of-Experts architecture that activates just 5.1 billion parameters per token. Designed to run on a single 80GB GPU using native MXFP4 quantization, it achieves near-parity with o4-mini on core reasoning benchmarks and supports configurable reasoning depth, full chain-of-thought access, and native tool use, including function calling and structured outputs. Released under the Apache 2.0 license, it offers strong reasoning and agentic capabilities in a fully customizable, locally deployable model.
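The single-GPU claim can be sanity-checked with rough arithmetic. The sketch below is only an estimate under a simplifying assumption: it treats every parameter as stored in MXFP4 (4-bit elements plus one shared 8-bit scale per block of 32 values, so about 4.25 bits per parameter). A real checkpoint may keep some layers in higher precision, and the KV cache and activations need memory on top of the weights. The helper name `weight_footprint_gb` is illustrative.

```python
# Back-of-the-envelope weight footprint for the figures quoted in the description.
# ASSUMPTION: all parameters are stored in MXFP4; real checkpoints may mix precisions.

TOTAL_PARAMS = 117e9    # total parameters (from the description)
ACTIVE_PARAMS = 5.1e9   # parameters activated per token (from the description)
GPU_MEMORY_GB = 80      # single-GPU memory budget (from the description)

# MXFP4: 4-bit elements plus one shared 8-bit scale per block of 32 elements.
BITS_PER_PARAM = 4 + 8 / 32  # 4.25 bits per parameter


def weight_footprint_gb(n_params: float, bits_per_param: float = BITS_PER_PARAM) -> float:
    """Return the approximate weight storage in gigabytes (10^9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


weights_gb = weight_footprint_gb(TOTAL_PARAMS)
headroom_gb = GPU_MEMORY_GB - weights_gb        # left for KV cache and activations
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS  # share of weights used per token

print(f"weights: ~{weights_gb:.1f} GB, headroom: ~{headroom_gb:.1f} GB")
print(f"active per token: {active_fraction:.1%} of all parameters")
```

Under these assumptions the weights take roughly 62 GB, leaving some headroom on an 80GB card, and only about 4% of the parameters take part in any one token's forward pass.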

API | Reasoning | Open Model | Apache 2.0
Knowledge Cutoff
2024-06-30
Input → Output Format
Context Memory
131K in / 131K out
Cost/1M Words
$0.039 in / $0.18 out
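As a rough illustration of how the listed prices translate into a bill, the sketch below multiplies input and output volume by the per-million rates shown above. It assumes the rates apply linearly per 1M words, as the label states, with no minimums, caching discounts, or provider-specific surcharges; the function name `estimate_cost_usd` and the example volumes are hypothetical.

```python
# Minimal cost estimate from the listed prices.
# ASSUMPTION: pricing is linear in volume, with no discounts or minimum charges.

INPUT_USD_PER_1M = 0.039   # listed input price per 1M words
OUTPUT_USD_PER_1M = 0.18   # listed output price per 1M words


def estimate_cost_usd(input_words: int, output_words: int) -> float:
    """Return the estimated cost in USD for the given input and output volumes."""
    input_cost = input_words / 1_000_000 * INPUT_USD_PER_1M
    output_cost = output_words / 1_000_000 * OUTPUT_USD_PER_1M
    return input_cost + output_cost


# Hypothetical workload: 250,000 words in, 50,000 words out.
example_cost = estimate_cost_usd(250_000, 50_000)
print(f"estimated cost: ${example_cost:.5f}")
```

Because output is priced at more than four times the input rate, long generations dominate the bill even when prompts are much larger than responses.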

AI Performance Evaluation

Arena Overall Score
1353 ±4 (as of 2026-05-01)
Overall Rank
No. 160 (30,670 votes)
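One way to read an Arena score is as an expected head-to-head win rate. The sketch below assumes the score sits on an Elo-style scale (logistic curve, base 10, 400-point divisor), which this page does not state explicitly, and it ignores ties. The opponent scores are hypothetical values chosen for illustration, not entries from the leaderboard.

```python
# Expected win rate between two models on an Elo-style scale.
# ASSUMPTION: Arena scores follow the standard 400-point, base-10 logistic model.


def expected_win_rate(score_a: float, score_b: float) -> float:
    """Return the modelled probability that model A is preferred over model B."""
    return 1.0 / (1.0 + 10 ** ((score_b - score_a) / 400))


GPT_OSS_120B_SCORE = 1353  # overall Arena score listed above

# Hypothetical opponents 100 points below and 100 points above.
vs_weaker = expected_win_rate(GPT_OSS_120B_SCORE, 1253)
vs_stronger = expected_win_rate(GPT_OSS_120B_SCORE, 1453)

print(f"vs. a 1253-rated model: {vs_weaker:.1%}")
print(f"vs. a 1453-rated model: {vs_stronger:.1%}")
```

Under this model a 100-point gap corresponds to roughly a 64/36 split in preferences, so the ±4 interval on the overall score is small relative to the gaps that separate clearly different models.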
Arena by Ability
Hard Prompts: 1362 ±6 (No. 165)
Expert Knowledge: 1360 ±17 (No. 156)
Instruction Following: 1326 ±7 (No. 172)
Conversation Memory: 1328 ±9 (No. 180)
Creative: 1279 ±10 (No. 212)
Coding: 1390 ±8 (No. 163)
Math: 1383 ±14 (No. 133)
Arena by Occupation
Creative Writing: 1310 ±8 (No. 186)
Social Sciences: 1361 ±9 (No. 169)
Media: 1287 ±8 (No. 193)
Business: 1350 ±8 (No. 163)
Healthcare: 1369 ±15 (No. 159)
Legal: 1345 ±14 (No. 179)
Software: 1386 ±6 (No. 162)
Mathematics: 1384 ±15 (No. 134)
Overall
AA Intelligence Index: 25% (↓15%)
LiveBench: 46% (↓14%)
Reasoning & Math
AA Math Index: 67% (↓8%)
GPQA Diamond: 67% (↓15%)
HLE: 5.2% (↓12%)
MMLU-Pro: 78% (↓4%)
AIME 2025: 67% (↓8%)
LB Reasoning: 39% (↓30%)
LB Math: 69% (↓5%)
LB Data: 39% (↓14%)
Coding
AA Coding Index: 16% (↓21%)
LiveCodeBench: 71% (↑5%)
LB Coding: 60% (↓13%)
LB Agentic: 17% (↓28%)
TAU2: 45% (↓35%)
TerminalBench: 5.3% (↓29%)
SciCode: 36% (↓6%)
Language & Instructions
IFBench: 58% (↓5%)
AA-LCR: 44% (↓18%)
Hallucination (HHEM): 14% (↑4%)
Factual (HHEM): 86% (↓4%)
LB Language: 49% (↓24%)
LB IF: 50% (↓1%)
Output Speed
Standard Mode: 86 tok/s (↑9), first output 0.48 s
Reasoning Mode: 233 tok/s (↑146), first output 9.09 s
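The throughput and first-output figures above can be combined into a rough end-to-end response time. The sketch below assumes generation runs at the listed constant rate once the first output arrives, and that both modes emit the same number of tokens. In practice, reasoning mode also spends tokens on its chain of thought, so the comparison is only indicative. The function names are illustrative.

```python
# Rough response-time model: first-output latency plus steady-state generation.
# ASSUMPTION: constant throughput after the first output; equal token counts per mode.

STANDARD = {"first_output_s": 0.48, "tokens_per_s": 86}    # listed standard-mode figures
REASONING = {"first_output_s": 9.09, "tokens_per_s": 233}  # listed reasoning-mode figures


def response_time_s(n_tokens: int, first_output_s: float, tokens_per_s: float) -> float:
    """Return the estimated seconds needed to deliver n_tokens of output."""
    return first_output_s + n_tokens / tokens_per_s


def break_even_tokens(slow_start: dict, fast_start: dict) -> float:
    """Return the output length at which both modes finish at the same time."""
    latency_gap = slow_start["first_output_s"] - fast_start["first_output_s"]
    per_token_gain = 1 / fast_start["tokens_per_s"] - 1 / slow_start["tokens_per_s"]
    return latency_gap / per_token_gain


standard_1k = response_time_s(1000, **STANDARD)
reasoning_1k = response_time_s(1000, **REASONING)
crossover = break_even_tokens(REASONING, STANDARD)

print(f"1,000 tokens: standard ~{standard_1k:.1f} s, reasoning ~{reasoning_1k:.1f} s")
print(f"break-even output length: ~{crossover:.0f} tokens")
```

With these figures the higher reasoning-mode throughput only offsets its longer start-up delay for outputs beyond roughly 1,200 tokens.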