GPT-5.4 is OpenAI's latest frontier model, released in March 2026, and unifies the Codex and GPT product lines into a single system. It features a context window of more than 1M tokens, native computer-use capabilities, and industry-leading coding performance inherited from GPT-5.3-Codex. The model is significantly more token-efficient than GPT-5.2 and achieves state-of-the-art results on knowledge-work benchmarks, matching or exceeding industry professionals in 83% of comparisons across 44 occupations. It excels at agentic coding, document understanding, tool use, and complex multi-step workflows.
OpenAI Plus, OpenAI Pro, API | Vision, Reasoning, Web Search, File | Proprietary Model
Knowledge Cutoff: 2025-08-31
Input → Output Format
Context Memory: 1.1M in / 128K out
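The context figures above (1.1M tokens in, 128K tokens out) can be turned into a simple pre-flight check before sending a request. The following is a minimal sketch, not an official API: the function name `fits_context` is hypothetical, the exact integer limits are an assumption (reading "1.1M" as 1,100,000 and "128K" as 128,000), and it assumes the input and output limits apply independently.

```python
# Limits as listed on this page. The exact integer values are an assumption:
# "1.1M" is read as 1,100,000 and "128K" as 128,000.
MAX_INPUT_TOKENS = 1_100_000
MAX_OUTPUT_TOKENS = 128_000


def fits_context(input_tokens: int, requested_output_tokens: int) -> bool:
    """Return True if both token counts are within the listed limits.

    Assumes the input and output limits are checked independently.
    """
    if input_tokens < 0 or requested_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (
        input_tokens <= MAX_INPUT_TOKENS
        and requested_output_tokens <= MAX_OUTPUT_TOKENS
    )


print(fits_context(900_000, 32_000))    # both counts within the listed limits
print(fits_context(1_200_000, 32_000))  # input exceeds the listed 1.1M
```

Token counts depend on the tokenizer, so a check like this is only as accurate as the count passed in.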
AI Performance Evaluation
Arena Overall Score: 1477 ±5 (as of 2026-05-01)
Overall Rank: No. 11
Votes: 15,853
Arena by Ability
Hard Prompts: 1502 ±7, No. 9
Expert Knowledge: 1524 ±17, No. 6
Instruction Following: 1480 ±9, No. 8
Conversation Memory: 1497 ±11, No. 7
Creative: 1444 ±13, No. 22
Coding: 1527 ±10, No. 8
Math: 1514 ±18, 🥇 No. 1
Arena by Occupation
Creative Writing: 1467 ±10, No. 8
Social Sciences: 1480 ±12, No. 30
Media: 1448 ±12, No. 15
Business: 1483 ±11, No. 10
Healthcare: 1471 ±19, No. 42
Legal: 1476 ±18, No. 26
Software: 1510 ±8, No. 16
Mathematics: 1516 ±20, No. 5
Source: Arena Intelligence
Overall
AA Intelligence Index: 57% (↑18%)
LiveBench: 81% (↑20%)
ForecastBench: 59% (↓1%)
Reasoning & Math
GPQA Diamond: 92% (↑10%)
HLE: 42% (↑24%)
LB Reasoning: 88% (↑19%)
LB Math: 94% (↑20%)
LB Data: 79% (↑26%)
Coding
AA Coding Index: 57% (↑21%)
LB Coding: 78% (↑5%)
LB Agentic: 70% (↑25%)
TAU2: 87% (↑7%)
TerminalBench: 58% (↑23%)
SciCode: 57% (↑15%)
Language & Instructions
IFBench: 74% (↑11%)
AA-LCR: 74% (↑12%)
Hallucination (HHEM): 7.0% (↓3%)
Factual (HHEM): 93% (↑3%)
LB Language: 83% (↑10%)
LB IF: 70% (↑19%)
Output Speed
Standard Mode: 155 tok/s (↑78), first output 0.49 s
Reasoning Mode: 158 tok/s (↑71), first output 3.64 s
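The speed figures above can be combined into a rough end-to-end estimate for a response of a given length. This is a back-of-the-envelope sketch under stated assumptions: "first output" is treated as the delay before the first token, throughput is treated as constant after that, and the helper name `estimate_response_seconds` is hypothetical. Real latency varies with load, prompt length, and network conditions.

```python
def estimate_response_seconds(
    output_tokens: int,
    tokens_per_second: float,
    first_output_seconds: float,
) -> float:
    """Estimate total time as first-output delay plus steady-state generation time."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return first_output_seconds + output_tokens / tokens_per_second


# Figures as listed on this page.
STANDARD = {"tokens_per_second": 155.0, "first_output_seconds": 0.49}
REASONING = {"tokens_per_second": 158.0, "first_output_seconds": 3.64}

for name, mode in (("standard", STANDARD), ("reasoning", REASONING)):
    seconds = estimate_response_seconds(1_000, **mode)
    print(f"{name}: about {seconds:.1f} s for 1,000 output tokens")
```

Because the two listed throughputs are nearly equal, the estimated gap between the modes at typical response lengths comes almost entirely from the first-output delay.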
Multilingual Capabilities
MGSM 🇰🇷: 94%
MGSM 🇯🇵: 92%
KMMLU 🇰🇷: 77%
JMMLU 🇯🇵: 75%