OpenAI

GPT-5.4

Name: OpenAI GPT-5.4
Author: OpenAI

Model ID: gpt-5.4-2026-03-05

2026-03-05

GPT-5.4 is OpenAI's latest frontier model, released in March 2026, and unifies the Codex and GPT product lines into a single system. It features a 1M+ token context window, native computer-use capabilities, and industry-leading coding performance inherited from GPT-5.3-Codex. The model is significantly more token-efficient than GPT-5.2 and achieves state-of-the-art results on knowledge work benchmarks, matching or exceeding industry professionals in 83% of comparisons across 44 occupations. It excels at agentic coding, document understanding, tool use, and complex multi-step workflows.

OpenAI Plus, OpenAI Pro, API | Vision, Reasoning, Web Search, File | Proprietary Model

Knowledge Cutoff

2025-08-31

The date this AI finished learning. It may not know about things that happened after this date.

Input → Output Format

The types of content this AI can receive, and what it can produce in return.

Context Memory

1.1M in, 128K out (tokens)

The maximum amount of text the AI can read and process in a single request. A larger number means it can handle longer documents or conversations.
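As a rough illustration of the limits above, the following sketch checks whether a request stays within the listed figures. The exact token counts behind the rounded "1.1M" and "128K" values are not given here, so the constants are assumptions, as are the hypothetical function name `fits_limits` and the choice to treat input and output as two independent limits.

```python
# Assumed exact values for the rounded figures in the listing ("1.1M" in, "128K" out).
MAX_INPUT_TOKENS = 1_100_000
MAX_OUTPUT_TOKENS = 128_000


def fits_limits(input_tokens: int, requested_output_tokens: int) -> bool:
    """Return True if both token counts are within the assumed limits.

    Input and output are checked separately; whether the real service
    shares one window between them is not stated in the listing.
    """
    if input_tokens < 0 or requested_output_tokens < 0:
        return False
    return (
        input_tokens <= MAX_INPUT_TOKENS
        and requested_output_tokens <= MAX_OUTPUT_TOKENS
    )


if __name__ == "__main__":
    print(fits_limits(900_000, 16_000))    # within both assumed limits
    print(fits_limits(1_500_000, 16_000))  # input exceeds the assumed limit
```

Token counts depend on the tokenizer, so in practice the inputs to a check like this would come from a tokenizer or from the usage figures an API returns.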

Cost/1M Tokens

$2.50 in, $15 out

The cost of using this AI directly in your own application. Shown in USD per 1 million units of text (tokens).
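The per-token pricing can be turned into a simple estimate. The sketch below uses the listed prices ($2.50 per 1 million input tokens, $15 per 1 million output tokens); the function name `estimate_cost_usd` is hypothetical, and the calculation assumes flat pricing with no caching discounts, tiers, or other adjustments, none of which are described here.

```python
# Listed prices in USD per 1 million tokens.
INPUT_USD_PER_MILLION = 2.50
OUTPUT_USD_PER_MILLION = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request under flat per-token pricing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # 200,000 input tokens and 10,000 output tokens:
    # 0.50 USD for input plus 0.15 USD for output.
    print(f"${estimate_cost_usd(200_000, 10_000):.2f}")  # prints "$0.65"
```

Because output tokens cost six times as much as input tokens at these rates, long generated responses tend to dominate the bill more quickly than long prompts.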

Source: Official Docs, OpenAI GPT-5 Blog, LMSYS Chatbot Arena, OpenRouter

AI Performance Evaluation

Arena Overall Score

1477 ±5

As of 2026-05-01

Overall Rank

No. 11

15,853 Votes

Arena by Ability

Hard Prompts: 1502 ±7, No. 9
Expert Knowledge: 1524 ±17, No. 6
Instruction Following: 1480 ±9, No. 8
Conversation Memory: 1497 ±11, No. 7
Creative: 1444 ±13, No. 22
Coding: 1527 ±10, No. 8
Math: 1514 ±18, 🥇 No. 1

Arena by Occupation

Creative Writing: 1467 ±10, No. 8
Social Sciences: 1480 ±12, No. 30
Media: 1448 ±12, No. 15
Business: 1483 ±11, No. 10
Healthcare: 1471 ±19, No. 42
Legal: 1476 ±18, No. 26
Software: 1510 ±8, No. 16
Mathematics: 1516 ±20, No. 5

Source: Arena Intelligence

Overall

AA Intelligence Index: 57% ↑18%
LiveBench: 81% ↑20%
ForecastBench: 59% ↓1%

Reasoning & Math

GPQA Diamond: 92% ↑10%
HLE: 42% ↑24%
LB Reasoning: 88% ↑19%
LB Math: 94% ↑20%
LB Data: 79% ↑26%

Coding

AA Coding Index: 57% ↑21%
LB Coding: 78% ↑5%
LB Agentic: 70% ↑25%
TAU2: 87% ↑7%
TerminalBench: 58% ↑23%
SciCode: 57% ↑15%

Language & Instructions

IFBench: 74% ↑11%
AA-LCR: 74% ↑12%
Hallucination (HHEM): 7.0% ↓3%
Factual (HHEM): 93% ↑3%
LB Language: 83% ↑10%
LB IF: 70% ↑19%

Output Speed

Standard Mode: 155 tok/s ↑78, First Output 0.49s
Reasoning Mode: 158 tok/s ↑71, First Output 3.64s
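The speed figures can be combined into a back-of-the-envelope response time. The sketch below assumes that "First Output" is the delay before the first token arrives and that tokens then stream at the listed rate without variation; both are simplifying assumptions, and the function name `estimate_seconds` is hypothetical.

```python
# Listed figures: (tokens per second, seconds until first output).
STANDARD_MODE = (155.0, 0.49)
REASONING_MODE = (158.0, 3.64)


def estimate_seconds(output_tokens: int, mode: tuple[float, float]) -> float:
    """Estimate total response time as first-output delay plus streaming time."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    tokens_per_second, first_output_seconds = mode
    return first_output_seconds + output_tokens / tokens_per_second


if __name__ == "__main__":
    for name, mode in (("standard", STANDARD_MODE), ("reasoning", REASONING_MODE)):
        print(f"{name}: {estimate_seconds(1_000, mode):.1f} s for 1,000 tokens")
```

Under these assumptions the two modes stream at nearly the same rate, so the difference in total time comes almost entirely from the longer wait for the first output in reasoning mode.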

Source: Artificial Analysis, LiveBench, ForecastBench, Vectara HHEM

Multilingual Capabilities

MGSM 🇰🇷: 94%
MGSM 🇯🇵: 92%
KMMLU 🇰🇷: 77%
JMMLU 🇯🇵: 75%
