OpenAI

GPT-4.1

Name: OpenAI GPT-4.1
Author: OpenAI

Model ID: gpt-4.1-2025-04-14

2025-04-14

GPT-4.1 is OpenAI's flagship language model optimized for coding, instruction following, and long-context reasoning, released in April 2025. It supports a 1-million-token context window — over 8× the capacity of GPT-4o — and achieves 54.6% on SWE-bench Verified, representing a major improvement in real-world software engineering tasks. The model excels at precise code diffs, agent reliability, and high recall across large document contexts, making it well-suited for IDE tooling, automated coding agents, and enterprise knowledge retrieval.

API | Vision, Web Search, File | Proprietary Model

Knowledge Cutoff

2024-06-30

The date this AI finished learning. It may not know about things that happened after this date.

Input → Output Format

The types of content this AI can receive, and what it can produce in return.

Context Memory

1.0M IN / 33K OUT

The maximum amount of text the AI can read and process in a single request. A larger number means it can handle longer documents or conversations.
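As a rough illustration of how these limits apply to a single request, the sketch below checks a prompt and a requested completion length against them. It assumes the rounded figures shown on this page (1.0M tokens in, 33K tokens out) and treats the two limits independently. The function name `fits_context` and both constants are hypothetical, and the exact limits enforced by the API may differ.

```python
# Limits are the rounded figures from this page (1.0M in, 33K out).
# They are assumptions for illustration, not exact API limits.
MAX_INPUT_TOKENS = 1_000_000
MAX_OUTPUT_TOKENS = 33_000


def fits_context(prompt_tokens: int, requested_output_tokens: int) -> bool:
    """Return True if both token counts are within the listed limits."""
    if prompt_tokens < 0 or requested_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (
        prompt_tokens <= MAX_INPUT_TOKENS
        and requested_output_tokens <= MAX_OUTPUT_TOKENS
    )


print(fits_context(250_000, 4_000))    # True
print(fits_context(1_200_000, 4_000))  # False
```

Token counts have to come from a tokenizer; the sketch takes them as plain integers so it stays independent of any particular library.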

Cost/1M Tokens

$2 IN / $8 OUT

The cost of using this AI directly in your own application. Shown in USD per 1 million units of text (tokens).
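With the listed prices of $2 per 1 million input tokens and $8 per 1 million output tokens, the cost of a request can be estimated with simple arithmetic. The following is a minimal sketch. The function name `estimate_cost_usd` and its parameters are hypothetical, and the default prices are the figures from this page, which may change.

```python
def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float = 2.0,   # USD, as listed on this page
    output_price_per_million: float = 8.0,  # USD, as listed on this page
) -> float:
    """Estimate the USD cost of one request from its token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * input_price_per_million
        + output_tokens * output_price_per_million
    )
    return total / 1_000_000


# One million tokens in and one million tokens out: 2 + 8 = 10 USD.
print(estimate_cost_usd(1_000_000, 1_000_000))  # 10.0
```

Because output tokens cost four times as much as input tokens at these prices, long completions tend to dominate the bill even when prompts are large.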

Source: Official Docs, OpenRouter

AI Performance Evaluation

Arena Overall Score: 1312 ±4 (as of 2026-05-01)
Overall Rank: No. 216 (100,105 votes)

Arena by Ability

Hard Prompts: 1311 ±6 (No. 222)
Expert Knowledge: 1286 ±12 (No. 215)
Instruction Following: 1294 ±6 (No. 213)
Conversation Memory: 1298 ±8 (No. 215)
Creative: 1285 ±8 (No. 203)
Coding: 1338 ±7 (No. 223)
Math: 1303 ±8 (No. 192)

Arena by Occupation

Creative Writing: 1306 ±6 (No. 197)
Social Sciences: 1321 ±8 (No. 220)
Media: 1290 ±8 (No. 191)
Business: 1282 ±9 (No. 235)
Healthcare: 1305 ±12 (No. 220)
Legal: 1317 ±11 (No. 223)
Software: 1324 ±6 (No. 230)
Mathematics: 1308 ±8 (No. 194)

Source: Arena Intelligence

Overall

AA Intelligence Index: 26% (↓13%)
ForecastBench: 59% (↑0%)

Reasoning & Math

AA Math Index: 35% (↓40%)
GPQA Diamond: 67% (↓16%)
HLE: 4.6% (↓13%)
MMLU-Pro: 81% (↓1%)
AIME 2025: 35% (↓40%)
MATH-500: 91% (↓2%)

Coding

AA Coding Index: 22% (↓15%)
LiveCodeBench: 46% (↓20%)
TAU2: 47% (↓33%)
TerminalBench: 14% (↓20%)
SciCode: 38% (↓4%)

Language & Instructions

IFBench: 43% (↓20%)
AA-LCR: 61% (↓1%)
Hallucination (HHEM): 5.6% (↓5%)
Factual (HHEM): 94% (↑4%)

Output Speed

Standard Mode: 111 tok/s (↑34)
First Output: 0.57s
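The two speed figures above can be combined into a rough end-to-end estimate: time to first output plus the output length divided by throughput. The sketch below assumes throughput stays constant for the whole response, which is a simplification. The function name `estimate_response_seconds` and its parameters are hypothetical, and the defaults are the values listed on this page.

```python
def estimate_response_seconds(
    output_tokens: int,
    tokens_per_second: float = 111.0,    # output speed listed on this page
    first_output_seconds: float = 0.57,  # time to first output listed on this page
) -> float:
    """Estimate total response time, assuming constant throughput."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return first_output_seconds + output_tokens / tokens_per_second


# 1,110 tokens at 111 tok/s take 10 s, plus 0.57 s before the first token.
print(f"{estimate_response_seconds(1110):.2f} s")  # 10.57 s
```

Measured speeds vary with provider, load, and prompt length, so this is a planning estimate rather than a guarantee.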

Source: Artificial Analysis, ForecastBench, Vectara HHEM
