Anthropic

Claude Opus 4.1

2025-08-05

Claude Opus 4.1 is an updated version of Anthropic's flagship model released in August 2025, offering improved performance in coding, reasoning, and agentic tasks. It achieves 74.5% on SWE-bench Verified and shows notable gains in multi-file code refactoring, debugging precision, and detail-oriented reasoning. The model supports extended thinking up to 64K tokens and is optimized for tasks involving research, data analysis, and tool-assisted reasoning workflows.
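The extended-thinking support described above is exposed through the Messages API's `thinking` parameter. A minimal sketch of a request payload, assuming the Anthropic Python SDK and the `claude-opus-4-1-20250805` model id; the token budgets below are illustrative, not recommendations:

```python
# Sketch of an extended-thinking request payload for the Messages API.
# Model id and token budgets are assumptions based on the card above.
payload = {
    "model": "claude-opus-4-1-20250805",
    "max_tokens": 16_000,
    "thinking": {
        "type": "enabled",
        "budget_tokens": 8_000,  # thinking budget; must be below max_tokens
    },
    "messages": [
        {"role": "user", "content": "Refactor this module and explain the changes."},
    ],
}

# With the Anthropic SDK this would be sent as:
#   client = anthropic.Anthropic()
#   response = client.messages.create(**payload)
print(payload["thinking"]["budget_tokens"])
```

The thinking budget counts against the output allowance, which is why the sketch keeps `budget_tokens` well under `max_tokens`.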

API | Vision | Reasoning | Web Search | File | Proprietary Model
Knowledge Cutoff
2025-01-31
Input → Output Format
Context Memory
200K in / 32K out
Cost / 1M Tokens
$15 in / $75 out
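The listed rates translate directly into a per-request cost estimate; a minimal sketch in Python, using the $15 per 1M input tokens and $75 per 1M output tokens shown above (the example token counts are illustrative):

```python
# Estimate API cost from token counts, using the rates listed above:
# $15 per 1M input tokens, $75 per 1M output tokens.
INPUT_RATE = 15.00   # USD per 1M input tokens
OUTPUT_RATE = 75.00  # USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated request cost in USD."""
    return (input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE) / 1_000_000

# Example: a 10K-token prompt with a 2K-token reply.
cost = estimate_cost(10_000, 2_000)
print(f"${cost:.2f}")  # → $0.30
```

Note that output tokens cost five times as much as input tokens, so long completions (and extended-thinking budgets, which count as output) dominate the bill.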

AI Performance Evaluation

Arena Overall Score
1449 ± 3 (as of 2026-05-01)
Overall Rank
No. 43 (49,853 votes)
Arena by Ability
Hard Prompts
1480 ± 5 · No. 31
Expert Knowledge
1482 ± 12 · No. 33
Instruction Following
1459 ± 6 · No. 20
Conversation Memory
1473 ± 7 · No. 28
Creative
1444 ± 8 · No. 20
Coding
1513 ± 7 · No. 22
Math
1443 ± 11 · No. 43
Arena by Occupation
Creative Writing
1445 ± 6 · No. 31
Social Sciences
1471 ± 7 · No. 35
Media
1433 ± 7 · No. 28
Business
1448 ± 7 · No. 38
Healthcare
1479 ± 12 · No. 30
Legal
1463 ± 11 · No. 35
Software
1492 ± 5 · No. 34
Mathematics
1449 ± 12 · No. 44
Overall
AA Intelligence Index
42% (↑3%)
LiveBench
61% (↑1%)
ForecastBench
60% (↑1%)
Reasoning & Math
AA Math Index
80% (↑6%)
GPQA Diamond
81% (↓1%)
HLE
12% (↓6%)
MMLU-Pro
88% (↑7%)
AIME 2025
80% (↑6%)
LB Reasoning
72% (↑3%)
LB Math
73% (↓1%)
LB Data
49% (↓4%)
Coding
AA Coding Index
37% (↑0%)
LiveCodeBench
65% (↑0%)
LB Coding
75% (↑2%)
LB Agentic
48% (↑3%)
TAU2
71% (↓9%)
TerminalBench
34% (↑0%)
SciCode
41% (↓1%)
Language & Instructions
IFBench
55% (↓8%)
AA-LCR
66% (↑4%)
Hallucination (HHEM)
12% (↑2%)
Factual (HHEM)
88% (↓2%)
LB Language
73% (↑0%)
LB IF
42% (↓9%)
Output Speed
Standard Mode
34 tok/s (↓44)
First Output: 1.33 s
Reasoning Mode
36 tok/s (↓50)
First Output: 10.14 s