Claude Sonnet 4 is Anthropic's balanced mid-tier model, released alongside Opus 4 in May 2025 and designed to combine strong coding and reasoning capabilities with computational efficiency. It achieves a state-of-the-art 72.7% on SWE-bench while costing significantly less and responding faster than the Opus models. Key strengths include autonomous codebase navigation, reduced error rates in agent-driven workflows, and reliable adherence to intricate instructions, making it a versatile choice for both routine and complex development tasks.
API | Vision | Reasoning | Web Search | File | Proprietary Model
Knowledge Cutoff: 2025-01-31
Context Memory: 1M tokens in / 64K tokens out
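Since the capability list above includes API access, here is a minimal sketch of how a request body for this model might be assembled, assuming the publicly documented Anthropic Messages API shape; the model ID string `claude-sonnet-4-20250514` is an assumption and should be checked against Anthropic's current model list:

```python
# Sketch: assemble (without sending) a Messages API request body.
# The model ID below is an assumption, not confirmed by this page.
def build_request(prompt: str, max_tokens: int = 1024) -> dict:
    """Return a Messages API payload for a single user prompt."""
    return {
        "model": "claude-sonnet-4-20250514",  # assumed model ID
        "max_tokens": max_tokens,             # bounded by the 64K output limit
        "messages": [{"role": "user", "content": prompt}],
    }

req = build_request("Summarize this diff.", max_tokens=2048)
print(req["model"], req["max_tokens"])
```

The payload maps directly onto the limits listed above: the prompt draws on the 1M-token input context, while `max_tokens` caps generation at or below the 64K output limit.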
AI Performance Evaluation
Arena Overall Score: 1399 ±4 (as of 2026-05-01)
Overall Rank: No. 109 (35,139 votes)
Arena by Ability
Hard Prompts: 1431 ±6 (No. 93)
Expert Knowledge: 1433 ±15 (No. 87)
Instruction Following: 1414 ±7 (No. 75)
Conversation Memory: 1420 ±8 (No. 82)
Creative: 1395 ±9 (No. 70)
Coding: 1473 ±8 (No. 67)
Math: 1402 ±13 (No. 103)
Arena by Occupation
Creative Writing: 1397 ±7 (No. 85)
Social Sciences: 1418 ±8 (No. 105)
Media: 1389 ±8 (No. 83)
Business: 1384 ±8 (No. 125)
Healthcare: 1419 ±13 (No. 112)
Legal: 1409 ±13 (No. 103)
Software: 1443 ±6 (No. 95)
Mathematics: 1410 ±13 (No. 103)
Source: Arena Intelligence
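The arena scores above are Elo-style ratings, so the gap between two scores maps to an expected head-to-head win rate. A minimal sketch, assuming the standard Elo formula with a 400-point scale (an assumption about how this leaderboard computes ratings):

```python
def win_probability(r_a: float, r_b: float) -> float:
    """Expected win rate of a rating r_a over r_b under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

# Example: the Coding ability score (1473) against the Math score (1402).
p = win_probability(1473, 1402)
print(round(p, 3))  # about 0.601
```

Under this model, a ~70-point gap corresponds to roughly a 60/40 split, which is why the ±4 to ±15 confidence intervals matter when comparing closely ranked models.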
Overall
AA Intelligence Index: 39% (↑0%)
LiveBench: 61% (↑0%)
ForecastBench: 59% (↑0%)
Reasoning & Math
AA Math Index: 74% (↑0%)
GPQA Diamond: 78% (↓4%)
HLE: 9.6% (↓8%)
MMLU-Pro: 84% (↑3%)
AIME 2025: 74% (↑0%)
MATH-500: 99% (↑6%)
LB Reasoning: 69% (↑0%)
LB Math: 71% (↓4%)
LB Data: 55% (↑1%)
Coding
AA Coding Index: 34% (↓2%)
LiveCodeBench: 66% (↑0%)
LB Coding: 77% (↑5%)
LB Agentic: 40% (↓5%)
TAU2: 65% (↓16%)
TerminalBench: 31% (↓3%)
SciCode: 40% (↓2%)
Language & Instructions
IFBench: 55% (↓8%)
AA-LCR: 65% (↑3%)
Hallucination (HHEM): 10% (↑0%)
Factual (HHEM): 90% (↑0%)
LB Language: 73% (↑1%)
LB IF: 44% (↓7%)
Output Speed
Standard Mode: 45 tok/s (↓32), first output in 0.80 s
Reasoning Mode: 49 tok/s (↓38), first output in 10.65 s
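The two speed figures combine into a rough wall-clock estimate: time to first token plus steady-state decoding time. A small sketch using the numbers above (a simplification that ignores variability in decoding speed):

```python
def generation_time(tokens: int, tok_per_s: float, first_token_s: float) -> float:
    """Rough wall-clock time: time to first token plus steady-state decoding."""
    return first_token_s + tokens / tok_per_s

# 1,000 output tokens in standard vs. reasoning mode, per the figures above.
standard = generation_time(1000, 45, 0.80)
reasoning = generation_time(1000, 49, 10.65)
print(round(standard, 1), round(reasoning, 1))
```

Note that reasoning mode's slightly higher throughput does not offset its ~10 s first-token latency until well past a thousand output tokens.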