xAI

Grok 4.20 (Reasoning)

Name: xAI Grok 4.20 (Reasoning)
Author: xAI

Model ID: grok-4.20-0309-reasoning

2026-03-31

Grok 4.20 (Reasoning) is the reasoning-enabled configuration of xAI's Grok 4.20. It uses extended internal thinking to work through a problem before presenting an answer. Combined with the model's native multi-agent architecture and cross-agent verification, this gives it the highest accuracy in the Grok lineup on tasks that require deep logic, mathematical reasoning, and complex multi-step problem solving. It keeps the same 2M-token context window and strict prompt adherence as Grok 4.20, along with the industry's lowest hallucination rate in its class.

xAI SuperGrok Heavy, API | Vision, Reasoning, Web Search, File | Proprietary Model

Knowledge Cutoff

Unknown

The date this AI finished learning. It may not know about things that happened after this date.

Input → Output Format

The types of content this AI can receive, and what it can produce in return.

Context Memory

2M IN / 2M OUT

The maximum amount of text the AI can read and process in a single request. A larger number means it can handle longer documents or conversations.

Cost/1M Tokens

$1.25 IN / $2.50 OUT

The cost of using this AI directly in your own application. Shown in USD per 1 million units of text (tokens).
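As a rough illustration of how the per-million-token prices translate into a per-request cost, here is a minimal sketch. It assumes the listed figures ($1.25 input, $2.50 output) are USD per 1 million tokens, as the description above states. The function name `estimate_cost` and the example token counts are hypothetical, and the listing does not say how reasoning tokens are billed, so they are not modeled separately here.

```python
# Prices in USD per 1M tokens, copied from the listing above.
INPUT_PRICE_PER_M = 1.25
OUTPUT_PRICE_PER_M = 2.50


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Example: a 100,000-token prompt with a 5,000-token answer.
print(f"${estimate_cost(100_000, 5_000):.4f}")  # prints $0.1375
```

Actual charges may differ if the provider applies caching discounts, tiered pricing, or separate billing for reasoning tokens; check the official pricing page before budgeting.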

Source: Official Docs, OpenRouter

AI Performance Evaluation

Arena Overall Score

1480 ±5

As of 2026-05-01

Overall Rank

No. 9

17,413 Votes

Arena by Ability

Hard Prompts: 1494 ±6, No. 15
Expert Knowledge: 1473 ±16, No. 42
Instruction Following: 1455 ±8, No. 25
Conversation Memory: 1494 ±12, No. 10
Creative: 1467 ±12, No. 8
Coding: 1511 ±9, No. 24
Math: 1461 ±17, No. 29

Arena by Occupation

Creative Writing: 1457 ±10, No. 16
Social Sciences: 1485 ±11, No. 19
Media: 1455 ±11, No. 10
Business: 1476 ±11, No. 14
Healthcare: 1512 ±17, No. 6
Legal: 1496 ±17, No. 9
Software: 1509 ±8, No. 17
Mathematics: 1461 ±19, No. 33

Source: Arena Intelligence

Overall

AA Intelligence Index: 49% (↑10%)
LiveBench: 69% (↑8%)

Reasoning & Math

GPQA Diamond: 91% (↑9%)
HLE: 32% (↑15%)
LB Reasoning: 75% (↑6%)
LB Math: 87% (↑13%)
LB Data: 63% (↑10%)

Coding

AA Coding Index: 41% (↑4%)
LB Coding: 66% (↓7%)
LB Agentic: 43% (↓2%)
TAU2: 93% (↑13%)
TerminalBench: 38% (↑4%)
SciCode: 46% (↑4%)

Language & Instructions

IFBench: 81% (↑18%)
AA-LCR: 58% (↓4%)
LB Language: 78% (↑5%)
LB IF: 63% (↑12%)

Output Speed

Standard Mode

89 tok/s (↑11)

First Output: 0.53 s

Reasoning Mode

91 tok/s (↑4)

First Output: 30.82 s
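To see what these figures imply for end-to-end wait time, the sketch below adds the first-output delay to the time needed to stream the rest of the answer. It assumes that throughput stays constant at the listed rate and that "First Output" means the delay before the first visible token; neither assumption is confirmed by the listing. The function name `estimate_response_seconds` and the 1,000-token answer length are hypothetical.

```python
def estimate_response_seconds(
    output_tokens: int, first_output_s: float, tokens_per_s: float
) -> float:
    """Estimate total seconds until the last token arrives (hypothetical helper)."""
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    return first_output_s + output_tokens / tokens_per_s


# Figures from the listing above; the 1,000-token answer is an assumed example.
standard = estimate_response_seconds(1_000, first_output_s=0.53, tokens_per_s=89)
reasoning = estimate_response_seconds(1_000, first_output_s=30.82, tokens_per_s=91)

print(f"Standard mode:  {standard:.1f} s")   # prints 11.8 s
print(f"Reasoning mode: {reasoning:.1f} s")  # prints 41.8 s
```

Under these assumptions the two modes stream at nearly the same rate, so the difference in total time comes almost entirely from the roughly 30-second delay before reasoning mode emits its first token.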

Source: Artificial Analysis, LiveBench
