Moonshot AI

Kimi K2.5

Name: Moonshot AI Kimi K2.5
Author: Moonshot AI

Model ID: moonshotai/kimi-k2.5

Released: 2026-01-27

Kimi K2.5 is Moonshot AI's native multimodal model, released in January 2026. It offers state-of-the-art visual coding capability and a self-directed agent swarm paradigm. Built on Kimi K2 with continued pretraining over approximately 15 trillion mixed visual and text tokens, it generates code from visual specifications, turning UI designs and video workflows into working implementations. Its agent swarm technology can self-direct up to 100 parallel sub-agents, each independently using tools to search, generate, analyze, and organize information, which reduces execution time by up to 4.5× on complex research and writing tasks.
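The fan-out pattern described above can be pictured with a small sketch. This is only an illustration of capped parallel sub-tasks using Python's asyncio, not Moonshot AI's implementation. The names run_subagent, run_swarm, and swarm are hypothetical, and the sleep call stands in for real tool use.

```python
import asyncio

MAX_PARALLEL = 100  # upper bound on sub-agents quoted in the description


async def run_subagent(task: str) -> str:
    """Hypothetical sub-agent; a real one would call tools here."""
    await asyncio.sleep(0.01)  # placeholder for search/generate/analyze work
    return f"done: {task}"


async def run_swarm(tasks: list[str], limit: int = MAX_PARALLEL) -> list[str]:
    """Run tasks concurrently, never more than `limit` at once."""
    gate = asyncio.Semaphore(limit)

    async def guarded(task: str) -> str:
        async with gate:
            return await run_subagent(task)

    # gather preserves the order of the input tasks in its result list
    return await asyncio.gather(*(guarded(t) for t in tasks))


def swarm(tasks: list[str]) -> list[str]:
    """Synchronous convenience wrapper around run_swarm."""
    return asyncio.run(run_swarm(tasks))
```

Because the sub-tasks wait concurrently rather than one after another, wall-clock time is governed by the slowest batch instead of the sum of all tasks, which is the general reason a parallel design can shorten long research-style jobs.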

API | Vision | Reasoning | Open Model | Modified MIT

Knowledge Cutoff

2026-02-02

The date this AI finished learning. It may not know about things that happened after this date.

Input → Output Format

The types of content this AI can receive, and what it can produce in return.

Context Memory

262K in / 66K out

The maximum amount of text the AI can read and process in a single request. A larger number means it can handle longer documents or conversations.
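As a rough illustration of how these limits might be checked before sending a request, here is a minimal sketch. It assumes the rounded figures shown above (the exact token limits may differ) and treats the input and output limits as independent, which may not match how a given provider counts tokens. The function name fits_context is hypothetical.

```python
# Rounded limits as listed above; the exact values may differ.
MAX_INPUT_TOKENS = 262_000
MAX_OUTPUT_TOKENS = 66_000


def fits_context(prompt_tokens: int, max_output_tokens: int) -> bool:
    """Return True if both the prompt and the requested output fit the listed limits.

    Assumption: input and output limits are checked independently.
    """
    return (
        0 <= prompt_tokens <= MAX_INPUT_TOKENS
        and 0 <= max_output_tokens <= MAX_OUTPUT_TOKENS
    )
```

Token counts depend on the tokenizer, so a prompt should be measured with the provider's own counting method rather than estimated from word count.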

Cost/1M Tokens

$0.44 in / $2 out

The cost of using this AI directly in your own application. Shown in USD per 1 million units of text (tokens).
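The per-million-token prices above translate into a per-request cost with simple arithmetic. The sketch below assumes the listed prices apply uniformly to all input and output tokens. Actual billing may differ by provider, and the function name estimate_cost_usd is hypothetical.

```python
# Prices copied from the listing above, in USD per 1 million tokens.
INPUT_USD_PER_M = 0.44
OUTPUT_USD_PER_M = 2.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # 100,000 input tokens and 10,000 output tokens
    print(f"${estimate_cost_usd(100_000, 10_000):.3f}")  # prints $0.064
```

In this example the input side contributes $0.044 and the output side $0.020, so output tokens dominate the bill once responses grow long.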

Source: Official Docs, OpenRouter

AI Performance Evaluation

Arena Overall Score: 1450 ±4 (as of 2026-05-01)

Overall Rank: No. 40 (26,123 votes)

Arena by Ability

Hard Prompts: 1471 ±5, No. 42
Expert Knowledge: 1486 ±14, No. 28
Instruction Following: 1438 ±7, No. 40
Conversation Memory: 1452 ±9, No. 46
Creative: 1415 ±10, No. 48
Coding: 1507 ±8, No. 30
Math: 1474 ±14, No. 17

Arena by Occupation

Creative Writing: 1424 ±8, No. 49
Social Sciences: 1469 ±9, No. 37
Media: 1421 ±9, No. 42
Business: 1436 ±9, No. 57
Healthcare: 1465 ±14, No. 53
Legal: 1444 ±13, No. 62
Software: 1492 ±6, No. 32
Mathematics: 1480 ±16, No. 15

Source: Arena Intelligence

Overall

AA Intelligence Index: 47% (↑8%)
LiveBench: 69% (↑8%)

Reasoning & Math

GPQA Diamond: 88% (↑6%)
HLE: 29% (↑12%)
LB Reasoning: 76% (↑7%)
LB Math: 85% (↑11%)
LB Data: 61% (↑8%)

Coding

AA Coding Index: 40% (↑3%)
LB Coding: 78% (↑5%)
LB Agentic: 48% (↑3%)
TAU2: 96% (↑15%)
TerminalBench: 35% (↑1%)
SciCode: 49% (↑7%)

Language & Instructions

IFBench: 70% (↑7%)
AA-LCR: 65% (↑3%)
Hallucination (HHEM): 14% (↑4%)
Factual (HHEM): 86% (↓4%)
LB Language: 78% (↑5%)
LB IF: 57% (↑6%)

Output Speed

Standard Mode: 48 tok/s (↓30), First Output 1.26 s
Reasoning Mode: 46 tok/s (↓41), First Output 66.42 s
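The throughput and first-output figures above can be combined into a rough response-time estimate. This sketch assumes generation proceeds at a constant rate after the first token and that the listed averages hold for a given request. Real latency varies with load and provider. The names SPEED and estimate_seconds are hypothetical.

```python
# Figures copied from the Output Speed listing above.
SPEED = {
    "standard": {"first_output_s": 1.26, "tok_per_s": 48.0},
    "reasoning": {"first_output_s": 66.42, "tok_per_s": 46.0},
}


def estimate_seconds(output_tokens: int, mode: str = "standard") -> float:
    """Estimate wall-clock seconds: time to first output plus streaming time."""
    s = SPEED[mode]
    return s["first_output_s"] + output_tokens / s["tok_per_s"]
```

Under these assumptions a 480-token answer takes about 11.3 seconds in standard mode, while a 460-token answer in reasoning mode takes about 76.4 seconds, almost all of it spent waiting for the first output.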

Source: Artificial Analysis, LiveBench, Vectara HHEM
