Kimi K2 Instruct

Preview
moonshotai/kimi-k2-instruct
TOKEN SPEED: ~250 TPS
INPUT: Text
OUTPUT: Text
CAPABILITIES: Tool Use, JSON Mode
Moonshot AI
model card

Kimi K2 is Moonshot AI's state-of-the-art Mixture-of-Experts (MoE) language model with 1 trillion total parameters and 32 billion activated parameters. Designed for agentic intelligence, it excels at tool use, coding, and autonomous problem-solving across diverse domains.


PRICING

Input: $1.00 per 1M tokens (1,000,000 tokens / $1)
Output: $3.00 per 1M tokens (333,333 tokens / $1)
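For example, at these rates a request with a 10,000-token prompt and a 2,000-token completion costs about $0.016 (10,000 × $1.00/1M + 2,000 × $3.00/1M).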

LIMITS

CONTEXT WINDOW: 131,072 tokens
MAX OUTPUT TOKENS: 16,384

Key Technical Specifications

Model Architecture

Built on a Mixture-of-Experts (MoE) architecture with 1 trillion total parameters and 32 billion activated parameters. Features 384 experts with 8 experts selected per token, optimized for efficient inference while maintaining high performance. Trained with the Muon optimizer, achieving stable training with no loss spikes.
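
As a toy illustration of how top-k routing activates only a small fraction of experts per token: the sketch below uses the expert counts above, but the shapes, gating, and expert layout are simplified placeholders, not Kimi K2's actual implementation.

Python
import numpy as np

NUM_EXPERTS = 384   # total routed experts
TOP_K = 8           # experts activated per token
D_MODEL = 64        # toy hidden size for this sketch only

rng = np.random.default_rng(0)
router_weights = rng.standard_normal((D_MODEL, NUM_EXPERTS))

def route(token_hidden: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Return the indices and normalized gate weights of the top-k experts for one token."""
    logits = token_hidden @ router_weights               # (NUM_EXPERTS,)
    top_idx = np.argpartition(logits, -TOP_K)[-TOP_K:]   # unordered ids of the k largest logits
    gate = np.exp(logits[top_idx] - logits[top_idx].max())
    gate /= gate.sum()                                    # softmax over the selected experts only
    return top_idx, gate

token = rng.standard_normal(D_MODEL)
experts, weights = route(token)
print(experts, weights)  # only 8 of the 384 experts process this token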

Performance Metrics

The Kimi-K2-Instruct model demonstrates exceptional performance across coding, math, and reasoning benchmarks:
  • LiveCodeBench: 53.7% Pass@1 (top-tier coding performance)
  • SWE-bench Verified: 65.8% single-attempt accuracy
  • MMLU (Massive Multitask Language Understanding): 89.5% exact match
  • Tau2 retail tasks: 70.6% Avg@4

Use Cases

Agentic AI and Tool Use
Leverage the model's advanced tool-calling capabilities to build autonomous agents that interact with external systems and APIs; see the sketch after this list.
Advanced Code Generation
Utilize the model's top-tier performance in coding tasks, from simple scripting to complex software development and debugging.
Complex Problem Solving
Deploy for multi-step reasoning tasks, mathematical problem-solving, and analytical workflows requiring deep understanding.
Multilingual Applications
Take advantage of strong multilingual capabilities for global applications and cross-language understanding tasks.
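
As a sketch of the agentic tool-use flow described above: the request/response pattern follows Groq's OpenAI-compatible tool-calling interface, while the get_weather schema is a hypothetical example, not part of the Groq SDK.

Python
from groq import Groq

client = Groq()

# Hypothetical tool definition; any OpenAI-style function schema works the same way.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="moonshotai/kimi-k2-instruct",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools,
    tool_choice="auto",
)

# If the model chose to call the tool, run it yourself and send the result back
# in a follow-up request as a message with role "tool".
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)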

Best Practices

  • Provide clear, detailed tool and function definitions with explicit parameters, expected outputs, and constraints for optimal tool use performance.
  • Structure complex tasks into clear steps to leverage the model's agentic reasoning capabilities effectively.
  • Use the full 128K context window for complex, multi-step workflows and comprehensive documentation analysis.
  • Leverage the model's multilingual capabilities by clearly specifying the target language and cultural context when needed.
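
The JSON Mode capability listed above pairs well with these practices when you need machine-readable output. A minimal sketch, assuming a hypothetical summary/key_points shape described in the prompt:

Python
import json
from groq import Groq

client = Groq()

completion = client.chat.completions.create(
    model="moonshotai/kimi-k2-instruct",
    # JSON mode constrains the model to return a valid JSON object.
    response_format={"type": "json_object"},
    messages=[
        {
            "role": "system",
            "content": "Reply with a JSON object containing 'summary' and 'key_points' fields.",
        },
        {"role": "user", "content": "Summarize the benefits of MoE architectures."},
    ],
)

data = json.loads(completion.choices[0].message.content)
print(data["key_points"])  # assumes the model followed the requested shape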

Get Started with Kimi K2

Experience moonshotai/kimi-k2-instruct on Groq:

shell
pip install groq
Python
from groq import Groq
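# The client reads your API key from the GROQ_API_KEY environment variable by default.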
client = Groq()
completion = client.chat.completions.create(
    model="moonshotai/kimi-k2-instruct",
    messages=[
        {
            "role": "user",
            "content": "Explain why fast inference is critical for reasoning models"
        }
    ]
)
print(completion.choices[0].message.content)
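
At roughly 250 tokens per second, streaming lets you display output as it is generated rather than waiting for the full completion. A minimal sketch using the same SDK setup, following the OpenAI-compatible chunk format:

Python
from groq import Groq

client = Groq()

stream = client.chat.completions.create(
    model="moonshotai/kimi-k2-instruct",
    messages=[
        {
            "role": "user",
            "content": "Explain why fast inference is critical for reasoning models"
        }
    ],
    stream=True,
)

# Print tokens as they arrive instead of waiting for the full response.
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")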
