DeepSeek-R1-Distill-Llama-70B
DeepSeek-R1-Distill-Llama-70B is a distilled version of DeepSeek's R1 model, fine-tuned from the Llama-3.3-70B-Instruct base model. It uses knowledge distillation to retain much of R1's reasoning capability while being substantially smaller and cheaper to serve than the full R1 model.
Key Technical Specifications
- Model Architecture: Built upon the Llama-3.3-70B-Instruct framework, the model comprises 70 billion parameters. The distillation process fine-tunes the base model on outputs from DeepSeek-R1, transferring the teacher's reasoning patterns (a sketch of the training objective follows this list).
- Performance Metrics: The model demonstrates strong performance across reasoning benchmarks:
  - AIME 2024: Pass@1 score of 70.0.
  - MATH-500: Pass@1 score of 94.5.
  - CodeForces: rating of 1,633.
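Per the DeepSeek-R1 report, the distilled models are trained purely with supervised fine-tuning on reasoning traces sampled from DeepSeek-R1, with no additional RL stage. As a sketch of the objective (notation introduced here for illustration), with $x$ a prompt and $y$ a teacher-generated response:

$$
\mathcal{L}(\theta) = -\sum_{t=1}^{|y|} \log p_\theta\left(y_t \mid y_{<t},\, x\right)
$$

That is, standard next-token cross-entropy, with the teacher's outputs standing in for human-written targets.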
Technical Details
| Detail | Value |
|---|---|
| Context Window (Tokens) | 128k |
| Max Output Tokens | - |
| Max File Size | - |
| Token Generation Speed (as of 2025-01-28) | 275 tps |
| Pricing | Pricing Details |
Capabilities and Features
DeepSeek-R1-Distill-Llama-70B excels in the following areas:
Supported Features
| Feature | Supported |
|---|---|
| Tool Use | ✅ |
| JSON Mode | ✅ |
| Image Support | ❌ |
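JSON Mode can be exercised through the OpenAI-compatible `response_format` parameter. A minimal sketch, assuming the Groq Python SDK and a prompt that explicitly asks for JSON (typically required when requesting `json_object` output):

```python
import os
import json

from groq import Groq

client = Groq(api_key=os.environ.get("GROQ_API_KEY"))

completion = client.chat.completions.create(
    model="deepseek-r1-distill-llama-70b",
    messages=[
        {
            "role": "user",
            "content": "Return a JSON object with keys 'problem' and 'answer' "
            "for: what is 17 * 23?",
        }
    ],
    # Constrains the final output to valid JSON.
    response_format={"type": "json_object"},
)

print(json.loads(completion.choices[0].message.content))
```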
Use Cases
- Mathematical Problem-Solving: Effectively addresses complex mathematical queries, making it valuable for educational tools and research applications.
- Coding Assistance: Supports code generation and debugging, beneficial for software development.
- Logical Reasoning: Performs tasks requiring structured thinking and deduction, applicable in data analysis and strategic planning.
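Like other DeepSeek-R1 distills, the model typically emits its chain of thought inside `<think>...</think>` tags before the final answer. A minimal sketch for separating the reasoning trace from the answer, assuming the tags appear verbatim in the response text:

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Split an R1-style response into (reasoning, answer).

    Assumes at most one <think>...</think> block; if none is
    found, the whole response is treated as the answer.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

reasoning, answer = split_reasoning(
    "<think>17 * 23 = 17 * 20 + 17 * 3 = 340 + 51 = 391.</think>The answer is 391."
)
print(answer)  # -> The answer is 391.
```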
Best Practices
- Sampling Parameters: Set the temperature between 0.5 and 0.7 (0.6 is recommended) to prevent repetitive or incoherent outputs.
- System Prompt: Avoid adding a system prompt; place all instructions in the user prompt instead. Both practices are shown in the sketch below.
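A minimal sketch applying both recommendations with the Groq Python SDK (the prompt and exact temperature value are illustrative):

```python
import os

from groq import Groq

client = Groq(api_key=os.environ.get("GROQ_API_KEY"))

completion = client.chat.completions.create(
    model="deepseek-r1-distill-llama-70b",
    # Recommended sampling range is 0.5-0.7; 0.6 is the suggested default.
    temperature=0.6,
    # No system message: all instructions go in the user prompt.
    messages=[
        {
            "role": "user",
            "content": "Think step by step, then answer concisely: "
            "how many prime numbers are there below 30?",
        }
    ],
)

print(completion.choices[0].message.content)
```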
Get Started with DeepSeek-R1-Distill-Llama-70B
DeepSeek-R1-Distill-Llama-70B is optimized for Groq hardware, delivering near-instant reasoning. To get started:
Install the Groq SDK and perform a chat completion using Python:

```bash
pip install groq
```
```python
import os

from groq import Groq

# The client reads the API key from the GROQ_API_KEY environment variable.
client = Groq(
    api_key=os.environ.get("GROQ_API_KEY"),
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Explain why fast inference is critical for reasoning models",
        }
    ],
    model="deepseek-r1-distill-llama-70b",
)

print(chat_completion.choices[0].message.content)
```
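Export your API key as the GROQ_API_KEY environment variable before running. As with other R1 distills, the model's chain of thought may precede the final answer in the returned content.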