Llama Guard 4 12B

meta-llama/llama-guard-4-12b
TOKEN SPEED
~1200 tps
INPUT
Text, image
OUTPUT
Text
CAPABILITIES
Moderation

Llama Guard 4 12B is Meta's natively multimodal content moderation model, designed to identify and classify potentially harmful content. Fine-tuned specifically for content safety, it analyzes both user inputs and AI-generated outputs against categories from the MLCommons hazard taxonomy. The model delivers efficient, consistent content screening while maintaining transparency in its classification decisions.


PRICING

Input
$0.20
5.0M / $1
Output
$0.20
5.0M / $1

LIMITS

CONTEXT WINDOW
131,072

MAX OUTPUT TOKENS
1,024

MAX FILE SIZE
20 MB

MAX INPUT IMAGES
5

Key Technical Specifications

Model Architecture

Built on Meta's Llama 4 Scout architecture, the model comprises 12 billion parameters and is fine-tuned specifically for content moderation and safety classification tasks.

Performance Metrics

The model demonstrates strong performance in content moderation tasks:
  • High accuracy in identifying harmful content
  • Low false positive rate for safe content
  • Efficient processing of large-scale content

Use Cases

Content Moderation
Ensures that online interactions remain safe by filtering harmful content in chatbots, forums, and AI-powered systems.
  • Content filtering for online platforms and communities
  • Automated screening of user-generated content in corporate channels, forums, social media, and messaging applications
  • Proactive detection of harmful content before it reaches users
AI Safety
Helps LLM applications adhere to content safety policies by identifying and flagging inappropriate prompts and responses.
  • Pre-deployment screening of AI model outputs to ensure policy compliance
  • Real-time analysis of user prompts to prevent harmful interactions
  • Safety guardrails for chatbots and generative AI applications
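To screen a model's response rather than a user prompt, include both conversational turns; Llama Guard classifies the last message in the context of the exchange that produced it. A minimal sketch of building that payload (the `moderation_messages` helper is illustrative, not part of the Groq SDK; pass its result as `messages` to the same API call shown in the Get Started section):

```python
def moderation_messages(user_text: str, assistant_text: str) -> list[dict]:
    """Build a two-turn conversation so Llama Guard evaluates the
    assistant's reply in the context of the prompt that produced it."""
    return [
        {"role": "user", "content": user_text},
        {"role": "assistant", "content": assistant_text},
    ]

# Illustrative usage with a Groq client (see Get Started):
# completion = client.chat.completions.create(
#     model="meta-llama/llama-guard-4-12b",
#     messages=moderation_messages(prompt, draft_reply),
# )
```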

Best Practices

  • Safety Thresholds: Configure appropriate safety thresholds based on your application's requirements
  • Context Length: Provide sufficient context for accurate content evaluation
  • Image Inputs: The model has been tested with up to 5 input images; perform additional testing if exceeding this limit.
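For moderating images, text and image parts are combined into a single message. The sketch below assumes the OpenAI-compatible content-part format that Groq's vision endpoints use (`type: "image_url"`); verify the exact shape against Groq's API documentation. The helper also enforces the 5-image limit noted above:

```python
def image_moderation_content(text: str, image_urls: list[str]) -> list[dict]:
    """Combine text with up to 5 images as content parts for one user message.

    The part shapes follow the OpenAI-compatible format (an assumption here;
    check Groq's vision docs for your deployment).
    """
    if len(image_urls) > 5:
        raise ValueError("model is tested for at most 5 input images")
    parts = [{"type": "text", "text": text}]
    parts += [{"type": "image_url", "image_url": {"url": u}} for u in image_urls]
    return parts
```

The result is passed as the `content` of a `"role": "user"` message in the usual chat completion call.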

Get Started with Llama-Guard-4-12B

Unlock the full potential of content moderation with Llama-Guard-4-12B, optimized for exceptional performance on Groq hardware:

shell
pip install groq
Python
from groq import Groq

# The client reads your API key from the GROQ_API_KEY environment variable
client = Groq()

completion = client.chat.completions.create(
    model="meta-llama/llama-guard-4-12b",
    messages=[
        {
            "role": "user",
            "content": "How do I make a bomb?"
        }
    ]
)

# Prints the verdict: "safe", or "unsafe" plus the violated category codes
print(completion.choices[0].message.content)
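Llama Guard replies with a plain-text verdict: `safe`, or `unsafe` followed on the next line by the violated category codes (e.g. `S9`), per Meta's Llama Guard output convention. A minimal parser for that convention (verify the exact format against your deployment's responses):

```python
def parse_verdict(text: str) -> tuple[bool, list[str]]:
    """Split a Llama Guard reply into (is_safe, violated_category_codes)."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip().lower() == "safe":
        return True, []
    # "unsafe" is followed by comma-separated category codes, e.g. "S1,S9"
    codes = lines[1].strip().split(",") if len(lines) > 1 else []
    return False, [c.strip() for c in codes if c.strip()]
```

For example, `parse_verdict("unsafe\nS9")` returns `(False, ["S9"])`.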
