
Mistral Large 3

Mistral · mistral family · Official docs

Mistral Large 3 is Europe's frontier contender and the first truly open-weight model at the 675B scale. Its 256K context window is a standout feature, among the largest commercially available. Function calling and structured output support are native and well tested; Mistral has invested heavily in agentic capabilities. For Refrase users in regulated European industries, Mistral offers a GDPR-friendly alternative to US and Chinese providers. The Apache 2.0 license is a strategic differentiator: this is the only 675B-class model you can legally self-host and modify. The anti-patterns guidance in Mistral's docs (avoid vague quantifiers, prefer worded scales) is unusually practical and applies directly to Refrase prompt optimization.

Try Refrase on a Mistral Large 3 prompt

Paste any prompt — Refrase rewrites it using Mistral Large 3's documentation as context. 4–7 seconds end-to-end.

Specifications

262K
Context window
262K
Max output
$0.50 / $1.50
Per 1M tokens (in/out)
Mistral API (La Plateforme) pricing. Also available on AWS Bedrock, Azure, and GCP at provider-specific pricing. Open-weight Apache 2.0 — self-hosting eliminates API costs but requires 8xH200 or equivalent for FP8 deployment. (source: Mistral Docs, Mistral Large 3 page)

Strengths

extraction · analysis

Key capabilities

  • Granular Mixture-of-Experts: 675B total parameters with 41B active (Language model: 673B/39B active, Vision encoder: 2.5B) (source: Hugging Face, Mistral-Large-3-675B-Instruct-2512 Model Card)
  • 256K context window — one of the largest available, supporting extensive document processing and complex agentic workflows (source: Mistral Docs, Mistral Large 3 page)
  • Multimodal: native vision capabilities for image analysis with up to 10 images per prompt (source: Hugging Face, Mistral-Large-3-675B-Instruct-2512 Model Card)
  • Native function calling and structured JSON output — best-in-class agentic capabilities with tool-use support (source: Mistral Docs, Mistral Large 3 page)
  • Broad multilingual support: English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic, and more (source: Mistral Docs, Mistral Large 3 page)
  • Open-weight Apache 2.0 license — fully open-sourced, marking Mistral's first frontier-class model under permissive license (source: Hugging Face, Mistral-Large-3-675B-Instruct-2512 Model Card)
  • Fill-in-the-Middle (FIM) for code generation, OCR with structured annotations, and audio transcription (source: Mistral Docs, Mistral Large 3 page)
  • Trained from scratch on ~3,000 H200 GPUs — frontier-level training compute (source: Hugging Face, Mistral-Large-3-675B-Instruct-2512 Model Card)
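To make the function-calling capability concrete, here is a minimal sketch of a tool-calling request body. The tool itself (`get_exchange_rate`) is a hypothetical example, and the field names follow the widely used OpenAI-compatible tool schema that Mistral's chat API also accepts; check Mistral's API reference for the current shape before relying on it.

```python
import json

# Hypothetical tool definition in the OpenAI-compatible schema
# (hedged: verify field names against Mistral's API reference).
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_exchange_rate",  # illustrative tool, not a real API
            "description": "Return the exchange rate between two currencies.",
            "parameters": {
                "type": "object",
                "properties": {
                    "base": {"type": "string", "description": "ISO code, e.g. EUR"},
                    "quote": {"type": "string", "description": "ISO code, e.g. USD"},
                },
                "required": ["base", "quote"],
            },
        },
    }
]

# The request body pairs the tool list with a normal chat message.
request_body = {
    "model": "mistral-large-latest",
    "messages": [{"role": "user", "content": "What is EUR/USD right now?"}],
    "tools": tools,
    "tool_choice": "auto",  # let the model decide whether to call the tool
}

# Serializing confirms the payload is valid JSON.
payload = json.dumps(request_body)
```

Keeping the tool list this small also follows the model card's advice to avoid overloading the model with function definitions.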

Known limitations

  • Not a dedicated reasoning model — specialized reasoning models (like Magistral or Claude with extended thinking) may outperform on pure reasoning tasks (source: Hugging Face, Mistral-Large-3-675B-Instruct-2512 Model Card)
  • Lags behind vision-first models on multimodal-optimized benchmarks despite supporting vision (source: Hugging Face, Mistral-Large-3-675B-Instruct-2512 Model Card)
  • Extremely resource-intensive deployment: requires 8xH200 (FP8) or 8xH100/A100 (NVFP4) for inference (source: Hugging Face, Mistral-Large-3-675B-Instruct-2512 Model Card)
  • Image aspect ratio sensitivity: best performance near 1:1 width-to-height ratio (source: Hugging Face, Mistral-Large-3-675B-Instruct-2512 Model Card)

How to prompt Mistral Large 3

Preferred instruction format

Standard chat format with system/user/assistant roles. System messages define environment and guidance. Mistral recommends defining roles explicitly: 'You are a [role], your task is to [task].' Supports Markdown and XML-style tags for structured instructions.
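A sketch of what that looks like in practice, assuming an illustrative summarization task: the system message leads with the explicit role framing and organizes the rest into Markdown sections.

```python
# System prompt using Mistral's recommended "You are a [role], your task
# is to [task]" framing, with hierarchical Markdown sections.
# The task details here are illustrative, not from Mistral's docs.
system_prompt = "\n".join([
    "You are a financial news summarizer, your task is to condense",
    "articles into briefs for analysts.",
    "",
    "## Output format",
    "- Exactly 3 bullet points",
    "- Each bullet at most 20 words",
    "",
    "## Constraints",
    "- Quote figures verbatim; never round numbers.",
])

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "<article text goes here>"},
]
```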

Recommended practices

  • Use temperature < 0.1 for production/daily-driver use cases; higher temperature for creative tasks (source: Hugging Face, Mistral-Large-3-675B-Instruct-2512 Model Card)
  • Define roles explicitly with 'You are a [role], your task is to [task]' framing (source: Mistral Docs, Prompting Capabilities)
  • Organize instructions hierarchically with clear sections and subsections using Markdown or XML tags (source: Mistral Docs, Prompting Capabilities)
  • Use few-shot prompting with example user-assistant exchanges for format-critical tasks (source: Mistral Docs, Prompting Capabilities)
  • Keep tool definitions minimal — avoid overloading the model with too many function definitions (source: Hugging Face, Mistral-Large-3-675B-Instruct-2512 Model Card)
  • Treat prompts like code: iterate, test, and evaluate changes systematically (source: Mistral Docs, Prompting Capabilities)
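Two of these practices combine naturally in one request: few-shot exchanges for a format-critical task, plus a temperature below 0.1 for production use. The classification task and payload shape below are illustrative.

```python
# Few-shot user/assistant exchanges pinning down an exact output format
# (the sentiment task is an illustrative example).
few_shot = [
    {"role": "user", "content": "Classify sentiment: 'Shipping was fast.'"},
    {"role": "assistant", "content": '{"sentiment": "positive"}'},
    {"role": "user", "content": "Classify sentiment: 'The box arrived crushed.'"},
    {"role": "assistant", "content": '{"sentiment": "negative"}'},
]

request_body = {
    "model": "mistral-large-latest",
    "temperature": 0.05,  # < 0.1 per the model card's production guidance
    "messages": few_shot + [
        {"role": "user", "content": "Classify sentiment: 'Exactly as described.'"}
    ],
}
```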

Anti-patterns to avoid

  • Avoid vague quantifiers ('too long', 'many', 'few') — use objective measurements (source: Mistral Docs, Prompting Capabilities)
  • Never use ambiguous descriptors like 'interesting' or 'better' without precise definitions (source: Mistral Docs, Prompting Capabilities)
  • Do not use contradictory rules — use decision trees for conditional logic (source: Mistral Docs, Prompting Capabilities)
  • Avoid asking the model to count characters or items — provide counts directly (source: Mistral Docs, Prompting Capabilities)
  • Prefer worded scales ('Very Low', 'Good') over numeric scales (1-5) for evaluation tasks (source: Mistral Docs, Prompting Capabilities)
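The anti-patterns above can be seen side by side in a before/after rewrite (the prompts themselves are made up for illustration): the second version replaces the vague quantifier and bare numeric scale with an objective measurement and a worded scale.

```python
# Before: vague quantifier ("too long") and a bare numeric scale.
vague_prompt = (
    "If the summary is too long, shorten it, "
    "and rate its quality from 1-5."
)

# After: objective measurement and a worded scale, per Mistral's guidance.
precise_prompt = (
    "If the summary exceeds 120 words, shorten it to at most 120 words, "
    "and rate its quality as one of: Very Low, Low, Good, Very Good."
)
```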


Skip the manual application.

Refrase reads everything above and applies it for you. Try it on one of your own prompts.