Magistral Medium
Mistral · mistral family · Official Docs
Magistral Medium 1.2 is Mistral's answer to the reasoning-model trend: a dedicated chain-of-thought model whose explicit [THINK]/[/THINK] tokens make reasoning traces inspectable and debuggable. That transparency is a significant advantage for Refrase's prompt-optimization workflow: users can see exactly how the model reasons and tune prompts to guide that reasoning. The 131,072-token (128K) maximum output budget is generous for reasoning-heavy tasks. The key limitation is the model's proprietary nature: unlike Mistral Large 3, it cannot be self-hosted, locking users into Mistral's API. The 40K-token optimal context threshold is also well below the advertised 128K maximum, so Refrase should flag this discrepancy in its recommendations. At $2/$5 per million input/output tokens, it sits in the premium tier and is best reserved for high-value reasoning tasks where thinking-trace visibility justifies the cost premium over alternatives.
Specifications
Key Capabilities
- ✓Frontier-class reasoning model with dedicated [THINK] and [/THINK] tokens for chain-of-thought traces (source: apidog.com, 'Magistral Small 1.2 and Magistral Medium 1.2 are here')
- ✓Multimodal: visual encoder for image analysis and understanding (source: Mistral Docs, Magistral Medium 1.2 page)
- ✓Native function calling, structured outputs, and agent conversations with built-in tools (source: Mistral Docs, Magistral Medium 1.2 page)
- ✓Document AI: OCR with structured annotations, bounding box extraction, and document Q&A (source: Mistral Docs, Magistral Medium 1.2 page)
- ✓Audio transcription with timestamps (source: Mistral Docs, Magistral Medium 1.2 page)
- ✓Fill-in-the-Middle (FIM) for code generation and predicted outputs (source: Mistral Docs, Magistral Medium 1.2 page)
- ✓25+ language support including French, German, Arabic, Japanese, and Chinese (source: apidog.com, 'Magistral Small 1.2 and Magistral Medium 1.2 are here')
Known Limitations
- ⚠Proprietary model — weights not publicly available, no self-hosting option (unlike Mistral Large 3 and Magistral Small) (source: Mistral Docs, Models overview page)
- ⚠Optimal performance with contexts under 40K tokens despite the 128K maximum — quality may degrade at longer contexts (source: apidog.com, 'Magistral Small 1.2 and Magistral Medium 1.2 are here')
- ⚠Exact parameter count and architecture undisclosed — limited transparency compared to open-weight alternatives (source: apidog.com, 'Magistral Small 1.2 and Magistral Medium 1.2 are here')
- ⚠Premium pricing ($2 per 1M input tokens, $5 per 1M output tokens) — roughly 4x Mistral Large 3's input price and 3.3x its output price, reflecting reasoning compute overhead (source: Mistral Docs, Models overview page)
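The pricing gap above can be made concrete with a quick cost estimate. The Magistral Medium prices ($2/$5 per million tokens) are from this page; the Mistral Large 3 figures of $0.50/$1.50 are what the stated 4x/3.3x ratios imply, not values confirmed here:

```python
def cost_usd(input_tokens: int, output_tokens: int,
             in_price: float, out_price: float) -> float:
    """Cost in USD given per-million-token prices."""
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price

# Magistral Medium: $2 in / $5 out (per the limitations list above)
magistral = cost_usd(10_000, 4_000, in_price=2.0, out_price=5.0)

# Mistral Large 3 prices implied by the 4x / 3.3x ratios: $0.50 / $1.50
large3 = cost_usd(10_000, 4_000, in_price=0.50, out_price=1.50)
```

For a 10K-in / 4K-out reasoning call, the premium is a few cents, which compounds quickly in high-volume pipelines.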
Prompt Patterns
Preferred Instruction Format
Standard chat format with system/user/assistant roles. Reasoning traces are enclosed in [THINK] and [/THINK] tokens for developer inspection. API model name: magistral-medium-2509 or magistral-medium-latest.
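A minimal request body might look like the sketch below. The model name and sampling values come from this page; the message schema follows the standard Mistral chat-completions format, and the example prompt content is illustrative only:

```python
import json

# Sketch of a chat-completions request body for Magistral Medium.
request_body = {
    "model": "magistral-medium-2509",   # or "magistral-medium-latest"
    "messages": [
        {"role": "system", "content": "You are a careful reasoner."},
        {"role": "user", "content": "Which is larger, 9.11 or 9.9?"},
    ],
    # Recommended sampling parameters (see Recommended Practices below)
    "temperature": 0.7,
    "top_p": 0.95,
    "max_tokens": 131072,
}
payload = json.dumps(request_body)
```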
Recommended Practices
- Use temperature=0.7, top_p=0.95, max_tokens=131072 as recommended sampling parameters (source: apidog.com, 'Magistral Small 1.2 and Magistral Medium 1.2 are here')
- Leverage [THINK]/[/THINK] tokens to inspect reasoning traces for debugging and quality assurance (source: apidog.com, 'Magistral Small 1.2 and Magistral Medium 1.2 are here')
- Use for reasoning-intensive tasks where chain-of-thought quality matters more than raw speed (source: Mistral Docs, Models overview page)
- Apply the same Mistral prompt engineering best practices: explicit role definition, hierarchical structure, Markdown/XML formatting (source: Mistral Docs, Prompting Capabilities)
- Keep context under 40K tokens for optimal reasoning quality (source: apidog.com, 'Magistral Small 1.2 and Magistral Medium 1.2 are here')
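The trace-inspection practice above can be sketched as a small parser, assuming the trace arrives inline in the response text delimited by literal [THINK]/[/THINK] markers as described on this page:

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Separate the [THINK]...[/THINK] trace from the final answer.

    Returns (trace, answer); trace is "" when no THINK block is present.
    """
    match = re.search(r"\[THINK\](.*?)\[/THINK\]", text, flags=re.DOTALL)
    if not match:
        return "", text.strip()
    trace = match.group(1).strip()
    # The answer is everything outside the THINK block.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return trace, answer

trace, answer = split_reasoning(
    "[THINK]9.9 = 9.90 > 9.11[/THINK]9.9 is larger."
)
```

Logging the extracted trace alongside the prompt makes it easy to see which prompt changes actually steer the model's reasoning.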
Anti-Patterns to Avoid
- Do not use for simple tasks where reasoning overhead is unnecessary — Mistral Large 3 or smaller models are more cost-effective (source: Mistral Docs, Models overview page)
- Avoid the same anti-patterns flagged for Mistral Large 3: vague quantifiers, ambiguous descriptors, contradictory rules, and numeric scales (source: Mistral Docs, Prompting Capabilities)
- Do not exceed 40K tokens of context if reasoning quality is critical — performance is optimal below this threshold (source: apidog.com, 'Magistral Small 1.2 and Magistral Medium 1.2 are here')
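The 40K-token guideline can be enforced with a cheap pre-flight check. The ~4-characters-per-token ratio below is a rough English-text heuristic, not an official tokenizer figure; use Mistral's tokenizer for exact counts:

```python
def within_reasoning_budget(prompt: str, limit_tokens: int = 40_000,
                            chars_per_token: float = 4.0) -> bool:
    """Rough check that a prompt stays under the 40K-token sweet spot.

    chars_per_token ~ 4 is a common heuristic for English text; exact
    counts require the model's own tokenizer.
    """
    estimated_tokens = len(prompt) / chars_per_token
    return estimated_tokens <= limit_tokens
```

Refrase-style tooling could surface a warning, or route to summarization, whenever this check fails before the request is sent.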