MiniMax M2
MiniMax · minimax family · Official docs
MiniMax M2 is a Chinese-origin model optimized for coding and agentic workflows. Its 230B/10B mixture-of-experts (MoE) architecture is one of the most aggressively sparse designs available: only about 4.3% of parameters are active per token. Interleaved thinking with <think> tags is a key differentiator; unlike models with separate thinking modes, M2 weaves reasoning directly into generation. The XML-based tool-calling format (<minimax:tool_call>) is unusual and requires specific parser support. An MIT license and competitive benchmarks make the model attractive for open-source deployments. Successors M2.1 (enhanced multilingual coding) and M2.5 (agent swarm) build on this base. The 200K context / 128K output combination matches GLM-4.7 Flash and exceeds most competitors.
Specifications
Strengths
Key capabilities
- ✓ 230B total parameters with only 10B active — extremely compact MoE (source: GitHub, README)
- ✓ Interleaved thinking with <think>...</think> tags for chain-of-thought reasoning (source: GitHub, README)
- ✓ Strong coding: SWE-bench Verified 69.4%, LiveCodeBench 83%, Terminal-Bench 46.3% (source: GitHub, README)
- ✓ General intelligence: MMLU-Pro 82%, BrowseComp 44% (source: GitHub, README)
- ✓ Native tool calling with XML format <minimax:tool_call> tags (source: Hugging Face, Tool Calling Guide)
- ✓ Plans and executes complex long-horizon toolchains across shell, browser, retrieval, and code runners (source: GitHub, README)
- ✓ Multi-file editing and code-run-fix loops (source: GitHub, README)
- ✓ 200K context window with 128K output capacity (source: MiniMax API Docs)
- ✓ MIT License — fully open source (source: GitHub, README)
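Because the <think> spans must be preserved in conversation history while end users usually should not see them, a common pattern is to strip them only at display time and keep the raw turn in history. A minimal sketch, assuming a simple string-based history (the assistant turn here is invented for illustration):

```python
import re

# Matches a complete <think>...</think> block, including multi-line content.
THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)

def visible_text(raw: str) -> str:
    """Strip <think> blocks for display only; the raw turn stays in history."""
    return THINK_RE.sub("", raw).strip()

# Hypothetical assistant turn with interleaved reasoning.
raw_turn = "<think>User wants a sum; add the two ints.</think>The sum is 7."

history = [{"role": "assistant", "content": raw_turn}]  # tags kept intact
shown = visible_text(raw_turn)  # "The sum is 7." is what the user sees
```

The key point is that `history` (which goes back to the model) and `shown` (which goes to the UI) diverge: stripping the tags from `history` itself is the anti-pattern warned about above.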
Known limitations
- ⚠ Interleaved thinking content in <think> tags must be preserved in conversation history; removing it degrades performance (source: GitHub, README)
- ⚠ 10B active parameters may limit depth of reasoning on highly complex tasks compared with larger dense models (source: Architecture analysis)
- ⚠ Training details are not publicly disclosed (source: GitHub, README — absent from documentation)
- ⚠ Newer M2.1 and M2.5 versions are available; M2 may receive fewer updates (source: MiniMax, Product Timeline)
- ⚠ XML-based tool-calling format (<minimax:tool_call>) requires custom parsing unless you use the built-in vLLM/SGLang parsers (source: Hugging Face, Tool Calling Guide)
- ⚠ Released October 2025; a younger model with a smaller community ecosystem than established alternatives (source: llm-stats.com)
How to prompt MiniMax M2
Preferred instruction format
Standard OpenAI-compatible chat format. Default system prompt: 'You are a helpful assistant. Your name is MiniMax-M2 and is built by MiniMax.' Tool definitions are injected into the prompt as structured text, together with XML-format calling instructions.
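The format above can be sketched as a plain OpenAI-compatible request body. The sampling values follow the README recommendations; the model id "MiniMax-M2" is an assumption and is deployment-specific (a vLLM or SGLang server may register it under a different name):

```python
import json

# Minimal OpenAI-compatible chat request body for MiniMax M2.
payload = {
    "model": "MiniMax-M2",  # assumed id; check your server's model list
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant. Your name is MiniMax-M2 and is built by MiniMax.",
        },
        {"role": "user", "content": "Write a Python function that reverses a string."},
    ],
    # Sampling settings recommended in the README.
    "temperature": 1.0,
    "top_p": 0.95,
    "top_k": 40,
}

body = json.dumps(payload)  # POST this to the server's /v1/chat/completions
```

Note that `top_k` is not part of the core OpenAI schema; vLLM and SGLang accept it as an extension, so it belongs in the request body only when talking to such a server.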
Recommended practices
- Use temperature=1.0, top_p=0.95, top_k=40 for best performance (source: GitHub, README)
- Preserve <think>...</think> tags in conversation history — removing them degrades multi-turn performance (source: GitHub, README)
- Give the model a role, constraints, and acceptance tests in system prompt — structure beats cleverness (source: Skywork.ai, Prompt Optimization)
- Let the model think in steps and invite self-critique for complex tasks (source: Skywork.ai, Prompt Optimization)
- Use vLLM or SGLang with built-in parsers for automatic tool call handling (source: Hugging Face, Tool Calling Guide)
- For manual tool call parsing, use XML regex to extract <minimax:tool_call> blocks (source: Hugging Face, Tool Calling Guide)
- Return tool results with role='tool' and structured content array format (source: Hugging Face, Tool Calling Guide)
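For deployments without the vLLM/SGLang parsers, the manual route described above can be sketched as follows. The JSON payload inside the tags and the `get_weather` tool are illustrative assumptions, not the documented wire format; consult the Tool Calling Guide for the exact payload layout:

```python
import json
import re

# Extract the content of each <minimax:tool_call>...</minimax:tool_call> block.
TOOL_CALL_RE = re.compile(r"<minimax:tool_call>(.*?)</minimax:tool_call>", re.DOTALL)

def extract_tool_calls(text: str) -> list[dict]:
    """Parse tool-call blocks out of a completion.

    Assumes one JSON object per block (a simplification for this sketch).
    """
    return [json.loads(block.strip()) for block in TOOL_CALL_RE.findall(text)]

# Hypothetical completion mixing prose with a tool call.
completion = (
    "Let me check the weather. "
    '<minimax:tool_call>{"name": "get_weather", "arguments": {"city": "Shanghai"}}'
    "</minimax:tool_call>"
)

calls = extract_tool_calls(completion)

# After executing the tool, append the result with role="tool" and a
# structured content array, then send the whole history back to the model.
tool_msg = {
    "role": "tool",
    "content": [{"type": "text", "text": json.dumps({"temp_c": 21})}],
}
```

Appending `tool_msg` to the conversation before the next request is what keeps iterative tool calling working, per the anti-patterns below.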
Anti-patterns to avoid
- Do NOT strip <think>...</think> tags from conversation history — model relies on them for coherent multi-turn reasoning (source: GitHub, README)
- Do NOT parse tool calls without schema type information — parameters need type-aware conversion (source: Hugging Face, Tool Calling Guide)
- Do NOT skip adding tool results back to conversation history — breaks iterative tool calling (source: Hugging Face, Tool Calling Guide)
- Do NOT assume string encoding for all tool call parameters — use schema type definitions for proper conversion (source: Hugging Face, Tool Calling Guide)
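The last two anti-patterns can be addressed with a small type-aware converter that consults the tool's JSON-schema parameter types instead of assuming strings. A simplified sketch that handles scalar types only (real schemas also nest objects and arrays; the `days` parameter is a hypothetical example):

```python
def coerce_args(raw_args: dict, schema: dict) -> dict:
    """Convert string-encoded tool arguments using JSON-schema types.

    Falls back to str for unknown or missing types.
    """
    casts = {
        "integer": int,
        "number": float,
        "boolean": lambda v: str(v).lower() == "true",
        "string": str,
    }
    props = schema.get("properties", {})
    out = {}
    for key, value in raw_args.items():
        typ = props.get(key, {}).get("type", "string")
        out[key] = casts.get(typ, str)(value)
    return out

# Hypothetical tool schema: a string city and an integer day count.
schema = {"properties": {"city": {"type": "string"}, "days": {"type": "integer"}}}
args = coerce_args({"city": "Shanghai", "days": "3"}, schema)  # "3" becomes int 3
```

Without the schema lookup, `days` would silently remain the string "3" and fail inside any tool implementation that expects an integer.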