
GPT-4o mini

OpenAI · openai family · Official Docs

GPT-4o mini is OpenAI's cost-optimized model, ideal for high-volume classification, extraction, and summarization tasks where GPT-4o's full capabilities are unnecessary. At $0.15/1M input tokens it is the cheapest capable model in OpenAI's lineup. Key differentiation from Claude Haiku: GPT-4o mini has a smaller context window (128K vs Haiku's 200K) and lower reasoning quality. From Llama: GPT-4o mini offers structured output guarantees that open-source Llama models cannot match without additional tooling. For Refrase, the GPT-4o-mini adapter should use the same markdown-structured prompts as GPT-4o but with more explicit instructions and heavier use of few-shot examples. Structured outputs in strict mode are critical for this model to prevent format drift. Its successor, GPT-4.1 mini, is significantly better on all dimensions but 2.7x more expensive on input; the extra cost buys an 8x larger context window and better output quality. Note: GPT-4o mini does NOT support reasoning/thinking mode -- for tasks requiring internal reasoning, use o4-mini instead.

#10
Rank
85
Quality Score
400ms
Avg Response
+6%
Adaptation Gain

Specifications

128K
Context Window
16K
Max Output
$0.15 / $0.60
Per 1M tokens (in/out)

Key Capabilities

  • Cost-efficient small model at $0.15/1M input tokens -- more than 60% cheaper than GPT-3.5 Turbo while exceeding its quality (source: OpenAI GPT-4o mini Announcement, Pricing)
  • Supports text and vision inputs with text outputs; multimodal reasoning on images (source: OpenAI GPT-4o mini Announcement, Capabilities)
  • Structured Outputs with strict JSON schema enforcement, same as GPT-4o (source: OpenAI Structured Outputs Guide, Supported Models -- gpt-4o-mini-2024-07-18)
  • 128K context window matching GPT-4o for long document processing (source: OpenAI Models Page, GPT-4o mini)
  • MMLU score of 82%; also ranks above GPT-4 on chat preference evaluations (source: OpenAI GPT-4o mini Announcement; llm-stats.com, Benchmark Scores)
  • Function/tool calling support with the same strict:true schema enforcement (source: OpenAI Structured Outputs Guide, Function Calling)
  • Logprobs and top_logprobs support for confidence scoring and token analysis (source: OpenRouter GPT-4o-mini Page, Features)
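The strict structured-output support called out above can be sketched as a request payload. This is a minimal illustration of the `response_format: json_schema` shape from OpenAI's Structured Outputs guide; the schema name and fields are invented for the example.

```python
# Illustrative Structured Outputs payload for gpt-4o-mini (strict mode).
# The "document_extraction" schema and its fields are invented for this sketch.
schema = {
    "name": "document_extraction",
    "strict": True,  # strict mode: output is constrained to exactly this schema
    "schema": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "key_points": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["title", "key_points"],
        "additionalProperties": False,  # strict mode requires this to be False
    },
}

payload = {
    "model": "gpt-4o-mini",
    "messages": [
        {"role": "system", "content": "Extract the key information from the document."},
        {"role": "user", "content": "..."},
    ],
    "response_format": {"type": "json_schema", "json_schema": schema},
}
```

With `strict: True`, the model's output is guaranteed to parse against the schema, which is the "format drift" safeguard the overview recommends for this model.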

Known Limitations

  • Knowledge cutoff of October 2023 -- 8 months behind GPT-4o's June 2024 cutoff (source: llm-stats.com GPT-4o mini page, Knowledge Cutoff; OpenAI Models Page)
  • SWE-bench Verified score of only 8.7%, drastically lower than GPT-4o's 33.2% -- not suitable for complex autonomous coding tasks (source: llm-stats.com GPT-4o mini, Benchmark Scores)
  • Superseded by GPT-4.1 mini which is 2.7x more expensive on input ($0.40 vs $0.15) but offers 1M context, 32K output, and significantly better benchmarks across the board (source: OpenAI GPT-4.1 Announcement; llm-stats.com GPT-4.1 mini)
  • Does not support reasoning mode -- no internal chain-of-thought like o1/o3/o4-mini models (source: OpenRouter GPT-4o-mini Page, Features)
  • Lower quality on complex reasoning and math tasks than full GPT-4o: a MATH score of 70.2% trails GPT-4o (source: llm-stats.com GPT-4o mini, Benchmark Scores)

Prompt Patterns

Preferred Instruction Format

Uses the same role-based chat completion format as GPT-4o, with 'system' role messages. The system message is embedded as the first message in the messages array. All GPT-4.1 prompting guide practices apply to GPT-4o mini as well, though the model is less capable at complex instruction following. (source: OpenAI GPT-4.1 Prompting Guide; OpenAI Models Page)
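The format above amounts to a messages array with the system instructions first. A minimal sketch (message content is illustrative, not from the Refrase docs):

```python
# Role-based chat format: system message leads, user content follows.
messages = [
    {
        "role": "system",
        "content": "# Role\nYou are a summarization assistant.\n\n"
                   "# Output Format\nReturn exactly one paragraph.",
    },
    {"role": "user", "content": "Summarize: ..."},
]
```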

Recommended Practices

  • Use structured system prompts with clear sections: Role, Instructions, Output Format, Examples -- same patterns as GPT-4o (source: OpenAI GPT-4.1 Prompting Guide, System Message Structure)
  • Leverage few-shot examples more heavily than with GPT-4o, as the smaller model benefits more from demonstrated patterns (source: OpenAI Help Center, Prompt Engineering Best Practices, Few-Shot Learning)
  • Use markdown headers and delimiters to separate prompt sections clearly -- helps the smaller model parse structure (source: OpenAI GPT-4.1 Prompting Guide, Delimiter Conventions)
  • Keep prompts more explicit and less ambiguous than you would for GPT-4o -- the mini model infers intent less reliably (source: OpenAI GPT-4.1 Prompting Guide, Instruction Hierarchy -- applies proportionally to smaller models)
  • Use structured outputs (strict JSON schema) to guarantee output format compliance -- especially important for smaller models prone to format drift (source: OpenAI Structured Outputs Guide, Introduction)
  • Optimize for caching by placing static system instructions and examples before variable user content (source: OpenAI Prompt Engineering Guide, Caching Strategy)
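The few-shot and caching practices above combine naturally: keep the static system prompt and demonstration pairs as a fixed prefix, and append only the variable user content last. A sketch with invented example content:

```python
# Cache-friendly ordering: static instructions + few-shot examples first,
# variable user content last. All content below is illustrative.
STATIC_SYSTEM = (
    "# Role\nYou are a support-ticket classifier.\n\n"
    "# Instructions\nLabel each ticket as 'bug', 'feature', or 'question'.\n\n"
    "# Output Format\nRespond with the label only."
)

FEW_SHOT = [
    {"role": "user", "content": "The app crashes when I upload a PNG."},
    {"role": "assistant", "content": "bug"},
    {"role": "user", "content": "Could you add dark mode?"},
    {"role": "assistant", "content": "feature"},
]

def build_messages(ticket: str) -> list:
    # Static prefix first so repeated calls share a cacheable prompt prefix;
    # only the final user message varies between requests.
    return [{"role": "system", "content": STATIC_SYSTEM}, *FEW_SHOT,
            {"role": "user", "content": ticket}]

msgs = build_messages("How do I reset my password?")
```

Because the first five messages are byte-identical across calls, repeated requests can hit the provider's prompt cache.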

Anti-Patterns to Avoid

  • Do NOT rely on GPT-4o mini for complex multi-step reasoning without explicit step-by-step decomposition in the prompt (source: OpenAI Help Center, Prompt Engineering Best Practices)
  • Do NOT use for autonomous agentic workflows requiring complex tool orchestration -- SWE-bench score of 8.7% indicates poor agentic capability (source: llm-stats.com GPT-4o mini Benchmarks)
  • Same anti-patterns as GPT-4o apply: avoid JSON context wrapping, avoid manual tool schema injection, avoid sample phrase repetition without variation instruction (source: OpenAI GPT-4.1 Prompting Guide, Common Anti-Patterns)
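The first anti-pattern suggests spelling out the reasoning steps yourself rather than trusting the model to decompose a task. One hedged sketch of what such a decomposed prompt might look like (the step wording is invented for illustration):

```python
# Explicit step-by-step decomposition in the prompt itself, rather than
# relying on gpt-4o-mini to plan multi-step reasoning on its own.
decomposed_prompt = (
    "Complete this task in explicit steps:\n"
    "1. List every entity mentioned in the document.\n"
    "2. For each entity, note its relationships to the others.\n"
    "3. Only after steps 1-2, write the final summary.\n"
    "Show your work for steps 1 and 2 before the summary."
)
```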

What Refrase Does

Here is exactly how Refrase optimizes prompts for GPT-4o mini, rule by rule:

Grounding rules

Refrase injects grounding constraints that tell the model to only use information from the provided context, reducing hallucination and fabricated details.

JSON reinforcement

Refrase adds explicit JSON schema hints and formatting rules so the model produces valid, parseable JSON output without extra markdown or commentary.
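The two rules above can be sketched as a simple prompt transformation. This is a hypothetical illustration, not Refrase's actual implementation; the `adapt_for_mini` function and constant names are invented.

```python
# Hypothetical sketch of the grounding + JSON-reinforcement rules described
# above. Refrase's real adapter is not public; names here are invented.
GROUNDING = (
    "Only use information explicitly present in the provided document. "
    "Do not infer or fabricate details."
)
JSON_RULES = (
    "Return valid JSON only. No markdown fences, "
    "no commentary outside the JSON object."
)

def adapt_for_mini(prompt: str) -> str:
    # Append both constraint blocks after the original instruction.
    return f"{prompt}\n{GROUNDING}\n{JSON_RULES}"

adapted = adapt_for_mini("Extract the key information from this document.")
```

Applied to the "Original" prompt in the Before/After section below, this produces output in the same spirit as the adapted version shown there.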

Before / After

See how Refrase transforms a generic prompt for GPT-4o mini.

Original

Extract the key information from this document. Be accurate.

Adapted for GPT-4o mini

Extract the key information from this document.
Only use information explicitly present in the provided document. Do not infer or fabricate details.
Return valid JSON only. No markdown fences, no commentary outside the JSON object.
