Claude Haiku 4.5
Anthropic · Claude family · Official Docs
Claude Haiku 4.5 is the cost-efficiency champion of the Claude family — at $1/$5 per MTok, it delivers surprisingly frontier-adjacent quality for high-volume workloads. The model's biggest gotcha is verbosity: it generates roughly 2x the tokens of comparable models on equivalent tasks, which can erode cost savings if left unchecked. Refrase's adaptation layer addresses this by injecting conciseness directives and output-format constraints that keep responses focused. The XML structuring gains that benefit Opus and Sonnet apply equally to Haiku, with our eval showing a ~10% improvement from prompt adaptation. Haiku 4.5 is the first in its class to support extended thinking and computer use, making it viable for agentic workflows that were previously Sonnet-only territory. For judge/evaluator roles in multi-model pipelines, Haiku 4.5 offers excellent cost-performance with zero timeouts in our 46-model eval study (replacing DeepSeek V3.2 which had ~17% timeout rate). The 200K context limit is its main constraint — for document-heavy extraction exceeding that window, Sonnet 4.6 at 3x the price is the next step up.
Specifications
Key Capabilities
- ✓ Fastest model with near-frontier intelligence — achieves roughly 90% of Sonnet 4.5's capabilities at a fraction of the cost (source: Anthropic docs, Models Overview; DataCamp, Claude Haiku 4.5)
- ✓ First Haiku model to include extended thinking, computer use, and context awareness (source: Anthropic, Introducing Claude Haiku 4.5)
- ✓ 73.3% on SWE-bench Verified — highest after Sonnet 4.5, surpassing even Sonnet 4 (source: Anthropic, Introducing Claude Haiku 4.5)
- ✓ 50.7% success rate on computer use benchmarks, outperforming Sonnet 4's 42.2% (source: Anthropic, Introducing Claude Haiku 4.5)
- ✓ Structured outputs with guaranteed JSON schema compliance via json_schema format and strict tool use (source: Anthropic docs, Structured Outputs)
- ✓ 200K context window with 64K max output tokens (source: Anthropic docs, Models Overview)
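To make the structured-output capability concrete, here is a minimal sketch of forcing schema-compliant JSON via strict tool use. The tool name (`record_invoice`), its fields, and the model id string are illustrative assumptions, not taken from the docs; only the general shape (a `tools` entry with an `input_schema`, plus `tool_choice` forcing that tool) follows Anthropic's tool-use pattern.

```python
# Hypothetical tool whose input_schema constrains the model's reply
# to a fixed JSON shape. Name and fields are illustrative.
invoice_tool = {
    "name": "record_invoice",
    "description": "Record one extracted invoice.",
    "input_schema": {
        "type": "object",
        "properties": {
            "invoice_id": {"type": "string"},
            "total_usd": {"type": "number"},
        },
        "required": ["invoice_id", "total_usd"],
    },
}

# Sketch of the request body: tool_choice forces the model to answer
# through the tool, so the output must validate against the schema.
request_body = {
    "model": "claude-haiku-4-5",  # assumed model id
    "max_tokens": 1024,
    "tools": [invoice_tool],
    "tool_choice": {"type": "tool", "name": "record_invoice"},
    "messages": [{"role": "user", "content": "Extract the invoice from: ..."}],
}
```

The forced `tool_choice` is what turns a best-effort JSON request into a guaranteed one: the model cannot reply in free text.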
Known Limitations
- ⚠ Does not support adaptive thinking — only manual extended thinking with budget_tokens (source: Anthropic docs, Models Overview)
- ⚠ 200K context window only — no 1M token option available unlike Sonnet 4.6 and Opus 4.6 (source: Anthropic docs, Models Overview)
- ⚠ Notably verbose output — generates approximately 2x the tokens compared to average models on equivalent benchmarks, which increases effective output costs (source: DataCamp, Claude Haiku 4.5; Artificial Analysis)
- ⚠ Small but meaningful gaps compared to frontier models on multi-hop reasoning and highly nuanced analysis tasks (source: DataCamp, Claude Haiku 4.5)
- ⚠ Middle-of-pack on LiveCodeBench and academic reasoning benchmarks despite strong agentic coding performance (source: DataCamp, Claude Haiku 4.5)
Prompt Patterns
Preferred Instruction Format
Use XML tags (<instructions>, <context>, <output_format>, <examples>) to structure prompts. Set the system prompt via the 'system' API parameter; role-setting in the system prompt focuses behavior and tone.
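A minimal sketch of assembling such an XML-structured prompt. The helper name and the sample task text are illustrative; only the tag names come from the format described above.

```python
# Build a prompt using the XML tag structure Haiku 4.5 responds well to.
def build_prompt(instructions: str, context: str, output_format: str) -> str:
    return (
        f"<instructions>\n{instructions}\n</instructions>\n"
        f"<context>\n{context}\n</context>\n"
        f"<output_format>\n{output_format}\n</output_format>"
    )

prompt = build_prompt(
    "Summarize the report in three bullet points.",
    "Q3 revenue rose 12% while churn fell to 2.1%.",
    "A markdown list, nothing else.",
)
```

Keeping each concern in its own tag makes it easy to swap the context block per request while the instructions stay fixed.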
Recommended Practices
- Use XML tags to structure complex prompts — same best practices as Sonnet and Opus apply (source: Anthropic docs, Prompting Best Practices, Structure Prompts with XML Tags)
- Use manual extended thinking with budget_tokens for complex tasks that benefit from step-by-step reasoning; minimum budget is 1,024 tokens (source: Anthropic docs, Extended Thinking)
- Provide 3-5 diverse examples in <example> tags for few-shot prompting (source: Anthropic docs, Prompting Best Practices, Use Examples Effectively)
- Add explicit conciseness instructions to counteract verbosity — e.g., 'Respond concisely. Avoid unnecessary elaboration.' (source: Refrase eval, verbosity analysis across Haiku 4.5 outputs)
- Place longform data at the top of prompts, above queries and instructions (source: Anthropic docs, Prompting Best Practices, Long Context Prompting)
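The practices above can be combined in one request body: manual extended thinking with an explicit budget_tokens (at least 1,024) and a conciseness directive in the system prompt. This is a sketch, not a verbatim API call — the model id and the specific budget value are assumptions.

```python
# Manual extended thinking budget; the docs state a 1,024-token minimum.
THINKING_BUDGET = 4096

request_body = {
    "model": "claude-haiku-4-5",  # assumed model id
    "max_tokens": 8192,           # leaves room beyond the thinking budget
    "system": "Respond concisely. Avoid unnecessary elaboration.",
    "thinking": {"type": "enabled", "budget_tokens": THINKING_BUDGET},
    "messages": [
        {"role": "user", "content": "Plan the migration step by step."}
    ],
}
```

The conciseness directive in `system` directly counters Haiku 4.5's roughly 2x verbosity, which otherwise compounds with the thinking tokens you are already paying for.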
Anti-Patterns to Avoid
- Do not use adaptive thinking — Haiku 4.5 does not support it; use manual thinking with budget_tokens instead (source: Anthropic docs, Models Overview)
- Avoid exceeding 200K input tokens — Haiku has no 1M context option and will reject oversized requests (source: Anthropic docs, Models Overview)
- Avoid relying on Haiku for multi-hop reasoning or highly nuanced analysis where frontier accuracy is required (source: DataCamp, Claude Haiku 4.5)
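Since oversized requests are rejected outright, a cheap pre-flight guard can catch them before the API call. The ~4 characters-per-token heuristic below is a rough assumption, not a real tokenizer — use it only as an early warning.

```python
# Rough guard against Haiku's 200K input-token limit.
CONTEXT_LIMIT = 200_000

def fits_in_context(prompt: str, chars_per_token: float = 4.0) -> bool:
    # Crude estimate: assumes ~4 characters per token on average.
    estimated_tokens = len(prompt) / chars_per_token
    return estimated_tokens <= CONTEXT_LIMIT
```

A prompt of a million characters estimates to ~250K tokens and would fail this check; anything that close to the limit is a signal to move the workload to a 1M-context model instead.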
Sources
- https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/claude-prompting-best-practices
- https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/use-xml-tags
- https://platform.claude.com/docs/en/build-with-claude/extended-thinking
- https://platform.claude.com/docs/en/about-claude/models/overview
- https://platform.claude.com/docs/en/about-claude/pricing
- https://www.anthropic.com/news/claude-haiku-4-5