
Claude Haiku 4.5

Anthropic · claude family · Official docs

Claude Haiku 4.5 is the cost-efficiency champion of the Claude family — at $1/$5 per MTok, it delivers surprisingly frontier-adjacent quality for high-volume workloads. The model's biggest gotcha is verbosity: it generates roughly 2x the tokens of comparable models on equivalent tasks, which can erode cost savings if left unchecked. Refrase's adaptation layer addresses this by injecting conciseness directives and output-format constraints that keep responses focused. The XML structuring gains that benefit Opus and Sonnet apply equally to Haiku, with our eval showing a ~10% improvement from prompt adaptation. Haiku 4.5 is the first in its class to support extended thinking and computer use, making it viable for agentic workflows that were previously Sonnet-only territory. For judge/evaluator roles in multi-model pipelines, Haiku 4.5 offers excellent cost-performance with zero timeouts in our 46-model eval study (replacing DeepSeek V3.2 which had ~17% timeout rate). The 200K context limit is its main constraint — for document-heavy extraction exceeding that window, Sonnet 4.6 at 3x the price is the next step up.
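The verbosity effect on cost can be made concrete with a small sketch. Token counts below are illustrative, not measured; only the $1/$5 per MTok rates come from the pricing above:

```python
# Sketch: per-request cost at Haiku 4.5 list prices ($1 in / $5 out per MTok).
def request_cost_usd(input_tokens, output_tokens, in_rate=1.0, out_rate=5.0):
    """Cost of one request in USD at per-million-token rates."""
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# Same task, but the model answers with ~2x the output tokens:
baseline = request_cost_usd(10_000, 1_000)   # $0.015
verbose = request_cost_usd(10_000, 2_000)    # $0.020 -- output cost doubles
```

Because output is billed at 5x the input rate, doubled verbosity hits the expensive side of the meter, which is why conciseness directives matter more here than on cheaper-output models.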


Specifications

Context window: 200K
Max output: 64K
Pricing: $1 / $5 per 1M tokens (in/out)
Batch API: $0.50/$2.50 per MTok (50% discount). Prompt caching: 5-min write 1.25x, 1-hour write 2x, cache hit 0.1x base input. Extended thinking tokens billed as output at $5/MTok. Does not support adaptive thinking. (source: Anthropic docs, Pricing; Models Overview)
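The caching multipliers above can be worked through with a short sketch. The prefix size and read count are illustrative assumptions; the 1.25x write and 0.1x hit multipliers are the documented 5-minute-cache figures:

```python
# Sketch of the cache-pricing arithmetic: 5-min cache write at 1.25x and
# cache hits at 0.1x of the $1/MTok base input rate.
BASE_INPUT_RATE = 1.0  # USD per MTok

def cached_prefix_cost(prefix_tokens, reads, write_mult=1.25, hit_mult=0.1):
    """Input cost for a prefix written to the 5-minute cache once, then read on hits."""
    mtok = prefix_tokens / 1e6
    return mtok * BASE_INPUT_RATE * write_mult + (reads - 1) * mtok * BASE_INPUT_RATE * hit_mult

def plain_prefix_cost(prefix_tokens, reads):
    """Input cost for re-sending the same prefix uncached on every request."""
    return reads * prefix_tokens / 1e6 * BASE_INPUT_RATE

# A 50K-token prefix reused across 10 requests:
# plain: 10 * $0.05 = $0.50; cached: $0.0625 + 9 * $0.005 = $0.1075
```

At ten reuses the cached prefix costs roughly a fifth of resending it each time, so long shared system prompts are a natural fit for caching on high-volume Haiku workloads.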

Strengths

extraction · analysis · generation · code

Key capabilities

  • Fastest model with near-frontier intelligence — achieves roughly 90% of Sonnet 4.5's capabilities at a fraction of the cost (source: Anthropic docs, Models Overview; DataCamp, Claude Haiku 4.5)
  • First Haiku model to include extended thinking, computer use, and context awareness (source: Anthropic, Introducing Claude Haiku 4.5)
  • 73.3% on SWE-bench Verified — highest after Sonnet 4.5, surpassing even Sonnet 4 (source: Anthropic, Introducing Claude Haiku 4.5)
  • 50.7% success rate on computer use benchmarks, outperforming Sonnet 4's 42.2% (source: Anthropic, Introducing Claude Haiku 4.5)
  • Structured outputs with guaranteed JSON schema compliance via json_schema format and strict tool use (source: Anthropic docs, Structured Outputs)
  • 200K context window with 64K max output tokens (source: Anthropic docs, Models Overview)
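A minimal sketch of the strict-tool-use route to guaranteed schemas follows. The tool name, schema fields, and model id are hypothetical, and the exact `strict` flag placement should be checked against the Structured Outputs docs; the code only assembles a request payload and makes no API call:

```python
# Sketch: a strict tool definition for schema-guaranteed extraction.
# Tool name, schema fields, "strict" flag, and model id are assumptions --
# verify against the Anthropic Structured Outputs documentation.
extract_invoice = {
    "name": "record_invoice",            # hypothetical tool name
    "description": "Record one extracted invoice.",
    "strict": True,
    "input_schema": {
        "type": "object",
        "properties": {
            "vendor": {"type": "string"},
            "total_usd": {"type": "number"},
        },
        "required": ["vendor", "total_usd"],
        "additionalProperties": False,
    },
}

request = {
    "model": "claude-haiku-4-5",         # assumed model id
    "max_tokens": 1024,
    "tools": [extract_invoice],
    "tool_choice": {"type": "tool", "name": "record_invoice"},  # force tool output
    "messages": [{"role": "user", "content": "Invoice: ACME Corp, $1,200.00"}],
}
# Pass `request` to client.messages.create(**request) with the official SDK.
```

Forcing `tool_choice` to a single strict tool is what turns "usually valid JSON" into "schema-compliant JSON", which is the property judge/evaluator pipelines depend on.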

Known limitations

  • Does not support adaptive thinking — only manual extended thinking with budget_tokens (source: Anthropic docs, Models Overview)
  • 200K context window only — no 1M token option available unlike Sonnet 4.6 and Opus 4.6 (source: Anthropic docs, Models Overview)
  • Notably verbose output — generates approximately 2x as many tokens as comparable models on equivalent benchmarks, which raises effective output costs (source: DataCamp, Claude Haiku 4.5; Artificial Analysis)
  • Small but meaningful gaps compared to frontier models on multi-hop reasoning and highly nuanced analysis tasks (source: DataCamp, Claude Haiku 4.5)
  • Middle-of-pack on LiveCodeBench and academic reasoning benchmarks despite strong agentic coding performance (source: DataCamp, Claude Haiku 4.5)

How to prompt Claude Haiku 4.5

Preferred instruction format

XML tags (<instructions>, <context>, <output_format>, <examples>) for structured prompts. Set the system prompt via the 'system' API parameter; assigning a role there focuses the model's behavior and tone.
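The format above can be sketched as plain string assembly. The role, tag contents, and `<ticket>` tag are illustrative choices, not prescribed names; only the `<instructions>`/`<context>`/`<output_format>` tags come from the format description:

```python
# Minimal sketch: assembling an XML-structured prompt (illustrative content).
system = "You are a billing-support analyst. Respond concisely."

user_prompt = (
    "<instructions>Classify the ticket and extract the invoice id.</instructions>\n"
    "<context>Customer tickets arrive as free-form emails.</context>\n"
    "<output_format>One line: CATEGORY | INVOICE_ID</output_format>\n"
    "<ticket>My invoice INV-8841 was charged twice.</ticket>"
)
# Send with: client.messages.create(model=..., system=system,
#     messages=[{"role": "user", "content": user_prompt}], ...)
```

The tags carry no special API meaning; they simply give the model unambiguous boundaries between instructions, background, and data.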

Recommended practices

  • Use XML tags to structure complex prompts — same best practices as Sonnet and Opus apply (source: Anthropic docs, Prompting Best Practices, Structure Prompts with XML Tags)
  • Use manual extended thinking with budget_tokens for complex tasks that benefit from step-by-step reasoning; minimum budget is 1,024 tokens (source: Anthropic docs, Extended Thinking)
  • Provide 3-5 diverse examples in <example> tags for few-shot prompting (source: Anthropic docs, Prompting Best Practices, Use Examples Effectively)
  • Add explicit conciseness instructions to counteract verbosity — e.g., 'Respond concisely. Avoid unnecessary elaboration.' (source: Refrase eval, verbosity analysis across Haiku 4.5 outputs)
  • Place longform data at the top of prompts, above queries and instructions (source: Anthropic docs, Prompting Best Practices, Long Context Prompting)
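The extended-thinking practice above translates into two request parameters. The model id and token figures are assumptions for illustration; the 1,024-token minimum budget is the documented floor, and `max_tokens` must leave room above the thinking budget for the visible answer:

```python
# Sketch: manual extended thinking via budget_tokens (model id is an assumption).
MIN_THINKING_BUDGET = 1_024  # documented minimum thinking budget

request = {
    "model": "claude-haiku-4-5",
    "max_tokens": 8_000,  # must exceed the thinking budget
    "thinking": {"type": "enabled", "budget_tokens": 2_048},
    "messages": [{"role": "user", "content": "Plan the migration step by step."}],
}

# Sanity checks a caller might run before sending:
assert request["thinking"]["budget_tokens"] >= MIN_THINKING_BUDGET
assert request["max_tokens"] > request["thinking"]["budget_tokens"]
```

Since Haiku 4.5 has no adaptive mode, the budget is the only dial: raise it for genuinely multi-step tasks, keep it near the floor otherwise, and remember thinking tokens bill as output at $5/MTok.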

Anti-patterns to avoid

  • Do not use adaptive thinking — Haiku 4.5 does not support it; use manual thinking with budget_tokens instead (source: Anthropic docs, Models Overview)
  • Avoid exceeding 200K input tokens — Haiku has no 1M context option and will reject oversized requests (source: Anthropic docs, Models Overview)
  • Avoid relying on Haiku for multi-hop reasoning or highly nuanced analysis where frontier accuracy is required (source: DataCamp, Claude Haiku 4.5)
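The 200K-input guard above can be sketched as a pre-flight check. The ~4 characters-per-token ratio is a crude heuristic, not the real tokenizer; for an exact count the official SDK exposes a token-counting endpoint (`client.messages.count_tokens`), which this sketch deliberately avoids so it runs offline:

```python
# Sketch: offline pre-flight check before sending a large document.
# len(text) // 4 is a rough chars-per-token heuristic, not the real tokenizer.
CONTEXT_LIMIT = 200_000  # Haiku 4.5 context window, in tokens

def rough_token_estimate(text: str) -> int:
    """Crude token estimate; use the SDK's count_tokens endpoint for accuracy."""
    return max(1, len(text) // 4)

def fits_context(text: str, limit: int = CONTEXT_LIMIT) -> bool:
    """True if the text plausibly fits Haiku's context window."""
    return rough_token_estimate(text) <= limit

assert fits_context("short prompt")
assert not fits_context("x" * 1_000_000)  # ~250K estimated tokens: route to Sonnet instead
```

Rejecting (or rerouting to Sonnet 4.6) before the API call is cheaper than letting an oversized request fail server-side.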

