
Claude Haiku 4.5

Anthropic · claude family · Official docs

Claude Haiku 4.5 is the cost-efficiency champion of the Claude family — at $1/$5 per MTok, it delivers surprisingly frontier-adjacent quality for high-volume workloads. The model's biggest gotcha is verbosity: it generates roughly 2x the tokens of comparable models on equivalent tasks, which can erode cost savings if left unchecked. Refrase's adaptation layer addresses this by injecting conciseness directives and output-format constraints that keep responses focused. The XML structuring gains that benefit Opus and Sonnet apply equally to Haiku, with our eval showing a ~10% improvement from prompt adaptation. Haiku 4.5 is the first in its class to support extended thinking and computer use, making it viable for agentic workflows that were previously Sonnet-only territory. For judge/evaluator roles in multi-model pipelines, Haiku 4.5 offers excellent cost-performance with zero timeouts in our 46-model eval study (replacing DeepSeek V3.2 which had ~17% timeout rate). The 200K context limit is its main constraint — for document-heavy extraction exceeding that window, Sonnet 4.6 at 3x the price is the next step up.
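The verbosity effect on cost can be made concrete with a small sketch. Token counts below are illustrative, not measured; only the $1/$5 per MTok rates come from the pricing above:

```python
# Sketch: per-request cost at Haiku 4.5 list prices ($1 in / $5 out per MTok).
def request_cost_usd(input_tokens, output_tokens, in_rate=1.0, out_rate=5.0):
    """Cost of one request in USD at per-million-token rates."""
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# Same task, but the model answers with ~2x the output tokens:
baseline = request_cost_usd(10_000, 1_000)   # $0.015
verbose = request_cost_usd(10_000, 2_000)    # $0.020 -- output cost doubles
```

Because output is billed at 5x the input rate, doubled verbosity hits the expensive side of the meter, which is why conciseness directives matter more here than on cheaper-output models.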


Specifications

Context window: 200K
Max output: 64K
Pricing: $1 / $5 per 1M tokens (in/out)
Batch API: $0.50/$2.50 per MTok (50% discount). Prompt caching: 5-min write 1.25x, 1-hour write 2x, cache hit 0.1x base input. Extended thinking tokens billed as output at $5/MTok. Does not support adaptive thinking. (source: Anthropic docs, Pricing; Models Overview)
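The caching multipliers above can be worked through with a short sketch. The prefix size and read count are illustrative assumptions; the 1.25x write and 0.1x hit multipliers are the documented 5-minute-cache figures:

```python
# Sketch of the cache-pricing arithmetic: 5-min cache write at 1.25x and
# cache hits at 0.1x of the $1/MTok base input rate.
BASE_INPUT_RATE = 1.0  # USD per MTok

def cached_prefix_cost(prefix_tokens, reads, write_mult=1.25, hit_mult=0.1):
    """Input cost for a prefix written to the 5-minute cache once, then read on hits."""
    mtok = prefix_tokens / 1e6
    return mtok * BASE_INPUT_RATE * write_mult + (reads - 1) * mtok * BASE_INPUT_RATE * hit_mult

def plain_prefix_cost(prefix_tokens, reads):
    """Input cost for re-sending the same prefix uncached on every request."""
    return reads * prefix_tokens / 1e6 * BASE_INPUT_RATE

# A 50K-token prefix reused across 10 requests:
# plain: 10 * $0.05 = $0.50; cached: $0.0625 + 9 * $0.005 = $0.1075
```

At ten reuses the cached prefix costs roughly a fifth of resending it each time, so long shared system prompts are a natural fit for caching on high-volume Haiku workloads.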

Strengths

extraction · analysis · generation · code

Key capabilities

  • Fastest model with near-frontier intelligence — achieves roughly 90% of Sonnet 4.5's capabilities at a fraction of the cost (source: Anthropic docs, Models Overview; DataCamp, Claude Haiku 4.5)
  • First Haiku model to include extended thinking, computer use, and context awareness (source: Anthropic, Introducing Claude Haiku 4.5)
  • 73.3% on SWE-bench Verified — highest after Sonnet 4.5, surpassing even Sonnet 4 (source: Anthropic, Introducing Claude Haiku 4.5)
  • 50.7% success rate on computer use benchmarks, outperforming Sonnet 4's 42.2% (source: Anthropic, Introducing Claude Haiku 4.5)
  • Structured outputs with guaranteed JSON schema compliance via json_schema format and strict tool use (source: Anthropic docs, Structured Outputs)
  • 200K context window with 64K max output tokens (source: Anthropic docs, Models Overview)
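A minimal sketch of the strict-tool-use route to guaranteed schemas follows. The tool name, schema fields, and model id are hypothetical, and the exact `strict` flag placement should be checked against the Structured Outputs docs; the code only assembles a request payload and makes no API call:

```python
# Sketch: a strict tool definition for schema-guaranteed extraction.
# Tool name, schema fields, "strict" flag, and model id are assumptions --
# verify against the Anthropic Structured Outputs documentation.
extract_invoice = {
    "name": "record_invoice",            # hypothetical tool name
    "description": "Record one extracted invoice.",
    "strict": True,
    "input_schema": {
        "type": "object",
        "properties": {
            "vendor": {"type": "string"},
            "total_usd": {"type": "number"},
        },
        "required": ["vendor", "total_usd"],
        "additionalProperties": False,
    },
}

request = {
    "model": "claude-haiku-4-5",         # assumed model id
    "max_tokens": 1024,
    "tools": [extract_invoice],
    "tool_choice": {"type": "tool", "name": "record_invoice"},  # force tool output
    "messages": [{"role": "user", "content": "Invoice: ACME Corp, $1,200.00"}],
}
# Pass `request` to client.messages.create(**request) with the official SDK.
```

Forcing `tool_choice` to a single strict tool is what turns "usually valid JSON" into "schema-compliant JSON", which is the property judge/evaluator pipelines depend on.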

Known limitations

  • Does not support adaptive thinking — only manual extended thinking with budget_tokens (source: Anthropic docs, Models Overview)
  • 200K context window only — no 1M token option available unlike Sonnet 4.6 and Opus 4.6 (source: Anthropic docs, Models Overview)
  • Notably verbose output — generates approximately 2x as many tokens as comparable models on equivalent benchmarks, which raises effective output costs (source: DataCamp, Claude Haiku 4.5; Artificial Analysis)
  • Small but meaningful gaps compared to frontier models on multi-hop reasoning and highly nuanced analysis tasks (source: DataCamp, Claude Haiku 4.5)
  • Middle-of-pack on LiveCodeBench and academic reasoning benchmarks despite strong agentic coding performance (source: DataCamp, Claude Haiku 4.5)

How to prompt Claude Haiku 4.5

Preferred instruction format

XML tags (<instructions>, <context>, <output_format>, <examples>) for structured prompts. Set the system prompt via the 'system' API parameter; assigning a role there focuses the model's behavior and tone.
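The format above can be sketched as plain string assembly. The role, tag contents, and `<ticket>` tag are illustrative choices, not prescribed names; only the `<instructions>`/`<context>`/`<output_format>` tags come from the format description:

```python
# Minimal sketch: assembling an XML-structured prompt (illustrative content).
system = "You are a billing-support analyst. Respond concisely."

user_prompt = (
    "<instructions>Classify the ticket and extract the invoice id.</instructions>\n"
    "<context>Customer tickets arrive as free-form emails.</context>\n"
    "<output_format>One line: CATEGORY | INVOICE_ID</output_format>\n"
    "<ticket>My invoice INV-8841 was charged twice.</ticket>"
)
# Send with: client.messages.create(model=..., system=system,
#     messages=[{"role": "user", "content": user_prompt}], ...)
```

The tags carry no special API meaning; they simply give the model unambiguous boundaries between instructions, background, and data.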

Recommended practices

  • Use XML tags to structure complex prompts — same best practices as Sonnet and Opus apply (source: Anthropic docs, Prompting Best Practices, Structure Prompts with XML Tags)
  • Use manual extended thinking with budget_tokens for complex tasks that benefit from step-by-step reasoning; minimum budget is 1,024 tokens (source: Anthropic docs, Extended Thinking)
  • Provide 3-5 diverse examples in <example> tags for few-shot prompting (source: Anthropic docs, Prompting Best Practices, Use Examples Effectively)
  • Add explicit conciseness instructions to counteract verbosity — e.g., 'Respond concisely. Avoid unnecessary elaboration.' (source: Refrase eval, verbosity analysis across Haiku 4.5 outputs)
  • Place longform data at the top of prompts, above queries and instructions (source: Anthropic docs, Prompting Best Practices, Long Context Prompting)
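The extended-thinking practice above translates into two request parameters. The model id and token figures are assumptions for illustration; the 1,024-token minimum budget is the documented floor, and `max_tokens` must leave room above the thinking budget for the visible answer:

```python
# Sketch: manual extended thinking via budget_tokens (model id is an assumption).
MIN_THINKING_BUDGET = 1_024  # documented minimum thinking budget

request = {
    "model": "claude-haiku-4-5",
    "max_tokens": 8_000,  # must exceed the thinking budget
    "thinking": {"type": "enabled", "budget_tokens": 2_048},
    "messages": [{"role": "user", "content": "Plan the migration step by step."}],
}

# Sanity checks a caller might run before sending:
assert request["thinking"]["budget_tokens"] >= MIN_THINKING_BUDGET
assert request["max_tokens"] > request["thinking"]["budget_tokens"]
```

Since Haiku 4.5 has no adaptive mode, the budget is the only dial: raise it for genuinely multi-step tasks, keep it near the floor otherwise, and remember thinking tokens bill as output at $5/MTok.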

Anti-patterns to avoid

  • Do not use adaptive thinking — Haiku 4.5 does not support it; use manual thinking with budget_tokens instead (source: Anthropic docs, Models Overview)
  • Avoid exceeding 200K input tokens — Haiku has no 1M context option and will reject oversized requests (source: Anthropic docs, Models Overview)
  • Avoid relying on Haiku for multi-hop reasoning or highly nuanced analysis where frontier accuracy is required (source: DataCamp, Claude Haiku 4.5)
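The 200K-input guard above can be sketched as a pre-flight check. The ~4 characters-per-token ratio is a crude heuristic, not the real tokenizer; for an exact count the official SDK exposes a token-counting endpoint (`client.messages.count_tokens`), which this sketch deliberately avoids so it runs offline:

```python
# Sketch: offline pre-flight check before sending a large document.
# len(text) // 4 is a rough chars-per-token heuristic, not the real tokenizer.
CONTEXT_LIMIT = 200_000  # Haiku 4.5 context window, in tokens

def rough_token_estimate(text: str) -> int:
    """Crude token estimate; use the SDK's count_tokens endpoint for accuracy."""
    return max(1, len(text) // 4)

def fits_context(text: str, limit: int = CONTEXT_LIMIT) -> bool:
    """True if the text plausibly fits Haiku's context window."""
    return rough_token_estimate(text) <= limit

assert fits_context("short prompt")
assert not fits_context("x" * 1_000_000)  # ~250K estimated tokens: route to Sonnet instead
```

Rejecting (or rerouting to Sonnet 4.6) before the API call is cheaper than letting an oversized request fail server-side.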

