
Claude Opus 4.6

Anthropic · claude family · Official docs

Claude Opus 4.6 is the most capable model in the Claude family and one of the strongest frontier models available. Its XML instruction-following is its most distinctive prompting behavior: our 46-configuration eval study measured a 12-18% improvement when prompts were restructured with XML tags instead of plain text. The model's adaptive thinking mode is a major differentiator: it dynamically allocates reasoning effort per step, so prompt optimization must treat the thinking budget as a variable rather than a constant. Opus 4.6 is also more responsive to system prompts than any predecessor, so prompts migrated from older models often need their imperative language de-escalated to avoid overtriggering. The 1M context window at standard pricing makes it uniquely suited to document-heavy extraction tasks where competing models charge premiums for long context. For Refrase users, the key adaptations are converting markdown-structured prompts to XML-tagged prompts and tuning the effort parameter appropriately: on simpler tasks, too-high effort inflates thinking tokens without proportional quality gains.

Try Refrase on a Claude Opus 4.6 prompt

Paste any prompt — Refrase rewrites it using Claude Opus 4.6's documentation as context. 4–7 seconds end-to-end.

Specifications

1M
Context window
128K
Max output
$5 / $25
Per 1M tokens (in/out)
Batch API: $2.50 / $12.50 per MTok (50% discount). Prompt caching: 5-minute cache writes 1.25x, 1-hour cache writes 2x, cache hits 0.1x base input. Fast mode (research preview): $30 / $150 per MTok (6x). US-only data residency adds a 1.1x multiplier. (source: Anthropic docs, Pricing)
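As a rough illustration of how these multipliers compose, here is a sketch of a cost estimator. The rates are the ones quoted above; the helper itself is not an official Anthropic utility, and the assumption that the residency multiplier applies after the others is ours.

```python
# Illustrative cost estimator for the pricing multipliers above.
# Not an official Anthropic utility; ordering of multipliers is an assumption.
BASE_INPUT_PER_MTOK = 5.00    # $ per 1M input tokens
BASE_OUTPUT_PER_MTOK = 25.00  # $ per 1M output tokens

def estimate_cost(input_tokens, output_tokens, *,
                  batch=False, cache_hit_tokens=0, fast_mode=False,
                  us_data_residency=False):
    """Rough dollar cost under the stated multipliers."""
    in_rate, out_rate = BASE_INPUT_PER_MTOK, BASE_OUTPUT_PER_MTOK
    if fast_mode:                       # research preview: 6x base
        in_rate, out_rate = in_rate * 6, out_rate * 6
    elif batch:                         # Batch API: 50% discount
        in_rate, out_rate = in_rate * 0.5, out_rate * 0.5
    fresh = max(input_tokens - cache_hit_tokens, 0)
    cost = (fresh * in_rate
            + cache_hit_tokens * in_rate * 0.1   # cache hit: 0.1x input
            + output_tokens * out_rate) / 1_000_000
    if us_data_residency:               # US-only residency: 1.1x
        cost *= 1.1
    return round(cost, 4)
```

For example, 1M fresh input tokens at the base rate costs $5.00, while the same tokens served entirely from cache cost $0.50.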

Strengths

extraction · analysis · generation · code

Key capabilities

  • Most intelligent model for building agents and coding, with exceptional long-horizon reasoning and state tracking across extended sessions (source: Anthropic docs, Models Overview)
  • 1M token context window at standard pricing with 78.3% accuracy on 1M-token 8-needle retrieval test (source: Anthropic docs, Models Overview; Claude Opus 4.6 System Card)
  • 128K max output tokens, the highest of any Claude model (source: Anthropic docs, Models Overview)
  • Adaptive thinking mode where Claude dynamically decides when and how much to think, with interleaved thinking between tool calls (source: Anthropic docs, Extended Thinking)
  • Native subagent orchestration — proactively delegates work to specialized subagents without explicit instruction (source: Anthropic docs, Prompting Best Practices, Subagent Orchestration)
  • Structured outputs with guaranteed JSON schema compliance via json_schema format and strict tool use (source: Anthropic docs, Structured Outputs)
  • Improved vision capabilities for image processing, data extraction, screenshots, and UI element interpretation including computer use (source: Anthropic docs, Prompting Best Practices, Improved Vision Capabilities)
  • 68.8% on ARC-AGI-2 (nearly doubled from Opus 4.5's 37.6%) and 80.84% on SWE-bench Verified (source: Claude Opus 4.6 System Card)
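The structured-outputs capability above can be sketched with a standard JSON Schema. The invoice schema and the minimal conformance check below are hypothetical illustrations; how the schema is attached to a request (the json_schema output format) should be taken from the Structured Outputs docs cited above.

```python
# Hypothetical extraction schema, written in standard JSON Schema.
# How it is passed to the API should be taken from the Structured
# Outputs documentation; only the schema shape is shown here.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_id": {"type": "string"},
        "total": {"type": "number"},
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {"sku": {"type": "string"},
                               "qty": {"type": "integer"}},
                "required": ["sku", "qty"],
            },
        },
    },
    "required": ["invoice_id", "total"],
}

def conforms(payload: dict, schema: dict = INVOICE_SCHEMA) -> bool:
    """Minimal local check of the guarantees we rely on: required
    top-level keys exist and match their declared types."""
    types = {"string": str, "number": (int, float), "integer": int,
             "array": list, "object": dict}
    for key in schema["required"]:
        if key not in payload:
            return False
        expected = types[schema["properties"][key]["type"]]
        if not isinstance(payload[key], expected):
            return False
    return True
```

With schema-guaranteed outputs such a check should never fail, but it is a cheap safety net when responses are produced by other paths.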

Known limitations

  • Prefilled responses on the last assistant turn are no longer supported — must use structured outputs, tool calling, or explicit instructions instead (source: Anthropic docs, Prompting Best Practices, Migrating Away from Prefilled Responses)
  • Tendency to overengineer by creating extra files, adding unnecessary abstractions, or building unrequested flexibility; requires explicit guidance to keep solutions minimal (source: Anthropic docs, Prompting Best Practices, Overeagerness)
  • May take difficult-to-reverse actions (deleting files, force-pushing, posting to external services) without confirmation unless explicitly guided on safety boundaries (source: Anthropic docs, Prompting Best Practices, Balancing Autonomy and Safety)
  • Extended thinking with vision is not compatible (source: Anthropic docs, Extended Thinking, Constraints)
  • Strong tendency to spawn subagents even when a simpler direct approach would suffice; requires explicit guidance to constrain (source: Anthropic docs, Prompting Best Practices, Subagent Orchestration)
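The two mechanical constraints in the list above (no prefilled final assistant turn, no vision input with extended thinking) can be caught before sending a request. The preflight helper below is our own sketch, with a simplified message shape, not part of any SDK.

```python
# Sketch of a preflight check for two of the limitations listed above.
# The message shape is a simplified stand-in for the real API surface.
def preflight(messages: list, thinking_enabled: bool) -> list:
    """Return a list of problems found; empty means the request passes."""
    problems = []
    # Prefilled responses on the last assistant turn are unsupported.
    if messages and messages[-1].get("role") == "assistant":
        problems.append("prefilled final assistant turn is unsupported")
    # Extended thinking is not compatible with vision input.
    if thinking_enabled and any(
        block.get("type") == "image"
        for m in messages
        for block in (m.get("content") if isinstance(m.get("content"), list) else [])
    ):
        problems.append("extended thinking is incompatible with vision input")
    return problems
```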

How to prompt Claude Opus 4.6

Preferred instruction format

XML tags (<instructions>, <context>, <output_format>, <examples>) for structured prompts. Supply the system prompt via the 'system' API parameter; setting a role there focuses the model's behavior and tone.
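The preferred layout above can be sketched as a small prompt builder. The tag names mirror the ones listed; the task content is invented for illustration.

```python
# Sketch of the XML-tagged prompt layout described above.
# Tag names follow the docs; the example task is hypothetical.
def build_prompt(instructions: str, context: str, output_format: str) -> str:
    return (
        f"<instructions>\n{instructions}\n</instructions>\n\n"
        f"<context>\n{context}\n</context>\n\n"
        f"<output_format>\n{output_format}\n</output_format>"
    )

prompt = build_prompt(
    "Summarize the report in three bullet points.",
    "Q3 revenue rose 8% while churn fell to 2.1%.",
    "A markdown list with exactly three items.",
)
```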

Recommended practices

  • Use XML tags to structure complex prompts — wrap instructions, context, examples, and variable inputs in descriptive tags like <instructions>, <context>, <input> to reduce misinterpretation (source: Anthropic docs, Prompting Best Practices, Structure Prompts with XML Tags)
  • Use adaptive thinking with effort parameter (low/medium/high/max) instead of manual budget_tokens; budget_tokens is deprecated on Opus 4.6 (source: Anthropic docs, Prompting Best Practices, Leverage Thinking; Extended Thinking)
  • Provide 3-5 diverse, relevant examples wrapped in <example> tags for few-shot prompting to dramatically improve accuracy and consistency (source: Anthropic docs, Prompting Best Practices, Use Examples Effectively)
  • Place longform data at the top of the prompt, above queries and instructions — queries at the end improve response quality by up to 30% in tests (source: Anthropic docs, Prompting Best Practices, Long Context Prompting)
  • Be explicit about desired behavior rather than relying on inference; tell Claude what to do instead of what not to do (source: Anthropic docs, Prompting Best Practices, Be Clear and Direct)
  • Dial back aggressive tool-triggering language from older prompts — Opus 4.6 is significantly more proactive and may overtrigger on instructions needed for previous models (source: Anthropic docs, Prompting Best Practices, Tool Usage)
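The long-context ordering practice above (longform data first, query last) can be sketched as follows; the tag name and the example query are illustrative, not prescribed by the docs.

```python
# Sketch of long-context ordering: document at the top, query at the
# end, per the practice above. Tag and task content are illustrative.
def long_context_prompt(document: str, query: str) -> str:
    return (
        f"<document>\n{document}\n</document>\n\n"
        f"Using only the document above, answer:\n{query}"
    )

p = long_context_prompt(
    "[contract text would go here]",
    "Which party bears termination costs?",
)
```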

Anti-patterns to avoid

  • Do not use prefilled assistant responses on the last turn — deprecated in Claude 4.6; use structured outputs or tool calling instead (source: Anthropic docs, Prompting Best Practices, Migrating Away from Prefilled Responses)
  • Avoid using markdown headers for structured task instructions when XML tags would be more precise — XML tags help Claude parse prompts unambiguously (source: Anthropic docs, Prompting Best Practices, Structure Prompts with XML Tags; Refrase eval, 46-config study)
  • Do not use tool_choice 'any' or force specific tools when extended thinking is enabled — only 'auto' and 'none' are supported (source: Anthropic docs, Extended Thinking, Tool Use Limitations)
  • Avoid over-prompting with aggressive language like 'CRITICAL: You MUST use this tool' — newer models overtrigger on instructions designed for older models (source: Anthropic docs, Prompting Best Practices, Tool Usage)
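The tool_choice constraint above can be enforced client-side. The guard below uses a simplified stand-in for the request shape; the field names are assumptions drawn from the Extended Thinking docs cited above, not a complete API surface.

```python
# Guard for the tool_choice constraint above: with extended thinking
# enabled, only "auto" and "none" are supported. The request dict is
# a simplified stand-in; field names are assumed from the cited docs.
def check_tool_choice(request: dict) -> None:
    thinking_on = request.get("thinking", {}).get("type") == "enabled"
    choice = request.get("tool_choice", {}).get("type", "auto")
    if thinking_on and choice not in ("auto", "none"):
        raise ValueError(
            f"tool_choice '{choice}' is incompatible with extended thinking"
        )
```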


Skip the manual application.

Refrase reads everything above and applies it for you. Try it on one of your own prompts.