
Claude Sonnet 4.6

Anthropic · claude family · Official docs

Claude Sonnet 4.6 is the workhorse of the Claude family: it hits a sweet spot of intelligence, speed, and cost that makes it the default choice for most production workloads. Its 1M context window at standard pricing (a major improvement over Sonnet 4.5's premium long-context surcharge) makes it competitive with Opus for document-heavy tasks at 40% lower cost. The critical tuning dimension for Sonnet 4.6 is the effort parameter: unoptimized prompts inherit the default 'high' effort, which burns thinking tokens unnecessarily on simple tasks. Refrase's adaptation layer automatically converts markdown-structured prompts to XML-tagged prompts, which yielded a 12% average improvement in our 46-configuration study. For latency-sensitive applications, the recommended pattern is effort=low with thinking disabled, which delivers performance comparable to or better than Sonnet 4.5 at much lower latency. For agentic coding, adaptive thinking at medium effort with interleaved mode provides the best quality-to-cost ratio.
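The two recommended configurations above can be sketched as Messages API request payloads. This is a sketch, not a verified API call: the model ID `claude-sonnet-4-6`, the top-level `effort` parameter, and the `{"type": "adaptive"}` thinking shape are assumptions taken from this page's description rather than from the live API reference.

```python
# Sketch of the two recommended Sonnet 4.6 configurations as request
# payloads. Model ID, `effort`, and the adaptive-thinking shape are
# assumptions based on the documentation above.

MODEL = "claude-sonnet-4-6"  # hypothetical model ID

# Latency-sensitive work: low effort, thinking disabled (omitted).
latency_sensitive = {
    "model": MODEL,
    "max_tokens": 8192,
    "effort": "low",
    "messages": [{"role": "user", "content": "Classify this ticket: ..."}],
}

# Agentic coding: medium effort, adaptive thinking with interleaved mode.
agentic_coding = {
    "model": MODEL,
    "max_tokens": 64000,
    "effort": "medium",
    "thinking": {"type": "adaptive"},
    "messages": [{"role": "user", "content": "Refactor the parser module."}],
}
```

Leaving `effort` unset falls back to 'high', so both patterns set it explicitly.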

Try Refrase on a Claude Sonnet 4.6 prompt

Paste any prompt — Refrase rewrites it using Claude Sonnet 4.6's documentation as context. 4–7 seconds end-to-end.

Specifications

1M
Context window
64K
Max output
$3 / $15
Per 1M tokens (in/out)
Batch API: $1.50/$7.50 per MTok (50% discount). Prompt caching: 5-min write 1.25x, 1-hour write 2x, cache hit 0.1x base input. 1M context window at standard pricing (no long-context surcharge unlike Sonnet 4.5). (source: Anthropic docs, Pricing)
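The multipliers above combine as simple arithmetic. A quick sketch of per-request cost, using the listed prices; the token counts are invented for illustration:

```python
# Cost sketch using the listed Sonnet 4.6 prices ($3 in / $15 out per
# MTok). Token counts below are invented for illustration.
IN_PER_MTOK, OUT_PER_MTOK = 3.00, 15.00

def cost(in_tokens, out_tokens, in_rate=1.0, batch=False):
    """Dollar cost of one request.

    in_rate scales the input price: 0.1 for a cache hit, 1.25 for a
    5-minute cache write, 2.0 for a 1-hour cache write.
    batch=True applies the 50% Batch API discount.
    """
    discount = 0.5 if batch else 1.0
    return discount * (
        in_tokens / 1e6 * IN_PER_MTOK * in_rate
        + out_tokens / 1e6 * OUT_PER_MTOK
    )

standard = cost(200_000, 4_000)               # $0.66
cache_hit = cost(200_000, 4_000, in_rate=0.1)  # $0.12
batched = cost(200_000, 4_000, batch=True)     # $0.33
```

At 200K input tokens per request, a cache hit cuts the bill by more than 80%, which is why caching the long document prefix matters more than any output-side tuning.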

Strengths

extraction · analysis · generation · code

Key capabilities

  • Best combination of speed and intelligence — optimized for workloads where fast turnaround and cost efficiency matter most (source: Anthropic docs, Models Overview)
  • 1M token context window at standard pricing with no long-context surcharge, unlike predecessor Sonnet 4.5 which required beta header and premium pricing above 200K (source: Anthropic docs, Pricing, Long Context Pricing)
  • Supports both adaptive thinking and manual extended thinking with interleaved mode, providing flexible reasoning control (source: Anthropic docs, Prompting Best Practices, Leverage Thinking)
  • Best-in-class accuracy on computer use evaluations using adaptive thinking mode (source: Anthropic docs, Prompting Best Practices, When to Try Adaptive Thinking)
  • Structured outputs with guaranteed JSON schema compliance via json_schema format and strict tool use (source: Anthropic docs, Structured Outputs)
  • Defaults to effort level of 'high'; adjustable to medium or low for latency-sensitive workloads (source: Anthropic docs, Prompting Best Practices, Migrating from Sonnet 4.5 to Sonnet 4.6)
  • Excels at parallel tool execution — runs multiple speculative searches, reads several files at once, and executes bash commands in parallel (source: Anthropic docs, Prompting Best Practices, Optimize Parallel Tool Calling)

Known limitations

  • Prefilled responses on the last assistant turn are no longer supported — must use structured outputs, tool calling, or explicit instructions instead (source: Anthropic docs, Prompting Best Practices, Migrating Away from Prefilled Responses)
  • Default 'high' effort level may cause higher latency than Sonnet 4.5; must explicitly set effort to 'medium' or 'low' for latency-sensitive applications (source: Anthropic docs, Prompting Best Practices, Migrating from Sonnet 4.5 to Sonnet 4.6)
  • Extended thinking with vision is not compatible (source: Anthropic docs, Extended Thinking, Constraints)
  • Thinking cannot be toggled within a single assistant turn, including tool-use loops — the entire turn must use a consistent thinking mode (source: Anthropic docs, Extended Thinking, Toggling Thinking)
  • When using extended thinking, only tool_choice 'auto' or 'none' is supported — cannot use 'any' or force specific tools (source: Anthropic docs, Extended Thinking, Tool Use Limitations)

How to prompt Claude Sonnet 4.6

Preferred instruction format

XML tags (<instructions>, <context>, <output_format>, <examples>) for structured prompts. System prompt via the 'system' API parameter. Setting a role in the system prompt focuses behavior and tone.
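Put together, the preferred format looks like this: role in the system parameter, content types wrapped in descriptive XML tags. The tag names follow the list above; the task content is invented for illustration.

```python
# Sketch of an XML-tagged prompt in Sonnet 4.6's preferred format.
# Role-setting goes in the system parameter, not the user message.
system = "You are a contracts analyst. Answer only from the provided text."

prompt_template = """<instructions>
Summarize the termination clauses in the contract below.
</instructions>

<context>
{contract_text}
</context>

<output_format>
A bulleted list, one clause per bullet.
</output_format>"""

prompt = prompt_template.format(contract_text="(contract goes here)")
```

Each content type lives in its own tag, so instructions, source material, and output requirements cannot bleed into one another.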

Recommended practices

  • Use XML tags to structure complex prompts — wrap each content type in descriptive tags like <instructions>, <context>, <input> to reduce misinterpretation (source: Anthropic docs, Prompting Best Practices, Structure Prompts with XML Tags)
  • For coding use cases, start with medium effort and budget_tokens around 16K; for chat/non-coding, start with low effort (source: Anthropic docs, Prompting Best Practices, Migrating from Sonnet 4.5 to Sonnet 4.6)
  • Switch from adaptive to extended thinking with a budget_tokens cap for a hard ceiling on thinking costs while preserving quality (source: Anthropic docs, Prompting Best Practices, Overthinking)
  • Provide 3-5 diverse examples in <example> tags for few-shot prompting; include <thinking> tags inside examples to demonstrate reasoning patterns (source: Anthropic docs, Prompting Best Practices, Use Examples Effectively; Thinking and Reasoning)
  • Place longform data at the top of prompts, above queries and instructions — queries at end improve quality by up to 30% (source: Anthropic docs, Prompting Best Practices, Long Context Prompting)
  • Set max output token budget to 64K at medium or high effort to give the model room to think and act (source: Anthropic docs, Prompting Best Practices, Migrating from Sonnet 4.5 to Sonnet 4.6)
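The long-context ordering rule above (longform data first, query last) reduces to a small assembly helper. A sketch, with a hypothetical `<document>` wrapper tag:

```python
# Sketch: place the longform document at the top of the prompt and the
# query at the very end, per the long-context guidance above.
def build_prompt(document: str, query: str) -> str:
    return f"<document>\n{document}\n</document>\n\n{query}"

prompt = build_prompt(
    "Full 300-page report text goes here...",
    "List the three main risks identified in the report.",
)
```

The query landing at the end of the prompt is the point: per the guidance above, that ordering alone can improve response quality by up to 30% on long inputs.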

Anti-patterns to avoid

  • Do not use prefilled assistant responses on the last turn — deprecated in Sonnet 4.6; use structured outputs or explicit instructions instead (source: Anthropic docs, Prompting Best Practices, Migrating Away from Prefilled Responses)
  • Avoid leaving effort unset — Sonnet 4.6 defaults to 'high' which may cause unexpected latency increases vs Sonnet 4.5 (source: Anthropic docs, Prompting Best Practices, Migrating from Sonnet 4.5 to Sonnet 4.6)
  • Avoid using markdown headers for structured task instructions when XML tags would be more precise (source: Anthropic docs, Prompting Best Practices, Structure Prompts with XML Tags; Refrase eval, 46-config study)
  • Do not use tool_choice 'any' or force specific tools when extended thinking is enabled (source: Anthropic docs, Extended Thinking, Tool Use Limitations)
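The prefill anti-pattern and its replacement side by side. The common "force JSON" trick of prefilling `{` on the assistant turn migrates to an explicit instruction (or to structured outputs); the message content is invented for illustration.

```python
# Deprecated on Sonnet 4.6: prefilling the last assistant turn.
old_messages = [
    {"role": "user", "content": "Extract the fields as JSON."},
    {"role": "assistant", "content": "{"},  # no longer supported
]

# Replacement: an explicit instruction in the user turn
# (or structured outputs / strict tool use for a hard guarantee).
new_messages = [
    {
        "role": "user",
        "content": "Extract the fields. Respond with a single JSON object "
                   "and no other text.",
    },
]
```

The migrated request ends on a user turn, which is what Sonnet 4.6 expects.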


Skip the manual application.

Refrase reads everything above and applies it for you. Try it on one of your own prompts.