ComputeCalc
AI Model Advisor

Pick the right model for the job

Compare strengths, pros and cons across leading LLMs — or describe your use case and let the advisor recommend the best fit.

9 models
Anthropic

Claude Haiku 4.5

Economy
200K ctx
$1 in / 1M · $5 out / 1M
Latency
Cost
Tool use
Best for
  • High-volume classification
  • Routing & extraction
  • Customer support triage
Avoid for
  • Deep multi-step reasoning
  • Long-form creative writing
Pros
  • Very fast
  • Cheap per token
  • Strong instruction following
  • Vision support
Cons
  • Weaker on complex math
  • Smaller knowledge depth than Opus
Fit by use case
RAG over docs: Fast retrieval-augmented answers at scale
Agentic workflows: Good for simple agents; escalate hard steps
Code generation: Snippets yes; large refactors no
Legal/medical analysis: Use Sonnet/Opus for accuracy
Anthropic

Claude Sonnet 4.5

Balanced
200K ctx
$3 in / 1M · $15 out / 1M
Reasoning
Coding
Reliability
Best for
  • Production agents
  • Code generation
  • Document analysis
  • Structured extraction
Avoid for
  • Throwaway classification (use Haiku)
  • Image generation
Pros
  • Best price/quality balance
  • Strong reasoning
  • Reliable tool calls
  • Long context
Cons
  • Slower than Haiku
  • Costlier than Gemini Flash
Fit by use case
RAG over docs: Top-tier extraction quality
Agentic workflows: Default choice for prod agents
Code generation: Excellent multi-file edits
Legal/medical analysis: Strong accuracy + citations
Anthropic

Claude Opus 4.6

Premium
200K ctx
$5 in / 1M · $25 out / 1M
Reasoning depth
Nuance
Best for
  • Frontier research tasks
  • Complex planning
  • High-stakes analysis
Avoid for
  • High-volume cheap workloads
  • Realtime UX
Pros
  • Best Anthropic reasoning
  • Nuanced writing
  • Deep analysis
Cons
  • Expensive
  • Slower
Fit by use case
RAG over docs: Overkill unless docs are very complex
Agentic workflows: Best for long-horizon planning
Code generation: Top quality, watch cost
Legal/medical analysis: Highest accuracy tier
OpenAI

GPT-5

Premium
400K ctx
$2.5 in / 1M · $10 out / 1M
Multimodal
Tool use
Reasoning
Best for
  • Multimodal reasoning
  • Tool-heavy agents
  • Vision + text
Avoid for
  • Cost-sensitive bulk processing
Pros
  • Strong all-rounder
  • Excellent tool use
  • Native multimodal
Cons
  • Pricier than Gemini
  • Variable latency
Fit by use case
RAG over docs: Excellent extraction + reasoning
Agentic workflows: Best-in-class tool calling
Code generation: Strong, especially with reasoning effort
Legal/medical analysis: High accuracy with citations
OpenAI

GPT-5 Mini

Balanced
400K ctx
$0.6 in / 1M · $2.4 out / 1M
Cost
Speed
Multimodal
Best for
  • Cost-aware production
  • Chatbots
  • Mid-complexity agents
Avoid for
  • Frontier reasoning tasks
Pros
  • Great $/quality ratio
  • Fast
  • Multimodal
Cons
  • Less nuanced than GPT-5
Fit by use case
RAG over docs: Sweet spot for production RAG
Agentic workflows: Good for simpler agents
Code generation: Decent; GPT-5 for hard problems
Legal/medical analysis: Verify with human reviewer
Google

Gemini 2.5 Pro

Premium
2M ctx
$1.25 in / 1M · $5 out / 1M
Context size
Multimodal
Cost
Best for
  • Massive context tasks
  • Video + audio understanding
  • Repo-wide code analysis
Avoid for
  • Workloads needing strict EU residency (use Mistral)
Pros
  • Huge 2M context
  • Cheapest premium tier
  • Native multimodal incl. video
Cons
  • Tool calling less battle-tested than OpenAI
Fit by use case
RAG over docs: Can skip chunking for many use cases
Agentic workflows: Catching up; OpenAI/Anthropic lead
Code generation: Excellent for repo-scale edits
Legal/medical analysis: Long contracts fit in one prompt
Google

Gemini 2.5 Flash

Balanced
1M ctx
$0.3 in / 1M · $1.2 out / 1M
Cost
Latency
Context
Best for
  • High-volume multimodal
  • Long-context summarization
  • Live chat
Avoid for
  • Hardest reasoning tasks
Pros
  • Very cheap
  • Fast
  • 1M context
  • Multimodal
Cons
  • Less nuance than Pro
Fit by use case
RAG over docs: Default cost-effective RAG model
Agentic workflows: Fine for simple chains
Code generation: OK for snippets
Legal/medical analysis: Combine with strict prompts
Meta

Llama 3.3 70B

Balanced
128K ctx
$0.6 in / 1M · $0.9 out / 1M
Privacy
Customization
Cost at scale
Best for
  • On-prem / VPC deployment
  • Data-sensitive workloads
  • Fine-tuning
Avoid for
  • Plug-and-play vision tasks
Pros
  • Open weights
  • Run anywhere
  • Cheap output
  • Fine-tunable
Cons
  • Self-hosting complexity
  • Weaker than frontier models on hard tasks
Fit by use case
RAG over docs: Great when data can't leave VPC
Agentic workflows: Use Llama 3.1 405B for complex agents
Code generation: Reasonable; not SOTA
Legal/medical analysis: Keep PHI/PII on-prem
Mistral

Mistral Large 2

Premium
128K ctx
$2 in / 1M · $6 out / 1M
EU compliance
Function calling
Best for
  • EU data residency
  • Function calling
  • Multilingual EU languages
Avoid for
  • Vision tasks
  • Massive 1M+ context needs
Pros
  • EU-hosted option
  • Strong function calling
  • Good multilingual
Cons
  • Smaller ecosystem
  • No native vision
Fit by use case
RAG over docs: Strong EU-resident RAG choice
Agentic workflows: Function calling is a strength
Code generation: Use Codestral for code-specific
Legal/medical analysis: Pairs well with GDPR needs
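
Estimating cost from the listed prices

The per-1M-token prices on the cards can be turned into a rough spend estimate for a given workload. The sketch below is illustrative only: the function names `workload_cost` and `rank_by_cost` are hypothetical, the prices are copied from the cards as listed on this page (actual provider pricing may differ or change), and it ignores caching discounts, batch pricing, and self-hosting costs.

```python
# Prices in USD per 1M tokens as (input, output), copied from the cards on this page.
# Assumption: these figures may not match current provider pricing.
PRICES = {
    "Claude Haiku 4.5": (1.00, 5.00),
    "Claude Sonnet 4.5": (3.00, 15.00),
    "Claude Opus 4.6": (5.00, 25.00),
    "GPT-5": (2.50, 10.00),
    "GPT-5 Mini": (0.60, 2.40),
    "Gemini 2.5 Pro": (1.25, 5.00),
    "Gemini 2.5 Flash": (0.30, 1.20),
    "Llama 3.3 70B": (0.60, 0.90),
    "Mistral Large 2": (2.00, 6.00),
}


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a workload for one model."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


def rank_by_cost(input_tokens: int, output_tokens: int) -> list[str]:
    """Return model names sorted from cheapest to most expensive for the workload."""
    return sorted(
        PRICES,
        key=lambda model: workload_cost(model, input_tokens, output_tokens),
    )


if __name__ == "__main__":
    # Example workload: 10M input tokens and 1M output tokens per month.
    for name in rank_by_cost(10_000_000, 1_000_000):
        print(f"{name}: ${workload_cost(name, 10_000_000, 1_000_000):.2f}")
```

Cost is only one axis. A ranking like this is best read alongside the "Best for" and "Avoid for" notes on each card, since the cheapest model may not be suitable for the task.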