Model Benchmarks

Every top model on the market — free and paid, cloud and local — ranked by the benchmarks people actually cite. Updated daily.

Updated: 2026-06-16

#	Model	Pricing					Best at	Access
1	Claude Opus 4.8 (Anthropic)	Free + Paid, $75/M output tokens	74	1452	82%	92%	Agentic, Coding, Reasoning	Claude.ai · free
2	GPT-5.2 (OpenAI)	Free + Paid, $60/M output tokens	72	1448	78%	94%	Reasoning, Math, Coding	ChatGPT · free
3	Gemini 3 Pro (Google DeepMind)	Free + Paid, $40/M output tokens	71	1445	74%	90%	Long context, Vision, Research	Gemini · free; Google AI Studio · free
4	Claude Sonnet 4.6 (Anthropic)	Free + Paid, $15/M output tokens	69	1430	77%	86%	Coding, Agentic, Writing	Claude.ai · free
5	Grok 4.1 (xAI)	Free + Paid, $30/M output tokens	68	1425	70%	88%	Reasoning, Research, Analysis	Grok (X) · free
6	DeepSeek V4 (DeepSeek · Open)	Free + Paid, $2/M output tokens	66	1418	72%	89%	Coding, Math, Reasoning	DeepSeek Chat · free; OpenRouter (free) · free
7	Qwen3.5 Max (Alibaba · Open)	Free + Paid, $3/M output tokens	65	1415	69%	87%	Coding, Multilingual, Reasoning	OpenRouter (free) · free; Ollama · free
8	Llama 4 Maverick (Meta · Open)	Free + Paid, $2/M output tokens	61	1400	62%	80%	Long context, General, Writing	OpenRouter (free) · free; Ollama · free
9	Mistral Large 3 (Mistral AI · Open)	Free + Paid, $6/M output tokens	60	1395	60%	78%	Multilingual, Writing, Coding	Le Chat · free; Ollama · free
10	GPT-5.2 mini (OpenAI)	Free + Paid, $2/M output tokens	58	1388	58%	82%	General, Writing, Coding	ChatGPT (free) · free
11	Gemini 3 Flash (Google DeepMind)	Free + Paid, $1/M output tokens	57	1382	55%	79%	Long context, Data, Vision	Google AI Studio (free) · free; Gemini · free
12	FLUX.2 (Black Forest Labs · Open)	Free + Paid	—	1180	—	—	Image gen	ComfyUI (local) · free
13	Veo 3 (Google DeepMind)	Paid	—	—	—	—	Video gen	Google Flow / Gemini
14	Whisper large-v3 (OpenAI · Open)	Free	—	—	—	—	Audio	Local (faster-whisper) · free

Benchmark figures track each model's published scores (Artificial Analysis Intelligence Index, LMArena, MMLU-Pro, GPQA, SWE-bench, AIME) and are refreshed daily. Treat them as a guide, not gospel.