Hardware
From a phone to an always-on AI appliance, cheapest to most powerful. “Runs up to ~NB” means the largest model, at roughly N billion parameters, that fits in memory at Q4 (4-bit) quantization. List refreshes automatically every week.
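As a rough illustration of where the "~NB at Q4" figures come from, here is a minimal sketch of the arithmetic. The 4.5 bits per weight and the 2GB allowance for context cache and runtime are assumptions for illustration, not numbers from this list, and `q4_weight_gb` and `fits` are hypothetical helper names.

```python
def q4_weight_gb(params_billion: float, bits_per_weight: float = 4.5) -> float:
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes).

    bits_per_weight=4.5 is an assumed average for a 4-bit format that also
    stores per-block scales; real files vary by quantization scheme.
    """
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, memory_gb: float, overhead_gb: float = 2.0) -> bool:
    """True if weights plus an assumed fixed overhead fit in memory_gb."""
    return q4_weight_gb(params_billion) + overhead_gb <= memory_gb


if __name__ == "__main__":
    # (model size in billions of parameters, memory in GB) pairs taken from the list
    for params, mem in [(12, 16), (24, 16), (32, 24), (70, 32)]:
        print(f"{params}B in {mem}GB: weights ~{q4_weight_gb(params):.1f}GB, fits={fits(params, mem)}")
```

Under these assumptions a 70B model does not fit entirely in 32GB, which is consistent with the list describing that pairing as needing offload.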
Budget
Windows laptop 16GB (integrated GPU)
Laptop: No real GPU; runs on CPU and system RAM. Comfortable up to ~12B at Q4.
Android flagship (12GB)
Phone: Runs 1–4B models via PocketPal or other local runtimes. On-device and offline.
iPhone Pro (8GB)
Phone: Runs 1–4B models via LM Studio mobile or PocketPal. The Neural Engine helps.
MacBook Air M4 (16GB)
Laptop: Unified memory makes this punch above its Windows peers. The sweet spot is ~12B.
Mid-range
Desktop + RTX 4060 Ti (16GB VRAM)
GPU rig: 16GB of VRAM fits ~24B at Q4 and runs it fast. Best price/performance discrete GPU.
Desktop + RTX 5080 (24GB VRAM)
GPU rig: 24GB of VRAM runs ~32B at Q4 fast. Strong single-GPU coding rig.
MacBook Pro M5 (24GB)
Laptop: Comfortable up to ~24B. Great portable local-AI machine.
Pro
Desktop + RTX 5090 (32GB VRAM)
GPU rig: Fastest single consumer GPU. Runs ~70B at Q4 with offload to system RAM, or ~32B fully in VRAM.
MacBook Pro M5 Max (64GB)
Laptop: Runs ~70–110B at Q4 in unified memory. Portable workstation.
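To make the "with offload" note on the RTX 5090 entry concrete, the sketch below estimates how many transformer layers of a model could sit in VRAM, with the rest left in system RAM. The 80-layer count, the 4.5 bits per weight, and the 2GB VRAM reserve are illustrative assumptions, and `gpu_layers` is a hypothetical helper, not part of any runtime. It also treats every layer as the same size, which is only an approximation.

```python
import math


def gpu_layers(total_layers: int, weights_gb: float, vram_gb: float,
               reserve_gb: float = 2.0) -> int:
    """Estimate how many equally sized layers fit in VRAM.

    reserve_gb is an assumed allowance for context cache and runtime buffers.
    """
    per_layer_gb = weights_gb / total_layers
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    return min(total_layers, math.floor(usable_gb / per_layer_gb))


if __name__ == "__main__":
    # Assumed: a 70B model at ~4.5 bits per weight, split into 80 layers.
    weights_70b = 70 * 4.5 / 8  # about 39.4 GB
    on_gpu = gpu_layers(80, weights_70b, vram_gb=32)
    print(f"{on_gpu} of 80 layers in VRAM, {80 - on_gpu} offloaded to system RAM")
```

Runtimes such as llama.cpp expose this kind of split through a GPU layer count setting (`--n-gpu-layers`). Layers left in system RAM run much slower, which is why the fully-in-VRAM figure is the smaller one.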
Max
Nvidia DGX Spark (128GB unified)
AI appliance: Desktop AI appliance built to stay on 24/7. 128GB of unified memory runs 70B+ dense models and large MoE models locally. The 'own your intelligence' endgame.
Mac Studio M5 Max (128GB)
Desktop: 128GB of unified memory holds a 110B dense model or a large MoE model at Q4. Quiet and low-power.
Mac Studio M5 Ultra (256GB)
Desktop: 256GB of unified memory runs frontier-class MoE models (Kimi K2.6, DeepSeek V4) at Q4. Top of the prosumer ladder.