← All models
Gemma 4 1B
Phone / tinyGoogle · 1B parameters · released 2026-03 · Gemma · runs on iOS & Android
✓ Free to run locally☁ Free on OpenRouter
Runs on a phone. Great for on-device drafting and simple chat.
Best used for
WritingGeneral
Text
Memory needed (GB)
More compression (Q4) and a smaller context window both lower the RAM this model needs. A bigger context window is not free — watch the numbers climb to the right.
| Quant | 4K ctx | 8K ctx | 32K ctx | 128K ctx |
|---|---|---|---|---|
| Q4 | 0.6 | 0.6 | 0.7 | 1 |
| Q8 | 1.2 | 1.2 | 1.5 | 2 |
| FP16 | 2.3 | 2.4 | 2.9 | 4 |
Ways to run it
✓
On your own machine — free & private
LM Studio (search)
gemma-4-1bOllama (local)
ollama run gemma4:1b☁
Or as a hosted API — optional, for when you're away
Reachable on OpenRouter with a free tier — no per-token bill on the free model id. Same one key also works with Ollama's cloud option and most chat apps.
OpenRouter model id
google/gemma-4-1b:freeNew to this? Local vs. cloud, and what's actually free →
Source: https://huggingface.co/google · verified 2026-06-15