Ollama Launch lets you run Claude Code for free

Ema Suriano

I just discovered Ollama’s launch command, a super‑simple way to run Claude Code against local models for free. No environment variables, no config files, just a single command.

How I got it going

  1. Install Claude Code (if you haven’t already):
    brew install claude-code
    
  2. Install Ollama (v0.15+):
    brew install ollama
    
  3. Launch Claude:
    ollama launch claude
    
    The first run will prompt you to pick a model – hit Enter for the default.
  4. Use Claude Code as usual:
    claude "Explain the difference between GPT‑4 and Claude"
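
Before step 3, it can help to confirm the prerequisites from steps 1 and 2. The sketch below is one way to do that in POSIX shell. It assumes the Claude Code binary is on your PATH as `claude` and that `ollama --version` prints a dotted version number somewhere in its output. `MIN_OLLAMA`, `version_ge`, and `check_prereqs` are illustrative names, not part of either tool.

```shell
#!/bin/sh
# Minimum Ollama version for the launch command (0.15, per step 2).
MIN_OLLAMA=0.15

# version_ge A B: succeed when dotted version A >= B.
# Components are compared numerically, so 0.9 < 0.15; missing parts count as 0.
version_ge() {
  a=$1
  b=$2
  while [ -n "$a" ] || [ -n "$b" ]; do
    x=${a%%.*}
    y=${b%%.*}
    [ "${x:-0}" -gt "${y:-0}" ] && return 0
    [ "${x:-0}" -lt "${y:-0}" ] && return 1
    # Drop the component just compared; stop when no dot is left.
    case $a in *.*) a=${a#*.} ;; *) a= ;; esac
    case $b in *.*) b=${b#*.} ;; *) b= ;; esac
  done
  return 0
}

# check_prereqs: report anything missing; return non-zero if a check fails.
check_prereqs() {
  status=0
  if ! command -v claude >/dev/null 2>&1; then
    echo "missing: claude (Claude Code CLI)"
    status=1
  fi
  if ! command -v ollama >/dev/null 2>&1; then
    echo "missing: ollama"
    status=1
  else
    # Assumption: the first dotted number in the output is the version.
    v=$(ollama --version 2>/dev/null | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1)
    if ! version_ge "${v:-0}" "$MIN_OLLAMA"; then
      echo "ollama ${v:-unknown} is older than $MIN_OLLAMA"
      status=1
    fi
  fi
  return $status
}

check_prereqs && echo "Ready: run 'ollama launch claude'"
```

The version check compares each component as a number rather than as a string, which matters here: a plain string comparison would rank 0.9 above 0.15.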
    

Why I love this

  • Zero cost – no subscription fees.
  • Speed – inference runs locally, so there is no network round trip.
  • Control – pick any model you like.
  • Privacy – with a local model, your prompts never leave your machine (cloud models run on Ollama’s servers).

Key details

  • VRAM & context: the recommended local model, glm-4.7-flash, needs ~23 GB of VRAM with a 64k‑token context window (context length is adjustable in Ollama settings).
  • Pull models (optional):
    # Local model (≈23 GB VRAM)
    ollama pull glm-4.7-flash
    # Cloud model with full context
    ollama pull glm-4.7:cloud
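
The choice between the two pulls above comes down to available VRAM. Here is a minimal sketch of that decision, assuming you supply your GPU memory yourself in whole gigabytes (the hypothetical `pick_model` helper does not detect it) and treating the approximate 23 GB figure as a hard cutoff:

```shell
#!/bin/sh
# Approximate VRAM needed by the local model, in GB (from the note above).
LOCAL_MIN_GB=23

# pick_model VRAM_GB: print the local model name when the given amount of
# VRAM (a whole number of GB) is enough, otherwise the cloud variant.
pick_model() {
  vram_gb=$1
  if [ "$vram_gb" -ge "$LOCAL_MIN_GB" ]; then
    echo "glm-4.7-flash"
  else
    echo "glm-4.7:cloud"
  fi
}

echo "suggested model for 16 GB: $(pick_model 16)"
```

To act on the suggestion, pass it straight to the pull command, for example `ollama pull "$(pick_model 24)"`.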
    

Supported integrations: Claude Code, OpenCode, Codex, Droid (ollama launch <tool>).

Recommended models

  • Local: glm-4.7-flash, qwen3-coder, gpt-oss:20b
  • Cloud: glm-4.7:cloud, minimax-m2.1:cloud, gpt-oss:120b-cloud, qwen3-coder:480b-cloud

Extended sessions & pricing: Ollama now offers a 5‑hour coding session on the free tier. Details at https://ollama.com/pricing.

Configure without launching:

ollama launch opencode --config
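
If you switch between several tools, a small wrapper can keep the command line consistent. The sketch below only prints the command it would run. It assumes the CLI names for Codex and Droid are the lowercase `codex` and `droid` (this post only shows `claude` and `opencode` explicitly) and that `--config` is accepted for every integration, not just `opencode`. `launch_cmd` is a hypothetical helper.

```shell
#!/bin/sh
# launch_cmd TOOL [config]: print the ollama launch command for TOOL.
# Passing "config" as the second argument appends --config.
launch_cmd() {
  tool=$1
  mode=${2:-}
  case $tool in
    claude|opencode|codex|droid) ;;  # assumed CLI names of the integrations
    *)
      echo "unsupported tool: $tool" >&2
      return 1
      ;;
  esac
  if [ "$mode" = "config" ]; then
    echo "ollama launch $tool --config"
  else
    echo "ollama launch $tool"
  fi
}

launch_cmd opencode config
```

Printing instead of executing makes it easy to check the command first. To run it, use something like `sh -c "$(launch_cmd claude)"`.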

For the full story, see Ollama’s blog post: https://ollama.com/blog/launch
