
🦙 Ollama

Status: 🟢 Installed (April 7, 2026)
What: Open-source tool to run AI models locally on your machine
Website: ollama.com

☕ Café Analogy

Ollama is like a home coffee machine with a massive bean library. You pick a bean (model) from their huge catalogue, download it once, and brew anytime: no subscription, no internet needed after download. The community keeps adding new beans every day.

What is Ollama?

Ollama is the most popular tool for running AI models locally. Think of it as "Docker for AI models": you pull a model, run it, and chat with it locally.

┌──────────────────────────────────┐
│          Your Laptop             │
│                                  │
│  ┌──────────┐    ┌────────────┐  │
│  │  Ollama  │───▶│  AI Model  │  │
│  │ (engine) │    │ (Phi-4,    │  │
│  │          │    │  Llama,    │  │
│  │ localhost│    │  Gemma...) │  │
│  │ :11434   │    └────────────┘  │
│  └──────────┘                    │
│       ▲                          │
│       │ REST API                 │
│  ┌────┴─────────────────┐        │
│  │ Apps, Copilot CLI,   │        │
│  │ Open WebUI, etc.     │        │
│  └──────────────────────┘        │
└──────────────────────────────────┘

Quick Reference

# Install
winget install Ollama.Ollama

# Core commands
ollama --version          # Check version
ollama pull phi4-mini     # Download a model
ollama run phi4-mini      # Chat with a model
ollama list               # See all downloaded models
ollama rm phi4-mini       # Delete a model
ollama ps                 # See running models
/bye                      # Exit chat session

# API endpoint
# http://localhost:11434 (OpenAI-compatible)
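Because the local server exposes an OpenAI-compatible route, any OpenAI-style client can point at it. Here is a minimal sketch using only the Python standard library; it assumes Ollama is running and the model (phi4-mini here, as a placeholder) has already been pulled:

```python
import json
import urllib.request

# OpenAI-compatible chat endpoint served by the local Ollama instance
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build the JSON body for a one-shot, non-streaming chat completion."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def chat(model: str, prompt: str) -> str:
    """POST the request to the local server and return the reply text."""
    body = json.dumps(build_chat_request(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]

# chat("phi4-mini", "Why is the sky blue?")  # uncomment once Ollama is running
```

Since everything stays on localhost:11434, no API key is needed; tools that insist on one will usually accept any dummy string.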

Models We've Tried

| Model     | Size | Good For             | Notes                                       |
|-----------|------|----------------------|---------------------------------------------|
| phi4-mini | 3.8B | General chat, coding | Microsoft's model, runs great on Snapdragon |

Planned Exercises

  • [ ] Try Gemma 3 4B (ollama pull gemma3:4b)
  • [ ] Try Llama 3.2 3B (ollama pull llama3.2:3b)
  • [ ] Compare 3 SLMs with same prompts (L49)
  • [ ] Install Open WebUI for a browser-based chat interface (L45)
  • [ ] Connect Ollama to Copilot CLI via BYOK (L60)
  • [ ] Connect MCP server to Ollama (L47)

Key Insight

Local AI has no internet access: it only knows what it was trained on. This is both a limitation (no live info) and a strength (total privacy, zero data leakage).