🔬 SLMs & Local AI¶
Status: 🟡 In progress — Ollama + Phi-4-mini installed April 7, 2026
Goal: Understand Small Language Models, run AI locally on your Snapdragon laptop, and build hands-on projects
✅ What We've Done So Far¶
| Date | What | Status |
|---|---|---|
| 2026-04-07 | Installed Ollama via `winget install Ollama.Ollama` | ✅ |
| 2026-04-07 | Downloaded & ran Phi-4-mini (`ollama run phi4-mini`) | ✅ |
| 2026-04-07 | Discovered: local AI has no web access — frozen at training date | ✅ Learned |
Key Insight: Local AI vs Cloud AI¶
| | Local SLM (Phi-4-mini) | Cloud AI (ChatGPT/Copilot) |
|---|---|---|
| Knowledge | Frozen at training date (~2024) | Can search the live web |
| Internet | ❌ None — fully offline | ✅ Connected |
| Privacy | 🔒 Data never leaves your laptop | ☁️ Data sent to cloud servers |
| Cost | Free forever | Subscription / API fees |
| Speed | Instant (no network latency) | Depends on internet |
Quick Reference: Ollama Commands¶
```
ollama --version        # Check version
ollama pull phi4-mini   # Download a model
ollama run phi4-mini    # Chat with a model
ollama list             # See all downloaded models
ollama rm phi4-mini     # Delete a model
/bye                    # Exit chat session
```
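Beyond the interactive chat, Ollama also serves a local HTTP API (by default on port 11434), which is handy for the project ideas below. A minimal sketch using only the Python standard library, assuming the default endpoint and a non-streaming `/api/generate` request:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_payload(model: str, prompt: str) -> dict:
    """JSON body for a non-streaming /api/generate request."""
    return {"model": model, "prompt": prompt, "stream": False}


def ask(model: str, prompt: str) -> str:
    """Send a prompt to the local Ollama server and return the response text."""
    body = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


# Usage (requires the Ollama server running and phi4-mini pulled):
# print(ask("phi4-mini", "Explain SLMs in one sentence."))
```

Everything stays on localhost — the same privacy story as the CLI, just scriptable.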
Why This Matters¶
SLMs are the hottest trend in AI — small enough to run on your laptop, powerful enough to be useful. Privacy, speed, zero cost. Perfect for demos and understanding how AI works "under the hood".
Planned Topics¶
| Topic | Description |
|---|---|
| LLM vs SLM | Big cloud brain vs small local brain — comparison table |
| Why SLMs matter | Privacy, speed, cost, offline use |
| The SLM landscape | Phi-4 (Microsoft), Gemma 3 (Google), Llama 3.2 (Meta), Qwen 3, Mistral |
| Ollama — your local AI engine | Install, run models, chat locally on your Snapdragon |
| Quantisation explained | How Q4/Q8 models shrink to fit your RAM |
| Foundry Local — Microsoft's on-device AI | Install, run Phi-4-mini, compare to Ollama, NPU acceleration |
| Copilot CLI + BYOK | Connect Ollama or Foundry Local to Copilot CLI for free offline agentic coding |
| 🔨 Project: Local chatbot | Run Phi-4-mini on your laptop — zero cloud |
| 🔨 Project: Document summariser | Feed local PDFs to an SLM for private summaries |
| 🔨 Project: MCP + SLM | Connect an MCP server to a local Ollama model |
| SLM vs cloud LLM decision framework | When to use which — for customer conversations |
| Microsoft Phi deep dive | Phi-4, Phi-4-mini, Windows AI strategy |
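For the quantisation topic above, a useful rule of thumb: weight-file size is roughly parameter count times bits-per-weight divided by 8. A sketch of that arithmetic (it ignores KV cache and runtime overhead, so treat results as lower bounds):

```python
def model_size_gb(params_billion: float, bits_per_weight: int) -> float:
    """Rough weight-file size: parameters x (bits / 8), ignoring KV cache/overhead."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9  # decimal GB


# Phi-4 (14B parameters) at different quantisation levels:
fp16 = model_size_gb(14, 16)  # 28.0 GB — tight on a 32GB laptop
q8 = model_size_gb(14, 8)     # 14.0 GB
q4 = model_size_gb(14, 4)     # 7.0 GB — comfortable
```

This is why Q4 is the default download for most Ollama models: it cuts memory to a quarter of FP16 with a modest quality loss.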
☕ Café Analogy¶
A cloud LLM is like ordering from Uber Eats (powerful but needs internet and costs money). An SLM is like your own coffee machine at home (smaller menu but instant, free, and private).
Your Hardware¶
You have a Snapdragon X Elite with 32GB RAM, which can comfortably run quantised models up to about 14B parameters: Phi-4 (14B), Gemma 3 4B, Llama 3.2 3B, and most 7B models.
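A quick sanity check of that claim, assuming Q4 quantisation (~0.5 bytes per parameter), roughly 8GB reserved for Windows and apps, and approximate parameter counts for each model:

```python
# Rough Q4 fit check for a 32GB machine, leaving ~8GB headroom for the OS.
BUDGET_GB = 32 - 8

# Approximate parameter counts in billions (assumed, check each model card).
models = {"phi4-mini": 3.8, "llama3.2": 3.0, "gemma3:4b": 4.0, "phi4": 14.0}


def fits_q4(params_billion: float, budget_gb: float = BUDGET_GB) -> bool:
    """Q4 is roughly 0.5 bytes per parameter."""
    return params_billion * 0.5 <= budget_gb


runnable = [name for name, b in models.items() if fits_q4(b)]
# All four fit; even Phi-4 at Q4 needs only ~7GB of the 24GB budget.
```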