🔬 SLMs & Local AI¶
Status: 🟡 In progress — Ollama + Phi-4-mini installed April 7, 2026
Goal: Understand Small Language Models, run AI locally on your Snapdragon laptop, and build hands-on projects
✅ What We've Done So Far¶
| Date | What | Status |
|---|---|---|
| 2026-04-07 | Installed Ollama via `winget install Ollama.Ollama` | ✅ |
| 2026-04-07 | Downloaded & ran Phi-4-mini (`ollama run phi4-mini`) | ✅ |
| 2026-04-07 | Discovered: local AI has no web access — frozen at training date | ✅ Learned |
Key Insight: Local AI vs Cloud AI¶
| | Local SLM (Phi-4-mini) | Cloud AI (ChatGPT/Copilot) |
|---|---|---|
| Knowledge | Frozen at training date (~2024) | Can search the live web |
| Internet | ❌ None — fully offline | ✅ Connected |
| Privacy | 🔒 Data never leaves your laptop | ☁️ Data sent to cloud servers |
| Cost | Free forever | Subscription / API fees |
| Speed | Instant (no network latency) | Depends on internet |
Quick Reference: Ollama Commands¶
```
ollama --version        # Check version
ollama pull phi4-mini   # Download a model
ollama run phi4-mini    # Chat with a model
ollama list             # See all downloaded models
ollama rm phi4-mini     # Delete a model
/bye                    # Exit chat session
```
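Beyond the interactive chat, Ollama also serves a local HTTP API (by default on port 11434), which is handy for the project ideas below. A minimal sketch using only the Python standard library, assuming the default endpoint and a non-streaming `/api/generate` request:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_payload(model: str, prompt: str) -> dict:
    """JSON body for a non-streaming /api/generate request."""
    return {"model": model, "prompt": prompt, "stream": False}


def ask(model: str, prompt: str) -> str:
    """Send a prompt to the local Ollama server and return the response text."""
    body = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


# Usage (requires the Ollama server running and phi4-mini pulled):
# print(ask("phi4-mini", "Explain SLMs in one sentence."))
```

Everything stays on localhost — the same privacy story as the CLI, just scriptable.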
Why This Matters¶
SLMs are the hottest trend in AI — small enough to run on your laptop, powerful enough to be useful. Privacy, speed, zero cost. Perfect for demos and understanding how AI works "under the hood".
Planned Topics¶
| Topic | Description |
|---|---|
| LLM vs SLM | Big cloud brain vs small local brain — comparison table |
| Why SLMs matter | Privacy, speed, cost, offline use |
| The SLM landscape | Phi-4 (Microsoft), Gemma 3 (Google), Llama 3.2 (Meta), Qwen 3, Mistral |
| Ollama — your local AI engine | Install, run models, chat locally on your Snapdragon |
| Quantisation explained | How Q4/Q8 models shrink to fit your RAM |
| Foundry Local — Microsoft's on-device AI | Install, run Phi-4-mini, compare to Ollama, NPU acceleration |
| Copilot CLI + BYOK | Connect Ollama or Foundry Local to Copilot CLI for free offline agentic coding |
| 🔨 Project: Local chatbot | Run Phi-4-mini on your laptop — zero cloud |
| 🔨 Project: Document summariser | Feed local PDFs to an SLM for private summaries |
| 🔨 Project: MCP + SLM | Connect an MCP server to a local Ollama model |
| SLM vs cloud LLM decision framework | When to use which — for customer conversations |
| Microsoft Phi deep dive | Phi-4, Phi-4-mini, Windows AI strategy |
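For the quantisation topic above, a useful rule of thumb: weight-file size is roughly parameter count times bits-per-weight divided by 8. A sketch of that arithmetic (it ignores KV cache and runtime overhead, so treat results as lower bounds):

```python
def model_size_gb(params_billion: float, bits_per_weight: int) -> float:
    """Rough weight-file size: parameters x (bits / 8), ignoring KV cache/overhead."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9  # decimal GB


# Phi-4 (14B parameters) at different quantisation levels:
fp16 = model_size_gb(14, 16)  # 28.0 GB — tight on a 32GB laptop
q8 = model_size_gb(14, 8)     # 14.0 GB
q4 = model_size_gb(14, 4)     # 7.0 GB — comfortable
```

This is why Q4 is the default download for most Ollama models: it cuts memory to a quarter of FP16 with a modest quality loss.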
☕ Café Analogy¶
A cloud LLM is like ordering from Uber Eats (powerful but needs internet and costs money). An SLM is like your own coffee machine at home (smaller menu but instant, free, and private).
Your Hardware¶
You have a Snapdragon X Elite with 32GB RAM, which can comfortably run quantised models up to about 14B parameters: Phi-4 (14B), Gemma 3 4B, Llama 3.2 3B, and most 7B models.
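A quick sanity check of that claim, assuming Q4 quantisation (~0.5 bytes per parameter), roughly 8GB reserved for Windows and apps, and approximate parameter counts for each model:

```python
# Rough Q4 fit check for a 32GB machine, leaving ~8GB headroom for the OS.
BUDGET_GB = 32 - 8

# Approximate parameter counts in billions (assumed, check each model card).
models = {"phi4-mini": 3.8, "llama3.2": 3.0, "gemma3:4b": 4.0, "phi4": 14.0}


def fits_q4(params_billion: float, budget_gb: float = BUDGET_GB) -> bool:
    """Q4 is roughly 0.5 bytes per parameter."""
    return params_billion * 0.5 <= budget_gb


runnable = [name for name, b in models.items() if fits_q4(b)]
# All four fit; even Phi-4 at Q4 needs only ~7GB of the 24GB budget.
```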