🔬 Question Enrichment Project

Status: Ready to start (V2 refresh — pre-launch)
Estimated effort: ~62 sessions, ~10 seconds of human effort per session
Model: Claude Opus 4.6 (1M) — mandatory
Impact: 24,800 questions enriched to premium + 6,200 new hard scenario questions = 31,000 total

Why This Exists

Question quality varies widely across certs, and the flagship cert (AB-900) scores worst on every length measure in the table below. Users will judge the entire platform by the first cert they try.

Cert	Avg scenario length	Avg option length	Avg explanation length	Grade
AB-900 (flagship!)	128 chars	38 chars	184 chars	🔴 F
AWS SAA-C03	162 chars	45 chars	256 chars	🟠 D
CCNA	210 chars	43 chars	301 chars	🟠 C
CompTIA Sec+	335 chars	56 chars	365 chars	🟡 B
CISSP (target)	435 chars	126 chars	595 chars	✅ A

What the Pipeline Does (per cert, fully automated)

┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐
│ Backup   │──▶│ Enrich   │──▶│ Add 50   │──▶│ Validate │──▶│ Build    │──▶│ Commit   │
│ originals│   │ existing │   │ new hard │   │ (auto)   │   │ test     │   │ + push   │
│ (.bak)   │   │ 200 Qs   │   │ scenarios│   │ FATAL?   │   │ pass?    │   │ + log    │
└──────────┘   └──────────┘   └──────────┘   └──────────┘   └──────────┘   └──────────┘
                                                  │ fail              │ fail
                                                  ▼                  ▼
                                              Restore .bak       Revert cert
                                              Skip file          Continue next

What gets enriched (text only, schema untouched):

scenario → 300-500 chars (person + role + company + situation)
question → 80-130 chars (specific, mentions technologies)
options[].text → 80-150 chars (full sentences, plausible wrong answers)
explanation → 400-600 chars (WHY correct, what happens if wrong)
whyWrong values → 150-250 chars each
examTip, realWorld, hint → added if missing
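
The length targets above could be checked with a sketch like this. The question shape is an assumption: `options` is taken to be a list of objects with a `text` key, and `whyWrong` a mapping from option ID to text. Neither structure is spelled out in this document, and `length_issues` is a hypothetical helper.

```python
# Target ranges copied from the list above (inclusive, in characters).
FIELD_RANGES = {
    "scenario": (300, 500),
    "question": (80, 130),
    "explanation": (400, 600),
}
OPTION_RANGE = (80, 150)
WHY_WRONG_RANGE = (150, 250)


def length_issues(q: dict) -> list[str]:
    """Return one message per field that falls outside its target range."""
    issues = []
    for field, (lo, hi) in FIELD_RANGES.items():
        n = len(q.get(field, ""))
        if not lo <= n <= hi:
            issues.append(f"{field}: {n} chars (target {lo}-{hi})")
    # Assumed shape: options is a list of {"id": ..., "text": ...}.
    for opt in q.get("options", []):
        n = len(opt.get("text", ""))
        if not OPTION_RANGE[0] <= n <= OPTION_RANGE[1]:
            issues.append(f"option {opt.get('id')}: {n} chars")
    # Assumed shape: whyWrong maps option id -> explanation text.
    for opt_id, text in q.get("whyWrong", {}).items():
        if not WHY_WRONG_RANGE[0] <= len(text) <= WHY_WRONG_RANGE[1]:
            issues.append(f"whyWrong[{opt_id}]: {len(text)} chars")
    return issues
```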

What gets added (50 new questions per cert):

All difficulty: "hard" — mini case studies
Scenarios: 500-800 chars (company, infra, constraints, goals)
Multi-layered questions ("Given X AND Y AND Z, what should admin do FIRST?")
Two plausible options + two traps per question
IDs: {cert}-d{N}-s{NN} format (e.g., ab900-d1-s01)
Distributed evenly across domain files
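
One way to read "distributed evenly" together with the ID format is sketched below. The helper names are hypothetical, and two details are assumptions: leftover questions go to the lowest-numbered domains, and the `s{NN}` counter restarts at 01 in each domain.

```python
def distribute(total: int, domains: int) -> list[int]:
    """Split `total` new questions across domain files as evenly as possible."""
    base, extra = divmod(total, domains)
    # Assumption: the first `extra` domains each take one additional question.
    return [base + (1 if i < extra else 0) for i in range(domains)]


def scenario_ids(cert: str, domains: int, total: int = 50) -> list[str]:
    """Build IDs in the {cert}-d{N}-s{NN} format, e.g. ab900-d1-s01."""
    ids = []
    for d, count in enumerate(distribute(total, domains), start=1):
        # Assumption: numbering restarts at s01 within each domain.
        ids.extend(f"{cert}-d{d}-s{n:02d}" for n in range(1, count + 1))
    return ids
```

For a cert with four domain files, `distribute(50, 4)` gives `[13, 13, 12, 12]`, which always sums to exactly 50.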

🔒 What NEVER changes:

id, domain, type, difficulty, correct, options[].id, items[].id, targets[].id, learnLink, file-level meta
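
A per-question check for these frozen fields might look like the sketch below. `frozen_violations` is a hypothetical helper, the question is assumed to be a plain dict, and file-level meta is not covered because its structure is not shown here.

```python
FROZEN_FIELDS = ("id", "domain", "type", "difficulty", "correct", "learnLink")
ID_LISTS = ("options", "items", "targets")  # lists whose element IDs must not change


def frozen_violations(before: dict, after: dict) -> list[str]:
    """Name every frozen field that differs between the original and enriched question."""
    problems = [f for f in FROZEN_FIELDS if before.get(f) != after.get(f)]
    for key in ID_LISTS:
        old_ids = [e.get("id") for e in before.get(key, [])]
        new_ids = [e.get("id") for e in after.get(key, [])]
        # Comparing ordered lists also catches added, dropped, or reordered entries.
        if old_ids != new_ids:
            problems.append(f"{key}[].id")
    return problems
```

An empty result means the enrichment touched only text fields; anything else would count as a FATAL schema violation.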

Safety Rails (triple-reviewed, rubber-ducked, battle-tested)

Protection	How
Backup before edit	Every file copied to `.bak` before modification
Atomic writes	Enriched data written to `.tmp`, validated, then atomically replaces original
Schema validation	Checks: IDs preserved, correct answers untouched, types unchanged, option counts match, no duplicates, length minimums met
FATAL error = rollback	Any schema violation → restore `.bak`, skip file, continue
Count verification	Total must equal exactly 250; the script exits with code 1 if it does not, which blocks the commit
Quality spot check	Flags scenarios, explanations, and options that are too short before commit
Build must pass	Astro build after each cert. Fail → revert entire cert.
Git safety	`git pull --rebase` before editing AND before pushing
Tracking log	`.enrichment-log.json` tracks done certs — sessions auto-resume where left off
No temp files committed	`.bak` and `.tmp` cleaned up before commit
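
The "atomic writes" row can be illustrated with a write-validate-replace sketch. This assumes the question files are JSON and that validation is a callable that raises on failure. Both are assumptions, and `atomic_write_json` is a hypothetical name.

```python
import json
import os
from pathlib import Path
from typing import Any, Callable


def atomic_write_json(path: Path, data: Any, validate: Callable[[Any], None]) -> None:
    """Write to a .tmp sibling, validate what was written, then replace the original."""
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(data, indent=2, ensure_ascii=False), encoding="utf-8")
    try:
        # Validate the bytes on disk, not the in-memory object.
        validate(json.loads(tmp.read_text(encoding="utf-8")))
        os.replace(tmp, path)  # atomic when tmp and path are on the same filesystem
    finally:
        # If validation raised, the original is untouched and the .tmp is removed.
        tmp.unlink(missing_ok=True)
```

Because the original file is only ever replaced in one step, a crash or a failed validation leaves either the old file or the new one on disk, never a half-written mix.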

🔴 Lessons Learned (from first enrichment session, 25 Apr 2026)

AZ-900 had bugs. AZ-140 was perfect. Root causes and fixes:

Bug	What happened	Fix in prompt
305 questions instead of 250	Model added 105 new instead of 50	Distribution formula enforced + count verification blocks commit
55 short scenarios (<200 chars)	Early questions not fully enriched	Explicit: process EVERY question, no half-enrichment
163 short options (<50 chars)	Option enrichment skipped for some	EVERY option must be 80-150 chars, no exceptions
Count check skipped	Model committed without verifying total	Hard exit-code-1 blocker script added

AZ-140 (second cert in the same session) was perfect, which suggests, on the evidence of a single session, that the model improves after the first cert. Consider starting each batch with the less important cert.
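
The count check that was skipped on AZ-900 amounts to very little code. A minimal sketch, assuming per-file question counts are already available as a dict (`verify_count` is a hypothetical name, not the actual blocker script):

```python
def verify_count(counts_per_file: dict[str, int], expected: int = 250) -> int:
    """Return a process exit code: 0 if the cert totals `expected`, else 1."""
    total = sum(counts_per_file.values())
    if total != expected:
        print(f"FATAL: {total} questions, expected {expected}. Commit blocked.")
        return 1
    return 0
```

A wrapper script would end with `sys.exit(verify_count(counts))`, so a total of 305 (200 enriched plus 105 new, as happened on AZ-900) produces exit code 1 and stops the commit step.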

Vs Competitors

Platform	Questions per cert	Hard scenarios	Detailed "why wrong"	Price
MeasureUp	100-150	❌	⚬	$99/year
Whizlabs	150-200	❌	✅	$20-40
ExamTopics	200+ (crowdsourced)	❌	❌	Free + ads
Guided (after)	250	✅ 50 case studies	✅ Every question	$14-49

How to Run

Starter files (in `C:\ssClawy\guided\files\`):

File	Purpose
`question-enrichment-starter.md`	Paste into enrichment session — full pipeline instructions
`enrichment-monitor.md`	Paste into monitoring session — progress dashboard

Your workflow:

Terminal 1 — Enrichment (does the work):

1. Open Copilot CLI → /model → Claude Opus 4.6 (1M)
2. Paste contents of question-enrichment-starter.md
3. Say: "next batch"
4. Wait for: "✅ Batch complete. Please start a new session."
5. Repeat from step 1
6. Done when it says: "🎉 ALL CERTS ENRICHED!"

Terminal 2 — Monitor (check anytime):

1. Open separate Copilot CLI
2. Paste contents of enrichment-monitor.md
3. Say "check progress" to see dashboard:

╔══════════════════════════════════════════╗
║  GUIDED ENRICHMENT PROGRESS              ║
║  Enriched:  14 / 124  (11%)             ║
║  Remaining: 110                          ║
║  █████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ ║
╚══════════════════════════════════════════╝
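
The numbers in the dashboard follow from the done count alone. A sketch of the arithmetic, assuming a 40-cell bar and ordinary rounding; `render_progress` is a hypothetical function, and in practice the done count would come from `.enrichment-log.json`, whose schema is not shown here.

```python
def render_progress(done: int, total: int = 124, width: int = 40) -> str:
    """Render the enriched/remaining counts and a fixed-width progress bar."""
    filled = round(width * done / total)  # bar cells to fill
    pct = round(100 * done / total)       # whole-number percentage
    bar = "█" * filled + "░" * (width - filled)
    return (
        f"Enriched:  {done} / {total}  ({pct}%)\n"
        f"Remaining: {total - done}\n"
        f"{bar}"
    )
```

With 14 of 124 certs done this yields 11%, 110 remaining, and 5 filled cells out of 40, matching the sample dashboard above.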

Key numbers:

Metric	Value
Total certs	All 124 (no exceptions)
Certs per session	2 (context window limit)
Sessions needed	~62
Your effort per session	~10 seconds (paste + "next batch")
Total human effort	~10 minutes
Questions per cert after	250 (200 enriched + 50 new hard)
Total questions after	31,000
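
The figures in this table all derive from four inputs, which a short calculation makes explicit. The function and parameter names are illustrative; the values are the ones stated in the table.

```python
import math


def project_totals(certs: int = 124, existing: int = 200, new_hard: int = 50,
                   certs_per_session: int = 2, seconds_per_session: int = 10) -> dict:
    """Derive the headline numbers from the per-cert and per-session figures."""
    sessions = math.ceil(certs / certs_per_session)
    return {
        "sessions": sessions,
        "enriched_questions": certs * existing,
        "new_hard_questions": certs * new_hard,
        "total_questions": certs * (existing + new_hard),
        "human_minutes": sessions * seconds_per_session / 60,
    }
```

With the defaults this gives 62 sessions, 24,800 enriched plus 6,200 new questions for 31,000 in total, and a little over 10 minutes of human effort.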

Success Criteria

Metric	Before	After
Questions per cert	200	250
Total questions	24,800	31,000
Avg scenario length	258 chars	400+ chars
Avg option length	66 chars	100+ chars
Avg explanation length	359 chars	500+ chars
Hard scenario questions	0	6,200
User perception	"Too easy"	"Feels like the real exam"