Architecture Gate + Data-First Sequence — combined playbook

The deep version of soul-file Rules #5 (Co-founder Architecture Gate) and #6 (Data-First Sequence). These two rules pair together — Rule #5 governs the design phase, Rule #6 governs the implementation phase. Skipping either is what produced the PAC and atlas-portfolio failure arcs.

Why this lives in the annex: the soul-file entries (Rules #5 and #6 in ~/.copilot/copilot-instructions.md) are intentionally compact so they reload fast every session. This playbook is read on demand when:

  • Atlas catches itself about to skip either rule and needs the full pattern
  • Atlas is starting a new tool / dashboard / report / scrape-driven build
  • Sush asks "what made the last big build actually work?"

Last updated 2026-05-25 after the atlas-portfolio v0.4.4-v0.4.7 ship.


The two-rule pairing

| Rule | Phase | What it forces | Failure if skipped |
|---|---|---|---|
| 🚨 #5 Architecture Gate | Design | Surface ≥2 alternative paths out loud with trade-offs, pick one with reasoning, before keyboard moves | Building on a flawed architecture: every "fix" makes the design more entrenched |
| 🚨 #6 Data-First Sequence | Implementation | Probe → Scrape → Load → Render. Never start at render. | Building loaders/renderers against assumed data shapes; rework cascades downstream |

These rules are independent triggers but they reinforce each other. A correct architecture decision can still be poorly implemented (skipped Rule #6 = days of shape-mismatch debugging). A correct data-first implementation can still be wasted on a flawed architecture (skipped Rule #5 = building the right code for the wrong design).


The atlas-portfolio case study (May 22-25 2026)

The 4-day failure pattern (v0.4.0 → v0.4.3)

Goal: enrich the customer dashboard with more Lynx Copilot data per account.

| Version | What we tried | Why it failed | Root cause skipped |
|---|---|---|---|
| v0.4.0 | Added more KPI tiles to the existing popup detail panel | Popup couldn't fit the data; broke when account had many opps | Rule #5: never asked "is the popup the right surface for this volume?" |
| v0.4.1 | Patched MACC/ACR scope leak that surfaced during v0.4.0 build | Symptom not cause: the scope filter was loose because data-fetch was loose | Rule #6: built the filter before probing what categories the data actually contains |
| v0.4.2 | Lynx as primary Copilot data source: per-tenant scrape + UI | Worked for 22 of 52 tenants; matcher matched wrong tenants for the rest | Rule #6: wrote tenant matcher before probing how Lynx exposes tenant names (display vs MSX vs ID) |
| v0.4.3 | Patch the matcher; preserve-existing-data guard | Got us to 42 of 54, but still couldn't push past that, and the popup architecture was still constraining | Both rules: neither stopped to ask "is THIS architecture working?" |

Each version seemed just "one more fix" away from working, and each kept the load-bearing decisions invisible. The end-of-session screen looked nearly identical to the start-of-session screen for three consecutive nights, a textbook Rule #5 smell signal.
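The "22 of 52" and "42 of 54" numbers only became visible after the fact. A minimal sketch of how a matcher could report its own coverage instead of guessing is below. It assumes both data sets share some identifier that a probe has already confirmed; the `matchCoverage` helper, the `tenantId` field, and the sample records are all hypothetical and not taken from the atlas-portfolio code.

```javascript
// Hypothetical helper: `key` is whichever identifier a probe has shown
// both data sets share (for example a tenant ID rather than a display name).
function matchCoverage(accounts, tenants, key) {
  // Index tenants by the shared key, ignoring records that lack it.
  const byKey = new Map(
    tenants.filter((t) => t[key] != null).map((t) => [t[key], t])
  );
  const matched = [];
  const unmatched = [];
  for (const account of accounts) {
    const tenant = account[key] != null ? byKey.get(account[key]) : undefined;
    if (tenant) matched.push({ account, tenant });
    else unmatched.push(account); // never guess: leave it visibly unmatched
  }
  return {
    matched,
    unmatched,
    summary: `${matched.length} of ${accounts.length} matched`,
  };
}

// Illustrative data only; field names are assumptions.
const coverage = matchCoverage(
  [{ tenantId: "t-1" }, { tenantId: "t-2" }, { name: "No ID" }],
  [{ tenantId: "t-1" }, { tenantId: "t-9" }],
  "tenantId"
);
console.log(coverage.summary);
```

Printing the summary at the end of every scrape run would turn a silent partial match into something that shows up in the session output.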

The 80-minute success (v0.4.4 → v0.4.7)

Same goal (more Lynx data per account, with three new specific asks from Sush).

Step 1 — Rule #5 fires first. Before touching any code, Atlas wrote a comparison table in chat:

| Path | Approach | Speed | Reliability | Effort | Long-term cost | Verdict |
|---|---|---|---|---|---|---|
| A · Hash-route fullscreen panel | #/account/{id} reads URL hash → renders existing detail body as a full viewport view in same HTML | ~1 hr | high | low | one giant HTML, no real page semantics | OK for fastest pivot |
| B · Per-account static HTML files | dist/account/{tpid}.html per customer; sidebar with all customers for cross-nav; cards link via <a href> | ~3 hr | high | medium | clean: real URLs, real back-button, browser-tab parallel, bookmarkable per account | RECOMMENDED |
| C · SPA pushState router | Single page with client-side router replacing the view | ~2 hr | medium | medium | over-engineered for static-HTML DNA | reject |
| D · Hybrid (popup + open-as-page) | Keep popup, add "open as page" link | ~2 hr | high | medium | two paths for same view, duplicated logic | reject |

Sush picked B. The next 80 minutes were execution against a clear shape, not exploration.
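The comparison table was written by hand in chat, but the "≥2 alternatives" requirement is simple enough to sketch as code. The `renderComparisonTable` helper and its column names below are hypothetical; the sketch only illustrates that a decision table with fewer than two paths is rejected before anything gets built.

```javascript
// Hypothetical helper: refuses to render a decision table with fewer than
// two candidate paths, mirroring the ">=2 alternatives" requirement of Rule #5.
function renderComparisonTable(paths) {
  if (!Array.isArray(paths) || paths.length < 2) {
    throw new Error("Rule #5: surface at least 2 alternative paths before picking one");
  }
  const columns = ["id", "approach", "speed", "reliability", "effort", "longTermCost", "verdict"];
  const header = columns.join(" | ");
  // A missing cell renders as "?" so gaps in the analysis stay visible.
  const rows = paths.map((p) => columns.map((c) => p[c] ?? "?").join(" | "));
  return [header, ...rows].join("\n");
}

const table = renderComparisonTable([
  { id: "A", approach: "Hash-route fullscreen panel", speed: "~1 hr", reliability: "high",
    effort: "low", longTermCost: "one giant HTML", verdict: "OK for fastest pivot" },
  { id: "B", approach: "Per-account static HTML files", speed: "~3 hr", reliability: "high",
    effort: "medium", longTermCost: "clean: real URLs", verdict: "RECOMMENDED" },
]);
console.log(table);
```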

Step 2 — Rule #6 fires before scraper code. Three probe scripts in 20 minutes:

scripts/lynx-tenant-deep-probe.mjs        # 5 min — discovered only 3 of 14 guessed sub-paths actually exist
scripts/lynx-deep-licenses-probe.mjs       # 10 min — found /reports/your-tenants has 19 cols (paid Chat + free Chat + sentiment)
scripts/lynx-scroll-container-probe.mjs    # 5 min — found the virtualized scroll container needed for 52 rows

After 20 minutes of probing, we knew:

  • Lynx exposes per-tenant App Usage, Licenses, and Sentiment pages, and nothing else useful
  • The portfolio rollup goldmine is /reports/your-tenants (52 rows × 19 columns)
  • The home page tile has M365 Core App WAU (the seat-count proxy for sales whitespace)
  • The admin-settings cohort defaults to S500 tenants but has a tenant search box for per-tenant queries

Each scraper, loader, and renderer was then written against shapes already proven to exist. Zero rework.
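One way to keep a scraper honest against a proven shape is to compare the headers it sees at run time with the headers the probe saved. The sketch below assumes the probe output includes a header list; `diffHeaders`, `assertProvenShape`, and the three column names are hypothetical stand-ins, not the real Lynx columns.

```javascript
// Hypothetical check: compare headers scraped now with the header list
// recorded by an earlier probe, and report any drift in either direction.
function diffHeaders(provenHeaders, scrapedHeaders) {
  const provenSet = new Set(provenHeaders);
  const scrapedSet = new Set(scrapedHeaders);
  return {
    missing: provenHeaders.filter((h) => !scrapedSet.has(h)),
    unexpected: scrapedHeaders.filter((h) => !provenSet.has(h)),
  };
}

// Fail loudly instead of letting a shape change reach the loader or renderer.
function assertProvenShape(provenHeaders, scrapedHeaders) {
  const { missing, unexpected } = diffHeaders(provenHeaders, scrapedHeaders);
  if (missing.length || unexpected.length) {
    throw new Error(`Shape drift: missing=[${missing}] unexpected=[${unexpected}]`);
  }
}

// Illustrative column names only.
const proven = ["Tenant", "Paid Chat", "Free Chat"];
const drift = diffHeaders(proven, ["Tenant", "Paid Chat", "Sentiment"]);
console.log(drift);
```

Calling `assertProvenShape` at the top of a scraper means a source-side change fails at the scrape step rather than surfacing as a rendering bug.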

Result: 4 tagged releases (v0.4.4 → v0.4.7) in one ~80min session. All green. All shipping what Sush asked for plus extras (SVG charts, top portfolio signals, recommendation engine, refresh orchestrator).


The probe-script pattern

Every data-driven build benefits from a probe-script step. The shape:

What a probe script looks like

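// scripts/<source>-<aspect>-probe.mjs
// 5-15 min throwaway. Hits the REAL source. Writes a sample to session-state/<sid>/files/.
import { mkdir, writeFile } from "node:fs/promises";
import { chromium } from "playwright"; // assumes the `playwright` npm package

// Placeholders: fill these in per probe.
const REAL_URL_NOT_A_FIXTURE = "<real source URL>";
const SESSION_FILES = "session-state/<sid>/files";
const KEYWORDS = [/* whatever you expect to find */];

async function main() {
  // Attach to an already-running browser that exposes CDP on port 9222.
  const browser = await chromium.connectOverCDP("http://127.0.0.1:9222");
  const page = await browser.contexts()[0].newPage();
  await page.goto(REAL_URL_NOT_A_FIXTURE);
  await page.waitForTimeout(8000); // crude settle time; acceptable for a throwaway probe

  const data = await page.evaluate((keywords) => ({
    title: document.title,
    visibleTables: [...document.querySelectorAll("table, [role='grid']")].map(t => ({
      headers: [...t.querySelectorAll("th")].map(h => h.textContent?.trim()),
      sampleRows: [...t.querySelectorAll("tr")].slice(0, 5).map(tr => [...tr.querySelectorAll("td")].map(c => c.textContent?.trim())),
    })),
    // Which of the expected keywords actually appear on the page.
    interestingKeywords: keywords.filter(k => document.body.textContent?.includes(k)),
    bodyHead: document.body.textContent?.slice(0, 1000),
  }), KEYWORDS);

  await mkdir(SESSION_FILES, { recursive: true });
  await writeFile(`${SESSION_FILES}/probe-output.json`, JSON.stringify(data, null, 2));
  console.log("DONE", SESSION_FILES);

  await page.close();
  await browser.close(); // disconnects from the CDP session so the script can exit
}

main().catch(err => { console.error(err); process.exit(1); });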

What it produces

Real JSON in session-state/<sid>/files/<name>-probe.json showing what the source actually returns — table headers, sample rows, KPI tiles, ARIA labels, sub-paths discovered, scroll containers identified. Atlas inspects this output BEFORE writing any scraper/loader/renderer code.

Probe-script rules

  1. Hit the REAL source. No fixtures, no mocks (would violate Standing Rule #1 anyway).
  2. Throwaway. Probe scripts are scaffolding — keep them in scripts/ but never depend on them in the build pipeline. The scraper that ships should be its own clean file.
  3. Write output to disk. JSON.stringify(data, null, 2) to a session-state file. The output is the artefact, not the script.
  4. Log a one-line summary. Probe output is for Atlas inspection — print enough console output that Atlas can read it in 30 seconds.
  5. Don't over-engineer the probe. It's meant to die after one use. Hardcoded selectors, hardcoded URLs, hardcoded sample size — all fine.
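Rule 4 above asks for a one-line summary. A minimal sketch of one is below, assuming the probe output has the `title` and `visibleTables` fields used in the probe script; `summarizeProbe` is a hypothetical helper and the sample object is made up for illustration.

```javascript
// Hypothetical helper: condense probe output into one line that can be
// read at a glance before opening the full JSON file.
function summarizeProbe(data) {
  const tables = data.visibleTables ?? [];
  // The widest header row hints at which table is the main data grid.
  const widest = tables.reduce(
    (max, t) => Math.max(max, (t.headers ?? []).length),
    0
  );
  return `title="${data.title}" tables=${tables.length} widestHeaderCount=${widest}`;
}

// Illustrative probe output only.
const summary = summarizeProbe({
  title: "Your tenants",
  visibleTables: [
    { headers: ["Tenant", "Paid Chat"], sampleRows: [] },
    { headers: ["Tenant", "Paid Chat", "Free Chat"], sampleRows: [] },
  ],
});
console.log(summary);
```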

When to write multiple probes

If a single probe surfaces "this is more complex than I thought," split into multiple probes rather than trying to handle every case in one. atlas-portfolio used 5 probes total in 25 minutes:

  • 2 sub-path discovery probes
  • 1 scroll-container probe
  • 1 admin-settings tenant search probe
  • 1 seat-count location probe

Each probe answered ONE question. Each took 10 minutes or less. Together they unlocked the entire scraper architecture.
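As an illustration of a single-question probe, the sketch below answers "which element is the scroll container?". It is not the logic of the actual scroll-container probe, which this playbook doesn't show. It assumes overflow can be spotted by comparing `scrollHeight` with `clientHeight`, and `findScrollCandidates` is a hypothetical name. In a page context the input would be something like `[...document.querySelectorAll("*")]`; plain objects stand in for elements here.

```javascript
// Hypothetical helper: rank elements by how much content overflows their
// visible height. Elements whose content is merely clipped or overflowing
// also qualify, so the result is a candidate list, not a final answer.
function findScrollCandidates(elements, minOverflowPx = 1) {
  return elements
    .map((el) => ({ el, overflowPx: el.scrollHeight - el.clientHeight }))
    .filter((c) => c.overflowPx >= minOverflowPx)
    .sort((a, b) => b.overflowPx - a.overflowPx);
}

// Plain objects standing in for DOM elements; values are made up.
const candidates = findScrollCandidates([
  { id: "header", scrollHeight: 60, clientHeight: 60 },
  { id: "grid", scrollHeight: 2600, clientHeight: 400 },
  { id: "sidebar", scrollHeight: 900, clientHeight: 800 },
]);
console.log(candidates.map((c) => `${c.el.id}:${c.overflowPx}`).join(", "));
```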


How to recognise the rules should fire (and you're skipping them)

Rule #5 (Architecture Gate) smell signals

  • "Just one more fix and it'll work" said twice in the same week
  • A workaround being built to mask a symptom
  • The same component failing a different way each session
  • End-of-session screen looks the same as start-of-session, twice in a row
  • A pivot decided in a single session that closes off a path you hadn't enumerated
  • Sush proposes a path and Atlas accepts it without surfacing alternatives, even when Atlas thinks the proposed path is fine. The rule isn't about disagreement; it's about making the trade-offs visible so Sush can make an informed pick.

Rule #6 (Data-First) smell signals

  • Renderer code with lots of c.x?.y?.z ?? '—' defensive chains
  • "It works on the test data but breaks on the real data" mismatches
  • More than 3 mock / fallback paths in a single render function
  • Loader functions that branch on multiple possible shapes of the same field
  • A debugging session that starts with console.log on the data before realising the field name is wrong
  • A "v0.x.1 fix" that turns out to be a data-shape correction, not a real bug fix
  • Sush asks "is the data in the dashboard real?" — the question itself is a signal the build skipped a probe step
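The first signal in this list can be roughly quantified. The sketch below counts optional-chaining and nullish-fallback operators in a snippet of renderer source. It is a crude text heuristic under the assumption that a high count suggests the renderer is guessing at shapes; `countDefensiveOperators` is hypothetical, and it would also count operators inside strings or comments.

```javascript
// Hypothetical heuristic: count `?.` and `??` occurrences in source text.
function countDefensiveOperators(source) {
  // `?.` not followed by a digit, so a ternary like `a?.5:1` is not counted.
  const optionalChains = (source.match(/\?\.(?!\d)/g) ?? []).length;
  const nullishFallbacks = (source.match(/\?\?/g) ?? []).length;
  return {
    optionalChains,
    nullishFallbacks,
    total: optionalChains + nullishFallbacks,
  };
}

const smell = countDefensiveOperators("cell.textContent = c.x?.y?.z ?? 'n/a';");
console.log(smell);
```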

Authority clause (for Atlas to override execution momentum)

Both rules are HARD RULES. Sush has explicitly authorised Atlas to interrupt execution when either rule is being skipped — even in mid-flow, even when Atlas itself has been moving fast. The interruption format:

🚨 Rule #5 / #6 check: I'm noticing I'm about to [skip alternatives / start rendering without probing]. Pausing to [surface the paths / write the probe] before continuing.

This is not asking permission. It's announcing the gate fire and then doing the work the rule requires. Sush can override the pause if he's already comfortable with the decision (express bypass), but the default is to honour the rule.

The reason for this authority: every previous failure of these rules involved Atlas + Sush BOTH being too close to the work to see the smell signals. The rules exist precisely to fire when nobody is watching for them.


Checklist for the next big build

When starting a new tool / dashboard / report:

  1. ☐ State the goal in one sentence, out loud.
  2. ☐ List the load-bearing decisions (auth, data source, framework, runtime, routing). For each, are there ≥2 reasonable alternatives?
  3. ☐ Write the comparison table (Rule #5).
  4. ☐ Wait for Sush's pick (or express disagreement and propose a different path).
  5. ☐ List the data sources the build will touch.
  6. ☐ Write one probe script per source (Rule #6). Run each. Inspect output.
  7. ☐ Document what each source's real shape is (in scratch notes or session-state files).
  8. ☐ Write the scraper against the proven shape.
  9. ☐ Write the loader that normalises the scraped shape into a clean structure.
  10. ☐ Write the renderer last, against the clean structure.
  11. ☐ Smoke test against real data (NOT synthetic — Standing Rule #1).
  12. ☐ Ship in small tagged releases (v0.x.1, v0.x.2 ...) rather than one big drop.

The 4 days that didn't work skipped steps 2-4 and steps 5-7. The 80 minutes that did work honoured all 12.
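Steps 9 and 10 can be sketched as a pair of small functions: a loader that absorbs all the messiness of the scraped row, and a renderer that only sees the clean structure. The names `loadTenantRow`, `toNumber`, and `renderTenantLine`, and the "Tenant" and "Paid Chat" headers, are assumptions for illustration rather than the real atlas-portfolio code or Lynx column names.

```javascript
// Parse a scraped cell such as "1,234" into a number, or null if it is empty
// or not numeric.
function toNumber(text) {
  const cleaned = String(text).replace(/,/g, "").trim();
  if (cleaned === "") return null;
  const n = Number(cleaned);
  return Number.isFinite(n) ? n : null;
}

// Loader: turn one scraped row (cell strings in probe-proven header order)
// into a clean object. All shape handling lives here.
function loadTenantRow(headers, cells) {
  const raw = Object.fromEntries(headers.map((h, i) => [h, cells[i] ?? ""]));
  return {
    tenant: String(raw["Tenant"] ?? "").trim(),
    paidChatUsers: toNumber(raw["Paid Chat"] ?? ""),
  };
}

// Renderer: written last, against the clean structure only.
function renderTenantLine(t) {
  const users = t.paidChatUsers === null ? "n/a" : t.paidChatUsers;
  return `${t.tenant}: ${users} paid Chat users`;
}

const headers = ["Tenant", "Paid Chat"]; // hypothetical header names
const line = renderTenantLine(loadTenantRow(headers, [" Contoso ", "1,234"]));
console.log(line);
```

Because the loader owns every fallback, the renderer needs no defensive chains of its own.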


Cross-references

  • Soul file Rule #5 (compact version): ~/.copilot/copilot-instructions.md § 🚨 #5 RULE
  • Soul file Rule #6 (compact version): ~/.copilot/copilot-instructions.md § 🚨 #6 RULE
  • atlas-portfolio BUILD-LOG: C:\ssClawy\atlas-portfolio\BUILD-LOG.md (Phase 10-13 entries)
  • Session journal entry: ~/.copilot/session-journal.md (24 May 2026 overnight)
  • Memory system architecture (where these rules live): learning-docs/docs/reference/memory-system-architecture.md