atlas-portfolio: data pipeline playbook
The source of truth for what gets scraped, in what order, into what file, and how to verify the result. Read this before any atlas-portfolio refresh, scrape, or "why is the dashboard showing X?" investigation. Update this file whenever a new scraper, data source, loader, or QA check is added.
Repo: C:\ssClawy\atlas-portfolio\ · Set: 2026-05-25 · Companion: atlas-portfolio-architecture.md (the why); this is the how.
TL;DR: the weekly refresh in one block
# 0. Prereqs
pwsh C:\ssClawy\atlas-portfolio\scripts\launch-edge-cdp.ps1 # if Edge isn't already on :9222
# Manual: open https://lynx.office.net/ and https://msxinsights.microsoft.com/ in that Edge window
# Manual: `az login` if last refresh was >7 days ago
# 1. One-command full refresh (chains MSX → Lynx scrapes → build)
cd C:\ssClawy\atlas-portfolio
node scripts/refresh-full.mjs --verbose
# 2. Post-refresh QA (mandatory — don't ship to Sush without this)
node scripts/lynx-verify-dashboard.mjs # cards render, admin posture present
node scripts/screenshot-v044-account.mjs # per-account chips: 5 on / 1 partial / 6 off / 3 neutral for ASB
node scripts/smoke-e2e.mjs # all 7 tabs, no console errors, 54 account pages
# 3. Open the dashboard
start dist/index.html
If ANY of those reports an error, DO NOT message Sush "refresh done" — fix or surface first. See § Failure modes.
1. Source-of-truth map
Every piece of data in the dashboard comes from one of these portals. The table below is the canonical inventory; if you add a new scrape, add a new row here in the same PR.
1.1 MSX (Dataverse, via msx-mcp stdio subprocess)
| Source | URL / API | Script | Output file | Consumers |
|---|---|---|---|---|
| Customer list (54 accounts) | msx-mcp accountsList | scripts/generate.js | data/snapshot-customers-<date>.json → customers[] | render.js + every loader/renderer |
| Opportunities per customer | msx-mcp opportunitiesByAccount | scripts/generate.js | same → customers[].opps[] | Opportunities tab + per-account opp tables |
| Account team / contacts | msx-mcp teamByAccount / contactsByAccount | scripts/generate.js | same → customers[].team[], .contacts[] | per-account team + contacts blocks |
| Partners | msx-mcp partnersByAccount | scripts/generate.js | same → customers[].partners[] | per-account partners block |
| MSXI Copilot PRU / Chat MAU / AVD-W365 | msxi-mcp via AIBS (--c360 flag only) | scripts/generate.js --c360 | same → customers[].c360.copilot.{pru,chatMau,avdW365} | Customer 360 cards (when populated; note: snapshots generated without --c360 show "no data" on these tiles) |
| Agreements + geography + industry | msx-mcp agreementsByAccount + ai-sales-kit (--c360 flag) | scripts/generate.js --c360 | same → customers[].c360.{agreements,geography,industry} | per-account header + renewals callout |
1.2 Lynx (Microsoft internal Copilot telemetry portal)
| Source | URL | Script | Output file | Consumers |
|---|---|---|---|---|
| Home pinned-tenants list (the 52 customers Sush has pinned) | https://lynx.office.net/ | scripts/lynx-home-deep-scrape.mjs | session-state file (one-off) → feeds lynx-tenant-map-v3.mjs | matcher only; not weekly |
| MSX↔Lynx alias map (handles cases like "Housing NZ Corp" → "KĀINGA ORA") | n/a (manual + matcher) | scripts/lynx-tenant-map-v3.mjs | data/lynx-tenant-map.json | every Lynx scraper + lynx-loader-v2 |
| Portfolio rollup: /reports/your-tenants (19 cols: licenses, MAU, paid/free Chat, sentiment, web-search status) | https://lynx.office.net/reports/your-tenants | scripts/lynx-portfolio-rollup-scrape-v2.mjs | data/lynx-snapshots/<date>/portfolio-rollup.json | lynx-loader-v2 → lynx.rollup → per-account "Paid Copilot Adoption" + dashboard funnel |
| Per-tenant usage page (per-app MAU/WAU/DAU breakdown) | https://lynx.office.net/tenant/<guid>/m365_copilot-app_usage | scripts/lynx-tenant-scrape.mjs | data/lynx-snapshots/<date>/<tpid>.json (per tenant) | lynx-loader-v2 → lynx.{headlineKpis,apps} → C360 card + per-account per-app table |
| Admin Settings via cohort report search (legacy path, 52 tenants) | https://lynx.office.net/reports/copilot-settings-cohort?tab=cohort (filter by GUID via search input) | scripts/lynx-admin-per-tenant-scrape.mjs | data/lynx-snapshots/<date>/admin-settings-per-tenant.json | lynx-loader-v2 legacy fallback → lynx.adminSettings |
| Admin Configs tab (NEW v1.0.1 path, per-tenant page tab click) | https://lynx.office.net/tenant/<guid>/m365_copilot-app_usage → click "Admin Configs" tab → click "All" view | scripts/lynx-admin-tab-scrape.mjs | merged into data/lynx-snapshots/<date>/<tpid>.json as adminConfigs.values | lynx-loader-v2 preferred path → lynx.adminSettings → per-account admin chips |
| Usage Overview tab (NEW v1.1.0 path, per-tenant M365 base MAU) | https://lynx.office.net/tenant/<guid>/usage-overview?workloads=["OfficeClient"] (direct nav; no click needed) | scripts/lynx-usage-overview-scrape.mjs | merged into data/lynx-snapshots/<date>/<tpid>.json as usageOverview.values | lynx-loader-v2 → lynx.m365CoreAppMau + lynx.m365CopilotAppMau → penetration % for 51/52 tenants (vs 18 from home-rollup WAU). Per-app MAU for Word/Excel/PPT/Outlook/Teams/SharePoint/OneDrive/Exchange/M365 Copilot App also captured |
1.3 What's NOT scraped (deliberate)
- No ACR / MACC / Azure consumption — out of scope (Standing Rule #3)
- No public Microsoft 365 admin center — Lynx is the authorised internal source
- No Yammer / Viva / Defender / Purview — out of scope until SC-500 era
2. Execution order + dependency DAG
┌─ az login (manual, weekly)
└─ Edge :9222 + Lynx signed-in + MSXI signed-in (manual, daily/per-session)
│
▼
┌──────────────────────────────────────────────────────────────────────────────────┐
│ STEP 1 generate.js (--c360 if Edge up) │
│ ─── MSX customer/opp/team/contacts + (optional) MSXI PRU/AVD/agreements │
│ ─── ~40s to ~6 min depending on --c360 flag │
│ ─── Output: data/snapshot-customers-<date>.json │
└──────────────────────────────────────────────────────────────────────────────────┘
│
┌──────────────────────────┼──────────────────────────────────────────┐
▼ ▼ ▼
┌─────────────────────┐ ┌─────────────────────────────┐ ┌────────────────────────────────────────────┐
│ STEP 2 │ │ STEP 3 │ │ STEP 4 (NEW v1.0.1) │
│ lynx-tenant-scrape │ │ lynx-portfolio-rollup-v2 │ │ lynx-admin-tab-scrape OR (legacy) │
│ per-tenant usage │ │ /reports/your-tenants rollup│ │ lynx-admin-per-tenant-scrape │
│ ~20s × 52 = ~18 min │ │ ~3 min │ │ ~22s × 52 = ~22 min (with cooldowns) │
└─────────────────────┘ └─────────────────────────────┘ └────────────────────────────────────────────┘
│ │ │
└──────────────────────────┴──────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────────────────────┐
│ STEP 5 build.mjs │
│ ─── re-render dist/index.html + dist/account/<tpid>.html × 54 │
│ ─── merge happens here via lynx-loader-v2.mergeLynxDataIntoSnapshot() │
│ ─── ~1s │
└──────────────────────────────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────────────────────┐
│ STEP 6 QA suite (see § 3) │
│ ─── lynx-verify-dashboard / smoke-e2e / screenshot-v044-account │
│ ─── ~30s │
└──────────────────────────────────────────────────────────────────────────────────┘
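The dependency graph above can also be written down as data. The following is a minimal sketch, assuming hypothetical step names that mirror the diagram; it is not taken from refresh-full.mjs and only illustrates the ordering constraint (Steps 2 to 4 need Step 1, the build needs all three scrapes, QA needs the build).

```javascript
// Hypothetical step names mirroring the diagram; not taken from refresh-full.mjs.
const deps = {
  generate: [],
  lynxTenant: ["generate"],
  lynxRollup: ["generate"],
  lynxAdmin: ["generate"],
  build: ["lynxTenant", "lynxRollup", "lynxAdmin"],
  qa: ["build"],
};

// Depth-first topological sort: a step is emitted only after all of its dependencies.
function executionOrder(graph) {
  const order = [];
  const state = new Map(); // step -> "visiting" | "done"
  const visit = (step) => {
    if (state.get(step) === "done") return;
    if (state.get(step) === "visiting") throw new Error(`cycle at ${step}`);
    if (!(step in graph)) throw new Error(`unknown step ${step}`);
    state.set(step, "visiting");
    for (const dep of graph[step]) visit(dep);
    state.set(step, "done");
    order.push(step);
  };
  Object.keys(graph).forEach(visit);
  return order;
}

console.log(executionOrder(deps).join(" -> "));
// generate -> lynxTenant -> lynxRollup -> lynxAdmin -> build -> qa
```

Keeping the graph as plain data means a new scraper only needs one extra entry, and a cycle or a typo in a dependency name fails loudly instead of silently reordering the refresh.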
2.1 What refresh-full.mjs orchestrates today (v1.0.2)
// scripts/refresh-full.mjs chain
1. generate-msx (Step 1, with --c360 if Edge up)
2. lynx-per-tenant (Step 2)
3. lynx-portfolio-rollup (Step 3)
4. lynx-admin-settings (Step 4a — legacy cohort-search path, kept as fallback)
5. lynx-admin-tab (Step 4b — v1.0.1+ canonical per-tenant tab path)
6. build-dashboard (Step 5)
7. refresh-qa --verbose (Step 6 — 18 sentinel checks, exits 1 on any fail)
v1.0.2 hardening:
- refresh-full.mjs now FAILS FAST when Edge CDP is unavailable AND --skip-lynx wasn't passed. Override with --allow-stale-lynx only when you know the prior Lynx snapshot is current enough.
- lynx-admin-tab-scrape.mjs exits 2 when errors > 0 OR successful < 48. Override with --allow-partial only for explicit known-degradation scenarios.
- All admin-chip classification (account page + dashboard card + detail panel) flows through src/admin-chip-shared.js — fix once, three surfaces stay in sync.
Not yet in refresh-full.mjs: none. As of v1.0.2 the chain is complete end-to-end.
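The exit rule for lynx-admin-tab-scrape.mjs described above can be sketched as a small pure function. This is an illustration under assumptions: the function and option names are invented here, and it assumes that --allow-partial turns the failure into exit 0, which this playbook does not state explicitly.

```javascript
// Sketch of the v1.0.2 exit rule; names are illustrative, not the script's real API.
// Assumption: --allow-partial downgrades the failure to exit code 0.
function adminTabExitCode({ errors, successful }, { allowPartial = false, minSuccessful = 48 } = {}) {
  const degraded = errors > 0 || successful < minSuccessful;
  return degraded && !allowPartial ? 2 : 0;
}

console.log(adminTabExitCode({ errors: 0, successful: 52 })); // 0
console.log(adminTabExitCode({ errors: 1, successful: 51 })); // 2
console.log(adminTabExitCode({ errors: 0, successful: 47 })); // 2
console.log(adminTabExitCode({ errors: 1, successful: 51 }, { allowPartial: true })); // 0
```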
2.2 One-off scripts (not weekly)
These run only when the underlying input changes:
| Trigger | Script | What it updates |
|---|---|---|
| Sush adds/removes customers from his MSX book | lynx-home-deep-scrape.mjs then lynx-tenant-map-v3.mjs | data/lynx-tenant-map.json |
| Sush adds new tenants to his Lynx home page pins | same as above | same |
| Lynx renames a tenant (e.g. ACC → ACC NZ) | edit ALIASES array in lynx-tenant-map-v3.mjs then re-run | data/lynx-tenant-map.json |
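To make the alias step concrete, here is a minimal sketch of how an MSX account name might be resolved to a Lynx tenant name. The shape of the ALIASES entries, the normalize helper, and resolveLynxName are all assumptions for illustration; the real matcher in lynx-tenant-map-v3.mjs may work differently.

```javascript
// Assumed shape: the real ALIASES array in lynx-tenant-map-v3.mjs may differ.
const ALIASES = [
  { msx: "Housing NZ Corp", lynx: "KĀINGA ORA" },
  { msx: "ACC", lynx: "ACC NZ" },
];

// Fold case, strip diacritics and collapse whitespace so "Kāinga Ora" matches "KAINGA ORA".
function normalize(name) {
  return name
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "")
    .toLowerCase()
    .replace(/\s+/g, " ")
    .trim();
}

// Explicit alias first, then an exact match on the normalized name; null when nothing matches.
function resolveLynxName(msxName, lynxNames, aliases = ALIASES) {
  const alias = aliases.find((a) => normalize(a.msx) === normalize(msxName));
  const wanted = normalize(alias ? alias.lynx : msxName);
  return lynxNames.find((n) => normalize(n) === wanted) ?? null;
}

const lynxNames = ["KĀINGA ORA", "ACC NZ", "ASB Bank"];
console.log(resolveLynxName("Housing NZ Corp", lynxNames)); // KĀINGA ORA
console.log(resolveLynxName("asb bank", lynxNames));        // ASB Bank
console.log(resolveLynxName("Unknown Ltd", lynxNames));     // null
```

Returning null rather than a best guess keeps unmatched customers visible, so a rename in Lynx shows up as a gap to fix in ALIASES instead of a wrong tenant silently attached to an account.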
3. QA + SME validation per scrape
Each scrape has a corresponding QA gate. Run them in this order; stop and investigate if any fail.
3.1 After STEP 1: generate.js
# A. File written, sensible size, expected customer count
$f = (Get-ChildItem data\snapshot-customers-*.json | Sort-Object Name -Descending)[0].FullName
$j = Get-Content $f -Raw | ConvertFrom-Json
# Expect: 50-60 customers (Sush's book grows ~1/qtr), 1200-1500 opps
"customerCount: $($j.customerCount) (expect 50-60)"
"totalOpps: $($j.summary.totalOpps) (expect 1200-1500)"
# B. Every customer has a TPID (without TPID, no Lynx match possible)
$noTpid = ($j.customers | Where-Object { -not $_.tpid }).Count
"customers without TPID: $noTpid (expect 0)"
# C. Sush's 3 SSPs are all detected
@("Tam Bagnall", "Riki Plester", "Ben Brown") | ForEach-Object {
    $ssp = $_   # capture the name: inside Where-Object, $_ is rebound to the customer
    $count = @($j.customers | Where-Object { (@($_.team) + @($_.opps.owner)) -contains $ssp }).Count
    "$ssp -> tagged on $count customers (expect 14-22)"
}
Red flags: customerCount drops below 50 (book purge?), totalOpps drops >20% week-over-week (filter bug?), any SSP unmatched (qualifier2 regex broke?).
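The same red flags can be checked programmatically. Below is a minimal sketch in Node; snapshotRedFlags is a hypothetical helper, the field names follow the PowerShell checks above, and the example numbers are made up.

```javascript
// Returns a list of red-flag strings; an empty list means the snapshot looks sane.
// Thresholds are the ones stated in this section (below 50 customers, >20% opp drop).
function snapshotRedFlags(current, previous) {
  const flags = [];
  if (current.customerCount < 50) {
    flags.push(`customerCount ${current.customerCount} is below 50`);
  }
  const missingTpid = current.customers.filter((c) => !c.tpid).length;
  if (missingTpid > 0) flags.push(`${missingTpid} customers without TPID`);
  const prevOpps = previous?.summary?.totalOpps;
  if (prevOpps > 0) {
    const drop = (prevOpps - current.summary.totalOpps) / prevOpps;
    if (drop > 0.2) flags.push(`totalOpps dropped ${(drop * 100).toFixed(1)}% week-over-week`);
  }
  return flags;
}

// Made-up numbers purely for illustration: one customer lacks a TPID and opps fell 23.1%.
const exampleFlags = snapshotRedFlags(
  { customerCount: 54, summary: { totalOpps: 1000 }, customers: [{ tpid: "1393693" }, { tpid: null }] },
  { summary: { totalOpps: 1300 } }
);
console.log(exampleFlags);
```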
3.2 After STEP 2: lynx-tenant-scrape.mjs
$d = "data\lynx-snapshots\$((Get-Date).ToString('yyyy-MM-dd'))"
# A. ≥50 per-tenant files, each >3KB
$files = Get-ChildItem $d -Filter "*.json" | Where-Object { $_.Name -match '^\d+\.json$' }
$good = ($files | Where-Object { $_.Length -gt 3000 }).Count
"per-tenant files: $($files.Count) total, $good >3KB (expect 50+/52)"
# B. A sentinel tenant has expected data (ASB Bank TPID 1393693)
$asb = Get-Content "$d\1393693.json" -Raw | ConvertFrom-Json
"ASB kpiTiles: $($asb.kpiTiles.Count) (expect >40)"
"ASB appRows: $($asb.appRows.Count) (expect 8-12)"
# The comparison must sit inside $(...), otherwise "-gt 0" is printed as literal text
"ASB has Copilot All Up: $(@($asb.appRows | Where-Object { $_.app -eq 'Copilot All Up' }).Count -gt 0) (expect True)"
Red flags: any tenant file more than 50% smaller than last week's (Lynx data dropped silently?), ASB's headline KPIs missing (scraper selector broke?), >5 errors in index.json (Lynx session degradation).
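The file-size red flag is easy to automate once the sizes are known. This is a minimal sketch under assumptions: shrunkenFiles is a hypothetical helper, the sizes are passed in as plain objects (collecting them from disk is left out), and the example values, including the TPID 1000001, are made up.

```javascript
// sizesNow / sizesPrev: { "<tpid>.json": sizeInBytes }.
// Flags files at or under the 3000-byte floor and files that lost more than half their size.
function shrunkenFiles(sizesNow, sizesPrev, maxShrink = 0.5, minBytes = 3000) {
  const problems = [];
  for (const [name, size] of Object.entries(sizesNow)) {
    const before = sizesPrev[name];
    if (size <= minBytes) {
      problems.push({ name, reason: "too small", size });
    } else if (before > 0 && (before - size) / before > maxShrink) {
      problems.push({ name, reason: "shrank", size, before });
    }
  }
  return problems;
}

// Made-up sizes purely for illustration.
const exampleProblems = shrunkenFiles(
  { "1393693.json": 12000, "1369572.json": 4000, "1000001.json": 2500 },
  { "1393693.json": 12500, "1369572.json": 9000 }
);
console.log(exampleProblems.map((p) => `${p.name}: ${p.reason}`));
```

A file with no counterpart in the previous week is only checked against the size floor, so a newly pinned tenant does not raise a false alarm.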
3.3 After STEP 3: lynx-portfolio-rollup-scrape-v2.mjs
$rollup = Get-Content "$d\portfolio-rollup.json" -Raw | ConvertFrom-Json
"yourTenants records: $($rollup.yourTenants.records.Count) (expect 50-55)"
"homeTopTenants records: $($rollup.homeTopTenants.records.Count) (expect 18-25)"
# Sentinel: ASB rollup row has Copilot Licenses + MAU
$asbRollup = $rollup.yourTenants.records | Where-Object { $_.tenantName -match 'ASB' }
"ASB Copilot Licenses: $($asbRollup.'Copilot Licenses') (expect ~5,000)"
"ASB Copilot Enabled MAU: $($asbRollup.'Copilot Enabled MAU') (expect ~2,000)"
Red flags: record count drops below 50 (virtualized-scroll regression — scraper missed pages), sentinel ASB MAU drops >30% (real signal worth investigating with Sush BEFORE shipping).
3.4 After STEP 4: admin scrape (either path)
# Legacy path output:
$adminLegacy = "$d\admin-settings-per-tenant.json"
if (Test-Path $adminLegacy) {
$a = Get-Content $adminLegacy -Raw | ConvertFrom-Json
"legacy admin records: $($a.records.Count) (expect 50-52)"
$asbAdmin = $a.records | Where-Object { $_.tenantGuid -eq '5cb9fead-91c6-4e06-b693-1a224ecb6412' }
"ASB Anthropic: '$($asbAdmin.'Anthropic as a Sub Processor')' (expect 'Enabled')"
"ASB Frontier: '$($asbAdmin.'Copilot Frontier')' (expect 'Disabled')"
}
# New v1.0.1 path output: lives INSIDE each per-tenant JSON
$asbNew = Get-Content "$d\1393693.json" -Raw | ConvertFrom-Json
if ($asbNew.adminConfigs) {
"new-path ASB Anthropic: '$($asbNew.adminConfigs.values.anthropic)' (expect 'Enabled')"
"new-path ASB Frontier: '$($asbNew.adminConfigs.values.frontier)' (expect 'Disabled')"
}
Sniff tests against known truths (refresh quarterly with Sush):
- ASB Anthropic = Enabled
- ASB Frontier = Disabled
- Spark NZ + One NZ + Ryman + Vista + Zespri + Z Energy → Frontier = Enabled
- ACC, Air NZ, IRD, MPI, MBIE, MoJ, NZ Police → Anthropic = Disabled
If any sentinel flips without an explanation in the Lynx changelog, flag to Sush before shipping.
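The known truths above lend themselves to a table-driven check. The sketch below is an illustration under assumptions: flippedSentinels is a hypothetical helper, and the input is keyed by tenant display name for readability, whereas the real snapshot files are keyed by TPID or tenant GUID.

```javascript
// Known truths copied from the list above. Keying by display name is an assumption.
const SENTINELS = [
  { tenants: ["ASB"], setting: "anthropic", expected: "Enabled" },
  { tenants: ["ASB"], setting: "frontier", expected: "Disabled" },
  { tenants: ["Spark NZ", "One NZ", "Ryman", "Vista", "Zespri", "Z Energy"], setting: "frontier", expected: "Enabled" },
  { tenants: ["ACC", "Air NZ", "IRD", "MPI", "MBIE", "MoJ", "NZ Police"], setting: "anthropic", expected: "Disabled" },
];

// valuesByTenant: { "<tenant name>": { anthropic: "...", frontier: "..." } }
// Returns every sentinel whose actual value differs from the expectation (missing counts as a flip).
function flippedSentinels(valuesByTenant, sentinels = SENTINELS) {
  const flips = [];
  for (const { tenants, setting, expected } of sentinels) {
    for (const tenant of tenants) {
      const actual = valuesByTenant[tenant]?.[setting] ?? "(missing)";
      if (actual !== expected) flips.push({ tenant, setting, expected, actual });
    }
  }
  return flips;
}

// Illustration: ASB's Frontier setting has flipped and every other tenant is absent from the input.
const exampleFlips = flippedSentinels({ ASB: { anthropic: "Enabled", frontier: "Enabled" } });
console.log(exampleFlips.filter((f) => f.tenant === "ASB"));
```

Treating a missing value as a flip is deliberate in this sketch: a tenant that silently drops out of the scrape should block the sign-off just like a changed setting.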
3.5 After STEP 5: build.mjs
# A. dist/index.html exists, sensible size
$idx = "dist\index.html"
"dist/index.html size: $([math]::Round((Get-Item $idx).Length / 1MB, 1)) MB (expect 5-7 MB)"
# B. dist/account/ has 54 pages
$accounts = Get-ChildItem dist\account -Filter "*.html" | Measure-Object
"per-account pages: $($accounts.Count) (expect 54)"
# C. Inline data embedded with Lynx
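$content = Get-Content $idx -Raw
# Count matches outside the output strings so the regex quotes need no escaping
$adminCount = ([regex]'"adminSettings"').Matches($content).Count
$c360Count = ([regex]'"c360":').Matches($content).Count
"adminSettings entries in embedded JSON: $adminCount (expect 50-52)"
"c360 entries: $c360Count (expect 54)"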
Red flags: index.html < 4 MB (data not embedded?), <54 account pages (renderer crashed mid-loop?), 0 adminSettings (loader broken?).
3.6 After STEP 6: full QA suite
# Run all three; ALL must pass with exit 0
node scripts/lynx-verify-dashboard.mjs # cards + admin posture
node scripts/screenshot-v044-account.mjs # 15 chips on ASB with expected distribution
node scripts/smoke-e2e.mjs # 7 tabs, 54 account pages, no console errors
Sentinel for ASB chip distribution (from v1.0.1): 5 on · 1 partial · 6 off · 0 low · 3 neutral. If this skews (e.g. 11 on / 0 off), chipKind regressed or admin data corrupted.
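A distribution check like this can be expressed as a small comparison. The following is a sketch, assuming a hypothetical chipDistributionDiff helper that receives the chip kinds as an array of strings; how screenshot-v044-account.mjs actually collects them is not covered here.

```javascript
// Expected ASB distribution from v1.0.1, as stated above.
const EXPECTED_ASB = { on: 5, partial: 1, off: 6, low: 0, neutral: 3 };

// chips: array of chip kinds such as ["on", "off", ...].
// Returns only the kinds whose count differs from the expectation; {} means a match.
function chipDistributionDiff(chips, expected = EXPECTED_ASB) {
  const counts = Object.fromEntries(Object.keys(expected).map((k) => [k, 0]));
  for (const kind of chips) counts[kind] = (counts[kind] ?? 0) + 1;
  const diff = {};
  for (const kind of Object.keys(counts)) {
    const want = expected[kind] ?? 0;
    if (counts[kind] !== want) diff[kind] = { expected: want, actual: counts[kind] };
  }
  return diff;
}

// The skewed case mentioned above: 11 on / 0 off.
const skewed = [...Array(11).fill("on"), "partial", "neutral", "neutral", "neutral"];
console.log(chipDistributionDiff(skewed));
// { on: { expected: 5, actual: 11 }, off: { expected: 6, actual: 0 } }
```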
4. New-data vs stale-data delta check
Weekly refresh is supposed to MOVE numbers, but big swings without explanation are usually scraper bugs. Compare the new snapshot against the previous one before shipping.
cd C:\ssClawy\atlas-portfolio
$today = (Get-ChildItem data\snapshot-customers-*.json | Sort-Object Name -Descending)[0]
$prev = (Get-ChildItem data\snapshot-customers-*.json | Sort-Object Name -Descending)[1]
$t = Get-Content $today.FullName -Raw | ConvertFrom-Json
$p = Get-Content $prev.FullName -Raw | ConvertFrom-Json
"customers: $($p.customerCount) -> $($t.customerCount) ($($t.customerCount - $p.customerCount))"
"open opps: $($p.summary.totalOpps) -> $($t.summary.totalOpps) ($($t.summary.totalOpps - $p.summary.totalOpps))"
"pipeline: `$$([math]::Round($p.summary.totalValue/1e6,1))M -> `$$([math]::Round($t.summary.totalValue/1e6,1))M"
For Lynx data, compare per-tenant MAU week-over-week and list the top 5 movers in both directions. A significant unexplained swing means either the scraper captured a different chart by accident or a real telemetry event occurred, which is worth flagging to Sush as a sales signal.
# Example pattern (write a proper script later; this is the gist)
# Lynx snapshots live in dated folders, so resolve the two most recent ones first
$dirs = Get-ChildItem data\lynx-snapshots -Directory | Sort-Object Name -Descending
$curDir = $dirs[0].FullName
$prevDir = $dirs[1].FullName
foreach ($tpid in @('1393693','1369572')) { # ASB, Auckland Council
    $cur = Get-Content "$curDir\$tpid.json" -Raw | ConvertFrom-Json
    $old = Get-Content "$prevDir\$tpid.json" -Raw | ConvertFrom-Json
    $curMau = ($cur.appRows | Where-Object { $_.app -eq 'Copilot All Up' }).cells[0]
    $oldMau = ($old.appRows | Where-Object { $_.app -eq 'Copilot All Up' }).cells[0]
    "$tpid Copilot-All-Up MAU: $oldMau -> $curMau"
}
Decision rule: any single tenant moving >40% week-over-week in MAU, license assignment, or admin posture → STOP, verify in Lynx directly before shipping.
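The decision rule can be sketched as a ranking of relative movers. This is a minimal illustration: topSwingers is a hypothetical helper, it takes MAU values already extracted into plain objects keyed by TPID, and the numbers in the example are made up.

```javascript
// mauPrev / mauNow: { "<tpid>": MAU }.
// Returns the largest relative movers in either direction; stop is true above the 40% rule.
function topSwingers(mauPrev, mauNow, limit = 5, stopThreshold = 0.4) {
  return Object.keys(mauNow)
    .filter((tpid) => mauPrev[tpid] > 0) // skip tenants with no usable baseline
    .map((tpid) => {
      const change = (mauNow[tpid] - mauPrev[tpid]) / mauPrev[tpid];
      return { tpid, prev: mauPrev[tpid], now: mauNow[tpid], change, stop: Math.abs(change) > stopThreshold };
    })
    .sort((a, b) => Math.abs(b.change) - Math.abs(a.change))
    .slice(0, limit);
}

// Made-up MAU values purely for illustration.
const swingers = topSwingers({ "1393693": 2000, "1369572": 1000 }, { "1393693": 2100, "1369572": 550 });
for (const s of swingers) {
  console.log(`${s.tpid}: ${s.prev} -> ${s.now} (${(s.change * 100).toFixed(0)}%)${s.stop ? " STOP" : ""}`);
}
```

Sorting by absolute change keeps drops and spikes in one list, which matches the "both directions" requirement; the stop flag marks the rows that need a manual look in Lynx before shipping.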
5. Failure-mode catalog
Symptoms surfaced by Atlas in past sessions, with the recovery procedure for each.
| Symptom | Likely cause | Fix |
|---|---|---|
| connectOverCDP: Timeout 30000ms | Edge :9222 down or Sush closed Edge | pwsh scripts/launch-edge-cdp.ps1 → wait for Edge → re-sign-in to Lynx + MSXI |
| Loading Lynx… screenshot, bodyLen < 1000 | Lynx SPA never bootstrapped; sustained scraping degraded the session | node scripts/lynx-session-health.mjs to confirm; full Edge restart + sign back in |
| admin-configs-tab-not-found for a single tenant | Transient Lynx SPA hiccup | Re-run scrape with $env:RETRY_FAILED=1; the filter re-attempts only tenants with missing adminConfigs.values |
| Node exit code 3221226505 (0xC0000409, Windows STATUS_STACK_BUFFER_OVERRUN / fail-fast abort, not an access violation) | Edge process unstable from sustained scrolling | Close ALL other Edge tabs, reduce scrollPasses constant in the offending scraper from 80 → 30, re-run |
| findLatestLynxSnapshotDir returns today's dir but it's empty | Cohort scrape ran in a new date folder before per-tenant scrape | Loader (v1.0.1+) auto-picks the most recent dir with ≥5 per-tenant files; if you're on an older loader, rename the stub dir |
| Dashboard renders c360-card-nolynx for all 54 | Snapshot generated without --c360 flag AND no Lynx data in latest snapshot dir | Use --c360 next generate, OR fall back: rename today's snapshot aside, build picks yesterday's (which still has c360); Lynx merge fills in admin/usage at render time |
| Admin chips classified wrong (e.g. "Not Assigned" → green/on) | chipKind() ordering bug | Fixed in v1.0.1. If it regresses, check partial is checked BEFORE enabled, \bnot\b BEFORE positive verbs |
| Compound values display as 9,528 of 9,528100% | prettyValue() formatter bypassed | Verify account-page.js prettyValue is called and uses the X/Y arithmetic verification |
| Per-tenant MAU drops >50% week-over-week silently | Lynx scraped a different chart (e.g. WAU not MAU), or telemetry outage | Open https://lynx.office.net/tenant/<guid>/m365_copilot-app_usage MANUALLY, eyeball the headline MAU. If real → it's news, tell Sush. If scraper bug → identify which selector drifted, patch, re-scrape that tenant |
| Cohort search returns 0 NZ tenants | Cohort filter persisted across runs OR Lynx changed the report path | Reapply "Clear all" filter, OR switch to per-tenant tab scraper (the canonical path going forward) |
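The chipKind() ordering rule in the table is easier to see in code. The sketch below is illustrative only: the real classifier lives in src/admin-chip-shared.js and may use different rules and kinds (for example "low", which is omitted here), and mapping "Not Assigned" to off is an assumption, since the table only says it must not be classified as on.

```javascript
// Illustrative classifier showing why the order of checks matters.
function chipKind(rawValue) {
  const v = String(rawValue ?? "").trim().toLowerCase();
  if (v === "") return "neutral";
  // 1. "partial" before "enabled": "Partially Enabled" contains both words.
  if (/\bpartial(ly)?\b/.test(v)) return "partial";
  // 2. Negations before positive verbs: "Not Assigned" contains "assigned".
  if (/\bnot\b|\bdisabled\b|\boff\b/.test(v)) return "off";
  // 3. Only now is it safe to look for positive words.
  if (/\b(enabled|assigned|on)\b/.test(v)) return "on";
  return "neutral";
}

for (const value of ["Enabled", "Partially Enabled", "Not Assigned", "Disabled", ""]) {
  console.log(JSON.stringify(value), "->", chipKind(value));
}
```

If the positive check ran first, "Not Assigned" and "Partially Enabled" would both come back as on, which is the regression described in the table.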
6. Weekly refresh tracker: copy into the session journal
Atlas copies this block at the start of any weekly refresh session, ticks the boxes, and pastes outputs inline, which forms an audit trail.
### Atlas-portfolio weekly refresh · YYYY-MM-DD · session <sid>
**Prereqs**
- [ ] Edge running on :9222 (`curl http://127.0.0.1:9222/json/version` returns JSON)
- [ ] Signed in to https://lynx.office.net/
- [ ] Signed in to https://msxinsights.microsoft.com/
- [ ] `az login` valid (last run within 7d)
**Step 1 — MSX + (optional) MSXI**
- [ ] `node scripts/refresh-full.mjs --verbose` started at HH:MM
- [ ] generate-msx step OK (no auth errors)
- [ ] customerCount in 50-60 range: __
- [ ] totalOpps in 1200-1500 range: __
- [ ] All 3 SSPs detected (Tam/Riki/Ben): __
**Step 2 — Per-tenant Lynx usage**
- [ ] lynx-per-tenant step OK, errors < 5
- [ ] Per-tenant files (>3KB): __ of 52
- [ ] ASB sentinel: kpiTiles __ , appRows __
**Step 3 — Portfolio rollup**
- [ ] yourTenants records: __ (expect 50-55)
- [ ] homeTopTenants records: __ (expect 18-25)
- [ ] ASB Copilot Licenses ~5,000: __
**Step 4 — Admin (legacy OR new-tab path)**
- [ ] ASB Anthropic = Enabled: __
- [ ] ASB Frontier = Disabled: __
- [ ] Spark/One NZ/Ryman Frontier = Enabled: __
- [ ] ACC/Air NZ Anthropic = Disabled: __
**Step 5 — Build**
- [ ] dist/index.html written (size: __ MB)
- [ ] dist/account/ has 54 pages: __
- [ ] adminSettings count in embedded JSON: __ (expect 50-52)
**Step 6 — QA**
- [ ] `lynx-verify-dashboard.mjs` exit 0
- [ ] `screenshot-v044-account.mjs` ASB chips: __ on / __ partial / __ off / __ neutral (expect 5/1/6/3)
- [ ] `smoke-e2e.mjs` exit 0 (7 tabs, 54 pages, 0 console errors)
**Delta vs last week**
- [ ] customer count change: __
- [ ] opp count change: __
- [ ] pipeline value change: __
- [ ] Any single tenant moved >40% MAU? __ (if yes, list: ____)
**Sign-off**
- [ ] All sentinels pass → message Sush "refresh done · v1.0.x at dist/index.html"
- [ ] Any sentinel fails → DO NOT message; surface in journal + ask
7. When you add a new scraper / data source / surface: update this file FIRST
Order of operations for adding a new data source:
- Probe — write a throwaway probe script that proves the data exists and the shape (Rule #6)
- Add a row to § 1 Source-of-truth map
- Update § 2 DAG if it introduces a new dependency
- Add QA checks to § 3 — at minimum: file exists, size sane, one sentinel tenant value matches a known truth
- Add to § 4 delta check if the data moves week-over-week
- Add a failure-mode row to § 5 for the most likely failure
- Update § 6 tracker template with new checkboxes
- Add the script to refresh-full.mjs if it should run weekly
- Add an inline trigger in triggers.md so future Atlas sessions read this file when the keyword fires
- Test the full refresh end-to-end before committing
Cross-references
- atlas-portfolio-architecture.md: the why (Path B static-HTML decision, predecessors PAC/Atlas CC, stack)
- ~/.copilot/atlas-portfolio-phases.md: version-by-version phase queue (active development)
- BUILD-LOG.md: phase-by-phase change log in the repo
- architecture-and-data-first-playbook.md: Rules #5 + #6 (architecture gate + data-first sequence)
- parallel-git-rules.md: explicit-paths git rule (no git add . ever)