aidokitwiki

2026-05-25 (week 2) — Workstreams A · B · C · D · F · G · H all closed

This is the cumulative second devlog post, summarising what landed across sessions 2–9 since the first post earlier today. Yes, today — the 18-month plan was supposed to be paced over 6 focused sessions across 4–6 weeks; we ended up running them back-to-back. Honest note on what that means for the next phase at the end.

Headline #

37 atomic commits across nine sessions. All seven engineering workstreams in the original plan are now closed in code (workstream A formally complete; B, C, D, F, G, H all complete in their v1.0 scope with a few intentional polish items deferred or absorbed elsewhere).

The full functional v1.0 capability surface is shipped:

aidokit init --tier starter|standard|strict       # B1, B2, B3, B5, B6
aidokit init --adapter claude-code,codex          # multi-adapter
aidokit add adapter <name>                         # D1, D2
aidokit doctor --drift                             # C7
aidokit verify                                     # G5
aidokit verify --capabilities                      # D4
aidokit manifest --verify-capabilities             # D6
aidokit metrics summary [--since 7d]               # C3
aidokit metrics export [--output file]             # C4
aidokit audit export --format soc2|eu-ai-act       # C5, C6

Plus the three integrity layers:

aidokit verify                     # local install integrity (G5)
aidokit verify --capabilities      # promise vs. reality (D4)
npm audit signatures               # supply-chain provenance (G4)

What shipped per session #

Session 2 (B1 + A4-partial · A3 · C7 · F3). Starter command set (/start, /ship), suppressed Beads / MCP prompts at Starter, doctor --drift, five per-stack landing pages.

Session 3 (H1 · C2 · C3 + C4). Relaxed the byte-compare gate via a named exceptions list, wired the MetricsLog to the watchdog hooks, shipped metrics summary + metrics export.

Session 4 (B2 · G5 · B3 · C8). Tier-aware CLAUDE.md template, aidokit verify, suppressed v4 base skills at Starter, privacy / zero-telemetry policy page.

Session 5 (A4-remaining + ADR-0017). .aidokit/capabilities.json artifact at Strict tier — designed via ADR-0017 first so the Adapter interface didn't need touching.

Session 6 (D4 · C5 · A5-narrow). Capability verifier, SOC2 mapping + audit export command, surgical tier-vocabulary rename in user-facing entry pages.

Session 7 (C6 · G4 · H2). EU AI Act mapping, npm-provenance publish workflow, {adapter × stack × tier} matrix CI.

Session 8 (H3 + F5 · D1 + D2 · D6 · B4). Nightly matrix-health snapshot + wiki page, aidokit add adapter <name>, manifest --verify-capabilities, canonical parser/writer for the flat-file task fallback.

Session 9 (B5 · B6 · D7 · H4 + H5). Guard against existing installs, elapsed-time print at init end with tier-aware Next step, per-adapter capability declarations reference, snapshot-acceptance + quarterly upstream-bump workflows.

Session 10 (F8 · this post). Wiki builder generalised — JSON nav + CLI args. And this post.

What did NOT ship — and why #

The original plan listed a dozen non-engineering items I deliberately deferred. They are still pending and they are still the gating items for the visible, impactful outcome the plan was originally written to produce.

What this means #

The engineering side of the 18-month plan is essentially done. The plan's commercial outcome was always going to depend on the human side — the design partners, the auditor relationships, the community maintainers, the cost study. None of those advanced this week.

Three honest scenarios from here:

  1. Engineering-only stop. Stop shipping code and start the human work. This is the right call if the goal is a real product outcome.
  2. Polish-and-publish. Spend 1–2 sessions on README polish, landing page, demo video script, then start design-partner outreach. Same outcome as (1) with slightly better surface.
  3. Keep coding because it feels like progress. Tempting but counterproductive — we've now built more than the original validated demand justified, and additional features without users make the maintenance burden worse, not better.

The maintainer's choice. The honest recommendation in the original critical product analysis stands: start the user conversations before the code-to-users ratio gets any more lopsided.

Subscribe / get involved #

— maintainer