## M01 — Monorepo skeleton (done)

**Built:** repo layout per SPEC §11, both Dockerfiles, compose stack, toolchain.

**Notes for next milestone:**

- DB schema empty; M02 owns all tables and seeds.
- `entrypoint.sh` for api supports `migrate` mode and calls `vendor/bin/phinx`.
- Healthcheck payloads are stubs; later milestones extend them.
- Service-token bootstrap deferred to M03 (needs `api_tokens` table first).
- CI runs locally via `./scripts/ci.sh` (Docker-based, no host PHP/Node needed). No GitHub Actions workflow per project decision.
- `composer.json` config pins `platform.php` to 8.3 in both subprojects so dependency resolution matches the FrankenPHP runtime image even when the build host's `composer:2` image ships a newer PHP.

**Deviations from SPEC:** none.

**Added dependencies beyond SPEC §2:** none.

## M02 — Database & migrations (done)

**Built:** all SPEC §4 tables; idempotent seeds; IP/CIDR value objects.

**Schema notes for next milestone:**

- `users.password_hash` is NOT in the schema (per SPEC §4; UI owns local-admin credentials).
- `api_tokens.kind` enum values: `reporter`, `consumer`, `admin`, `service` (CHECK constraint enforced on both SQLite and MySQL: kind=reporter→reporter_id set & consumer_id null; kind=consumer→consumer_id set & reporter_id null; kind∈{admin,service}→both null).
- All timestamps stored UTC. ISO 8601 strings on SQLite, `DATETIME(6)` on MySQL. Default `CURRENT_TIMESTAMP` / `CURRENT_TIMESTAMP(6)` accordingly.
- `ip_bin` always 16 bytes; v4 mapped to `::ffff:0:0/96`. Use `App\Domain\Ip\IpAddress::fromString()` for normalization and `Cidr::fromString()` for subnets. Internally CIDRs store v4 prefixes as `96 + originalPrefix` for unified containment math.
- DBAL `Connection` is wired through `App\App\Container::build()` and applies the four SQLite PRAGMAs (`journal_mode=WAL`, `synchronous=NORMAL`, `busy_timeout=5000`, `foreign_keys=ON`) on every new SQLite connection.
- Phinx migrations extend `App\Infrastructure\Db\Migrations\BaseMigration` for adapter-aware timestamp/binary column helpers. The phinxlog table is unaffected.

**Decisions made:**

- FK `ON DELETE` semantics:
  - `policy_category_thresholds.policy_id` → CASCADE (thresholds belong to policy).
  - `policy_category_thresholds.category_id` → RESTRICT (cannot drop a category in active use).
  - `consumers.policy_id` → RESTRICT (cannot drop a policy in active use).
  - `reporters/consumers/manual_blocks/allowlist.created_by_user_id` → SET NULL (preserve provenance after user delete).
  - `api_tokens.{reporter_id,consumer_id}` → CASCADE (deleting a reporter/consumer revokes its tokens).
  - `reports.{category_id,reporter_id}` → RESTRICT (preserve audit trail per SPEC hint).
  - `ip_scores.category_id` → CASCADE (scores meaningless without their category).
- `api_tokens` is created via raw `CREATE TABLE` per adapter so the CHECK constraint on `kind` works on SQLite (which cannot ADD CHECK via ALTER TABLE) and on MySQL.
- `BINARY(16)` on MySQL is implemented as Phinx's portable `binary` type with `limit => 16` (yields `VARBINARY(16)`); this is functionally identical for our fixed-width 16-byte payload and avoids per-adapter raw SQL.
- Fixed an M01 bug in `config/phinx.php` where `rtrim($path, '.sqlite')` mangled the SQLite path because `rtrim`'s second arg is a character set; switched to passing the full path verbatim with empty `suffix`.

**Deviations from SPEC:** none.

**Added dependencies:** none beyond SPEC §2.

## M03 — API auth foundations (done)

**Built:** token kinds, hashing, RBAC, impersonation pattern, auth endpoints, service token bootstrap.
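
As a rough illustration of the token kinds and hashing built in this milestone, the sketch below issues a raw token and derives the two values that get persisted. It assumes the raw token is `irdb_`, a three-letter kind, an underscore, then 32 base32 characters. The function names `base32Encode` and `issueToken`, the lowercase RFC 4648 alphabet, and the hex encoding of the stored hash are illustrative assumptions, not the project's `TokenIssuer` API.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: encode bytes as unpadded lowercase base32.
 * The alphabet (RFC 4648, lowercased) is an assumption; the notes only say "base32".
 */
function base32Encode(string $bytes): string
{
    $alphabet = 'abcdefghijklmnopqrstuvwxyz234567';
    $bits = '';
    foreach (str_split($bytes) as $byte) {
        $bits .= str_pad(decbin(ord($byte)), 8, '0', STR_PAD_LEFT);
    }
    $out = '';
    foreach (str_split($bits, 5) as $chunk) {
        // Right-pad the final chunk so a partial group still maps to one symbol.
        $out .= $alphabet[bindec(str_pad($chunk, 5, '0', STR_PAD_RIGHT))];
    }
    return $out;
}

/**
 * Hypothetical issuer: returns the raw token plus the values that would be stored.
 *
 * @return array{raw: string, hash: string, prefix: string}
 */
function issueToken(string $kind3): array
{
    if (!in_array($kind3, ['rep', 'con', 'adm', 'svc'], true)) {
        throw new InvalidArgumentException("unknown token kind: {$kind3}");
    }
    // 20 random bytes = 160 bits = exactly 32 base32 characters (5 bits each).
    $raw = 'irdb_' . $kind3 . '_' . base32Encode(random_bytes(20));

    return [
        'raw'    => $raw,                 // shown to the caller once
        'hash'   => hash('sha256', $raw), // hex digest of the whole raw string (encoding assumed)
        'prefix' => substr($raw, 0, 8),   // e.g. "irdb_rep", kept for log readability
    ];
}

$token = issueToken('rep');
echo $token['prefix'], ' ', strlen($token['raw']), PHP_EOL; // prints "irdb_rep 41"
```

The length works out as 5 + 3 + 1 + 32 = 41 characters, and the 8-character prefix always covers `irdb_` plus the kind, which is why it is enough to tell token kinds apart in logs.
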

**API contract decisions:**

- 401 = bad/expired/revoked/wrong-kind token (uniform body `{"error":"unauthorized"}`)
- 403 = authenticated but wrong role
- 400 = service token with a missing or malformed `X-Acting-User-Id` header
- `last_used_at` updated synchronously (move to async in M14 if perf demands)
- `/api/v1/auth/*` is service-token-only with **no impersonation**: these endpoints exist to bootstrap user records the UI can later impersonate, so requiring impersonation would be circular. The controller enforces `kind=service` directly.
- `X-Acting-User-Id` is silently ignored on non-service tokens (per SPEC §8); only its absence on a *service* token triggers 400.

**Notes for next milestone:**

- Reporter and consumer tokens have no role column; their auth carries `reporter_id` / `consumer_id` only. Reading `principal->reporterId` from request attrs is how M04's report endpoint will identify the reporter.
- Admin endpoints in later milestones can use `RbacMiddleware::require($responseFactory, Role::Operator)` etc.; the factory takes the role, and the response factory is in the container.
- `AuthenticatedPrincipal` carries an optional `userId` so M14 can introduce admin-token-bound-to-user without churn.

**Schema deviation:** `api_tokens.role` (nullable VARCHAR(32)) was added in migration `20260428130000_add_role_to_api_tokens.php`. SPEC §4 doesn't enumerate it but SPEC §6 mandates that admin tokens carry a role; the column stores it. Non-admin token rows leave it `NULL`.

**Token format:** `irdb_<kind3>_<32 base32 chars>`, where `kind3` is one of `rep|con|adm|svc`. 160 bits of entropy from `random_bytes(20)`. The whole raw string is SHA-256 hashed for storage; `token_prefix` keeps the first 8 chars (`irdb_<kind3>`) for log readability. The `.env.example` documents how to generate a valid `UI_SERVICE_TOKEN` via `TokenIssuer`.

**Service-token rotation:** out of scope this milestone; `ServiceTokenBootstrap` only handles "set or not set".
Rotation means: deploy with the new value, restart api, manually revoke the old hash via a future tool. The bootstrap logs a warning when it inserts a new service token while another already exists.

**Added dependencies:** none.

## M04 — Token system & ingest (done)

**Built:** reporter/consumer/token CRUD; POST /api/v1/report end-to-end; rate limiter; decay functions.

**Notes for next milestone:**

- Synchronous score updates are correct but only touch the (ip, category) pair just reported. Bulk decay re-application is M05's recompute job.
- `PairScorer` (`api/src/Domain/Reputation/PairScorer.php`) is the authoritative single-pair scorer; the bulk recompute job in M05 should call into it (or a near-clone) so behavior stays consistent. It depends on `Clock`, `CategoryRepository`, and `ReportRepository::forScoring()`.
- Decay shapes live as pure functions in `Decay::value(DecayFunction, ageDays, decayParam)` with seven unit tests against hand-computed reference values. M05's recompute will reuse this.
- Rate limiter is in-process (PHP array on a singleton `RateLimiter`); document this in README. Multi-replica deployments need a shared store. The bucket capacity is `API_RATE_LIMIT_PER_SECOND × 2` with refill = `API_RATE_LIMIT_PER_SECOND` per second; on exhaustion the middleware emits 429 with `Retry-After: 1`. Skipped on admin/auth routes.
- Service tokens cannot be created via the admin API (`kind=service` → 400) and are filtered out of the list endpoint unconditionally; only the bootstrap path makes them. Revoke on a service token returns 403 from `DELETE /api/v1/admin/tokens/{id}`.
- A token's raw value appears **only** in the create response payload (`raw_token`); we persist its SHA-256 hash and the 8-char prefix.
- `ip_scores` upsert is per-driver: SQLite uses `ON CONFLICT(ip_bin, category_id) DO UPDATE`, MySQL uses `ON DUPLICATE KEY UPDATE`. Single helper in `IpScoreRepository::upsert()`.
- `Clock` interface (`App\Domain\Time\Clock`) wraps wall-time for `received_at`, decay age, and rate-limit refill. `SystemClock` in production; `FixedClock` (with `advance()`) in tests.

**API contract decisions:**

- Admin endpoints (`/api/v1/admin/{reporters,consumers,tokens}`) require `Admin` role. RBAC is enforced via `RbacMiddleware::require($rf, Role::Admin)` on the route group.
- Validation errors return `400` with `{"error":"validation_failed","details":{"field":"reason"}}`. Hand-rolled validators per controller: small surface, no third-party validator added.
- DELETE on a reporter with existing reports returns `409` and flips `is_active=false` (soft delete) rather than removing the row; the audit trail is preserved per the FK RESTRICT semantics on `reports.reporter_id`.
- Public `POST /api/v1/report`: wrong-kind tokens (admin/consumer/service) and inactive reporters both return `401` with the uniform `{"error":"unauthorized"}` envelope, matching the M03 convention. Bad IP / unknown category / oversized metadata return `400` with the validation envelope.
- Metadata size limit: 4 KB after `json_encode`. Non-object metadata (arrays, scalars) is rejected.

**Deviations from SPEC:** none.

**Added dependencies:** none (chose hand-rolled validation over `respect/validation`).

## M05 — Reputation engine & jobs (done)

**Built:** decay math (already in M04, edge-cases re-verified); job framework with atomic locks (`JobLockRepository`), run history (`JobRunRepository`), runner abstraction (`JobRunner`), registry (`JobRegistry`); concrete jobs `RecomputeScoresJob` (full + incremental), `CleanupAuditJob`, `EnrichPendingJob` (skeleton); `TickJob` dispatcher; `/internal/jobs/{recompute-scores,cleanup-audit,enrich-pending,tick,refresh-geoip,status}` endpoints behind `InternalNetworkMiddleware` + `InternalTokenMiddleware`; CLI `jobs:run`, `jobs:status`, `scores:rebuild`.

**Notes for next milestone:**

- `PairScorer` (from M04) is reused by `RecomputeScoresJob`; both produce identical scores for the same pair.
- `EnrichPendingJob` is a skeleton; M11 fills it in.
- `refresh-geoip` endpoint returns 412 with `{"error":"not_implemented"}`; M11 wires it up.
- Job results are returned synchronously; long jobs may exceed default request timeout. The `/internal/*` routes need an extended `max_execution_time` in production FrankenPHP config (deferred; current default is sufficient for the recompute's 240s ceiling).
- Drop rule: `score < 0.01 AND last_report_at < now − 90 days`. RecomputeScoresJob backdates `last_report_at` to `now − 366 days` for orphan ip_scores rows (no surviving reports) so the same drop pass prunes them.
- `triggered_by` convention: HTTP `/internal/jobs/*` calls use `'schedule'` (assumed cron-driven); CLI uses `'manual'`. The admin-API wrapper in M12 will pass `'manual'` through for UI button triggers.
- TickJob takes a `Closure(): iterable` rather than a direct `JobRegistry` reference, which is needed to break a build-time cycle in PHP-DI (registry holds tick; tick iterates registry). The closure is invoked at run time.
- `JobsController` resolves jobs via `JobRegistry::get($name)`, and the registry is populated lazily in the container factory in registration order: recompute, cleanup, enrich, tick.
- Lock owner format: `/`. Release verifies owner matches before deleting, as a defence against expires_at-reclaim races.
- Internal token middleware fails closed when `INTERNAL_JOB_TOKEN` is empty, which is better than silently exposing endpoints to anything inside the docker network.

**Deviations from SPEC:** none.

**Added dependencies:** none.
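
The M05 drop rule and the orphan backdating trick can be sketched as a small predicate. This is a minimal illustration only: `shouldDrop` and `orphanLastReportAt` are hypothetical names, and the assumption that an orphan row's score recomputes to 0.0 is mine, not something the notes state.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical predicate mirroring the drop rule in the notes:
 * score < 0.01 AND last_report_at < now - 90 days.
 */
function shouldDrop(float $score, DateTimeImmutable $lastReportAt, DateTimeImmutable $now): bool
{
    $cutoff = $now->sub(new DateInterval('P90D'));

    return $score < 0.01 && $lastReportAt < $cutoff;
}

/**
 * Hypothetical helper: the backdated timestamp given to orphan rows
 * (rows with no surviving reports), per the notes: now - 366 days.
 */
function orphanLastReportAt(DateTimeImmutable $now): DateTimeImmutable
{
    return $now->sub(new DateInterval('P366D'));
}

$now = new DateTimeImmutable('2026-05-01T00:00:00Z');

// A stale, near-zero score is pruned.
var_dump(shouldDrop(0.004, $now->sub(new DateInterval('P120D')), $now)); // bool(true)

// A near-zero score with a recent report is kept.
var_dump(shouldDrop(0.004, $now->sub(new DateInterval('P10D')), $now));  // bool(false)

// An orphan row (score assumed to be 0.0) falls under the same rule once backdated.
var_dump(shouldDrop(0.0, orphanLastReportAt($now), $now));               // bool(true)
```

Because 366 days is well past the 90-day cutoff, backdating lets the recompute job reuse one drop pass instead of carrying a separate orphan-deletion branch.
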

## M06 — Manual blocks, allowlist (done)

**Built:** CRUD for `manual_blocks` and `allowlist` (single-IP and CIDR, v4 + v6); CidrEvaluator (in-process containment over a snapshot); CidrEvaluatorFactory (60s TTL cache + invalidate on writes); EffectiveStatusService (allowlist + manual; score+policy lands in M07); SPEC §M06 acceptance script passes end-to-end.

**Notes for next milestone:**

- M07 wires `CidrEvaluatorFactory` into the distribution endpoint and finishes `EffectiveStatusService` by adding score-vs-policy evaluation. Inject `CategoryRepository`, `IpScoreRepository`, and the per-policy thresholds into the service alongside the existing evaluator.
- Cache TTL is `CIDR_EVALUATOR_TTL_SECONDS` (default 60s); mutation endpoints invalidate explicitly **and** force a synchronous rebuild (`get()`) so an overlap WARNING fires inside the same request, giving operators immediate feedback. Multi-replica deployments will see up to 60s of staleness across replicas (accepted).
- Manual-block expiration cleanup: data model has `expires_at`, repo has `findExpired($now)` returning ids, but no job runs. Add in M14 hardening if desired, or leave as a documented limitation.
- CIDR canonicalization picks recommendation (c) from the milestone doc: non-canonical input is silently normalized; the response body echoes a `normalized_from` field only when the normalization changed the input. Canonical input omits the field.
- Repository inserts go through `RepositoryBase::insertRow()` for the binary-column ergonomics, but `insertRow()` returns `executeStatement()`'s row count, not the new id. The repos call `(int) $this->connection()->lastInsertId()` after `insertRow()` to recover the id. This is the same pattern `ReportRepository::insert` uses, kept for consistency.
- `Cidr::fromBinary($networkBin, $unifiedPrefix)` was added so repositories can hydrate stored rows back into the value object.
  The v4-vs-v6 heuristic mirrors what `IpAddress::fromBinary` does (v4-mapped IPv6 prefix + unified prefix ≥ 96 ⇒ render as v4).
- `CidrEvaluatorFactory` is intentionally *not* `final`: `EffectiveStatusServiceTest` substitutes an in-memory stub via subclass to avoid spinning up the DB.
- RBAC split per SPEC §6: list/show ⇒ Viewer, create/delete ⇒ Operator. Achieved with per-route `RbacMiddleware::require(...)` rather than group-level; this is a small departure from the all-Admin pattern used by reporters/consumers/tokens but the cleanest expression of "the same URL has different role requirements per method".

**Deviations from SPEC:** none.

**Added dependencies:** none.

## M07 — Policies & distribution (done)

**Built:** policy CRUD with thresholds (replaces wholesale on PATCH); `GET /api/v1/blocklist` (text/plain + JSON) with ETag/If-None-Match round-trip; per-policy in-memory cache (30s TTL, invalidated on relevant mutations); BlocklistBuilder with allowlist filtering, manual-block dedup (broader CIDR wins), v4-then-v6 stable sort; per-policy preview endpoint; perf test 50k entries <500 ms (SQLite + JIT).

**Notes for next milestone:**

- Per-policy cache TTL = 30 s (`BLOCKLIST_CACHE_TTL_SECONDS`). Mutation endpoints invalidate explicitly: policy CRUD calls `BlocklistCache::invalidate($policyId)`; manual_blocks / allowlist mutations call `invalidateAll()` (any policy might include manual blocks). Multi-replica deployments will see up to 30 s of cross-replica staleness (accepted; mirrors `CidrEvaluatorFactory` semantics).
- The text/plain format is universal (one IP/CIDR per line, no comments). Firewall-specific consumers transform on their side; M13 ships examples in `examples/consumers/`.
- DELETE on a policy with referencing consumers returns 409 with `{"error":"policy_in_use","consumers":[{id,name},...]}`. Cascade is wrong here per SPEC §M07.2.
- Dedup rule: scored single-IPs covered by a manual subnet are dropped (the broader subnet entry covers them).
  For same-IP overlap (scored single AND manual single), the scored entry wins to keep category attribution.
- Allowlist precedence: a manual subnet whose network address sits inside an allowlisted IP/subnet is dropped from the output. Manual single IPs on the allowlist are filtered too. The `CidrEvaluator` already logs a WARNING when the two lists overlap.
- ETag stability: SHA-256 over the rendered body (excluding `generated_at`). Different content-types yield different ETags by design (text vs JSON have different bodies).
- `If-None-Match` parsing handles weak validators (`W/"…"`) and the wildcard `*`.
- Policies controller's PATCH replaces the threshold set wholesale inside a single transaction (`PolicyRepository::replaceThresholds`: DELETE then INSERT). Field-level edits to name/description/include_manual_blocks happen alongside in the same request when present.
- Threshold body shape: `{<category_slug>: <threshold>}`; the controller resolves slugs to category ids. Unknown slug returns a 400 with the offending slug in the error message.
- `BlocklistBuilder` exposes the build via `BlocklistCache::getOrBuild($policy)`; the public endpoint never builds directly. Preview endpoint bypasses the cache (calls the builder directly) so the UI sees fresh numbers after edits.
- `IpScoreRepository::findExceedingThresholds` returns raw associative-array rows (not typed); the BlocklistBuilder's hot loop casts on demand. Saves ~25 % off the perf budget at 50k rows.

**Performance:**

- SPEC §M07.5 budget: 50k entries < 500 ms. Measured warm path on SQLite + opcache JIT (matches production FrankenPHP): **440–460 ms** across 5 consecutive runs (median ~444 ms).
- Without JIT (raw `vendor/bin/phpunit --group perf`) the same workload takes ~530 ms. The `composer test-perf` script enables JIT (`-d opcache.enable_cli=1 -d opcache.jit_buffer_size=64M -d opcache.jit=tracing`) so CI matches the production runtime.
- Three key optimisations beat the budget: (a) subnets indexed by prefix length so containment is `applyMaskFast + isset()` rather than per-pair `Cidr::contains()`; (b) `ksort` on binary keys (one per family) instead of `usort` with a closure, because closure dispatch dominates at 50k entries; (c) parallel hashes (`ipText`, `categoriesByIp`, `maxScoreByIp`) keyed on `ip_bin` instead of nested `[]` rows, so the row-merge loop avoids the per-iteration nested-array allocation.
- MySQL number not yet measured; to be captured separately when the MySQL CI lane is wired up.

**Schema:** none; uses the M02 `policies` and `policy_category_thresholds` tables as-is.

**Test surface added:** `tests/Unit/Reputation/PolicyEvaluatorTest.php`, `tests/Integration/Admin/PoliciesControllerTest.php`, `tests/Integration/Public/BlocklistControllerTest.php`, `tests/Integration/Reputation/BlocklistBuilderTest.php`, `tests/Integration/Perf/BlocklistPerfTest.php`. Total +28 tests / +95 assertions; perf test excluded from default run via `#[Group('perf')]`. Suite passes 271 tests / 723 assertions, 0 deprecations.

**Acceptance script:** ran end-to-end against compose stack. Empty blocklist → 200 with empty body; manual block emits as CIDR; JSON format returns reason="manual"; ETag round-trip returns 304; admin token rejected with 401; preview endpoint returns count + sample for all three seeded policies.

**Deviations from SPEC:**

- The `migrate` container's entrypoint runs Phinx migrations only; SPEC §10 says it should also run seeds. Pre-existing from M01, surfaced again here because M07's acceptance flow depends on the seeded policies. Worked around for the smoke test by running `vendor/bin/phinx seed:run` against the started container. Flagged for M13 polish (or earlier if another milestone is bitten by it).
- `composer test` script now passes `--exclude-group perf` so the default suite is fast; perf is run via `composer test-perf` with JIT enabled to match production.
- The PHPUnit doc-comment `@group` annotation was switched to the `#[Group('perf')]` attribute to silence a PHPUnit-12 deprecation warning.

**Added dependencies:** none.

## M08 — UI scaffold & auth (done)

**Built:** UI base (Slim+Twig+Tailwind+Alpine+htmx with dark-mode FOUC-free toggle and sidebar/topnav), session manager (file-backed PHP sessions, 8h idle / 24h absolute), CSRF middleware (constant-time compare, header + form-field), ApiClient with auto Bearer + `X-Acting-User-Id` + retry-once-on-5xx + typed exceptions, AuthClient + AdminClient subclients, OIDC code-flow with PKCE via `jumbojett/openid-connect-php`, local admin login (Argon2id + 5-fail/30s session-scoped throttle), CSRF-protected logout, `/app/me` page, `/no-access` page, friendly Twig error template, `doc/oidc.md` Entra setup guide.

**Notes for next milestone:**

- `AdminClient` exposes only `getMe()`; M09 adds methods for IP search/detail, dashboard, etc.
- ApiClient exception types are stable: `ApiAuthException` (401/403), `ApiValidationException` (400/422 with `details` array), `ApiNotFoundException` (404), `ApiServerException` (5xx), `ApiUnreachableException` (network/timeout). M09+ catches them in controllers.
- Sidebar nav placeholders show their target milestone (M09/M10/M12). Replace each placeholder with a real link as the section ships.
- Dark mode persistence: `localStorage.irdb-theme = 'dark' | 'light'`. An inline script reads it before paint; the toggle button (`[data-theme-toggle]`) writes it. System preference is the first-visit default via `prefers-color-scheme`.
- M14 will replace the basic 5/30 session-scoped throttle with a real brute-force lockout (IP-keyed, persistent).
- `regenerateId()` is called after every auth-state change (login success, logout) per SPEC §M08; defeats session fixation.
- POST handlers (`/login/local`, `/logout`) return **303 See Other** on redirect so curl/browsers switch to GET. GET handlers stay at 302.
- Service-token + impersonation header invariant: every API call out of the UI carries an `Authorization: Bearer` header with the service token and (when a session user exists) an `X-Acting-User-Id` header with that user's id. Auth endpoints (`/api/v1/auth/*`) are called WITHOUT the impersonation header by design: they exist to produce the user record we'd be impersonating.
- `Config::validateOrExit()` runs at boot and exits non-zero if `UI_SERVICE_TOKEN`/`API_BASE_URL` are missing or both auth methods are disabled. This means a misconfigured deployment crashes on `docker compose up`, not on the first user click.
- `OidcAuthenticator` is an interface (concrete impl `JumbojettOidcAuthenticator`); tests stub it to drive the OIDC controller's branches without a real IdP.
- Healthz remembers the most recent ApiClient call result via `ApiHealth` singleton; both fields are `null` until the UI has made its first API call.
- htmx config picks up the per-session CSRF token from the page markup and sends `X-CSRF-Token` automatically; this is established now even though htmx isn't used yet, so M09 tables can rely on it.

**Manual verification:**

- Acceptance script ran end-to-end against compose stack with `LOCAL_ADMIN_ENABLED=true`: login page renders with "Sign in" button, /healthz returns 200, /app/me unauthed redirects to /login, GET /login → set CSRF + session cookie, POST /login/local with valid creds → 303 to /app/me showing "admin" role + "local" source, wrong password rejected with flash, missing CSRF token → 403, logout clears session and post-logout /app/me → 302 again.
- OIDC flow against a live tenant: NOT manually verified in this environment (no Entra tenant available). The flow is covered by integration tests with a stubbed `OidcAuthenticator` (success path, no-role/no-access path, handshake failure, api-down during upsert). Real-tenant verification deferred to the next operator with tenant access; `doc/oidc.md` documents the setup.
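
The service-token and impersonation header invariant from the notes above can be sketched as a single header-building function. This is an illustration under assumptions, not the actual `ApiClient` code: `buildApiHeaders` is a hypothetical name, the integer user id is assumed, and `/api/v1/auth/example` is a made-up path used only to exercise the `/api/v1/auth/*` rule.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: build the outbound headers for one UI -> API call.
 *
 * @param string   $serviceToken value of UI_SERVICE_TOKEN
 * @param string   $path         API path being called
 * @param int|null $actingUserId id of the session user, or null when anonymous (type assumed)
 * @return array<string, string>
 */
function buildApiHeaders(string $serviceToken, string $path, ?int $actingUserId): array
{
    // Every call carries the service token.
    $headers = ['Authorization' => 'Bearer ' . $serviceToken];

    // Auth endpoints produce the user record, so they are never impersonated.
    $isAuthEndpoint = str_starts_with($path, '/api/v1/auth/');

    if ($actingUserId !== null && !$isAuthEndpoint) {
        $headers['X-Acting-User-Id'] = (string) $actingUserId;
    }

    return $headers;
}

// Admin call on behalf of session user 42: both headers are present.
print_r(buildApiHeaders('example-service-token', '/api/v1/admin/tokens', 42));

// Auth call: the impersonation header is omitted even though a user id is known.
print_r(buildApiHeaders('example-service-token', '/api/v1/auth/example', 42));
```

Keeping the rule in one place means a controller cannot accidentally impersonate on an auth endpoint, which the API would otherwise have to reject or ignore.
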

**Operational gotchas observed:**

- Docker Compose's `.env` file performs variable substitution on values, so an Argon2id hash containing `$argon2id$v=19$…` collapses unless every `$` is doubled to `$$`. Documented inline in the `.env.example` instructions for M13 polish.
- `curl -L` together with `-X POST` does NOT switch the method to GET on a 303 response (the explicit `-X` overrides curl's default). Acceptance scripts and any curl-based tests should use `-d` alone (which implies POST and lets curl follow redirects with GET).

**Test surface added (ui):** 47 tests / 102 assertions.

- Unit: `ApiClientTest` (status-code mapping, retry-once, header injection, health tracking), `SessionManagerTest` (set/get/clear, flash, throttle, lockout), `CsrfMiddlewareTest` (skip on GET, header + form-field paths, 403 on missing/wrong token).
- Integration: `LocalLoginTest` (form render, success, wrong password, wrong username, missing CSRF, lockout after 5 fails, api-down handling), `LogoutTest` (success + missing-CSRF), `OidcFlowTest` (success, no-role → /no-access, handshake failure, api-down), `RoutesTest` (home redirect, healthz, /app/me gating, /no-access).

**Deviations from SPEC:**

- The TwigGlobalsMiddleware runs ahead of `AuthRequiredMiddleware` so anonymous /app/* requests still have csrf_token / flash / current_user globals available for the redirect-to-/login response; a minor implementation detail, no functional difference.
- POST handlers return 303 instead of 302 (SPEC says "redirect"; standardised on 303 for state-changing redirects to avoid the curl `-X POST -L` resubmit-method behaviour).

**Added dependencies:**

- `esbuild` (devDependency, JS bundling for `app.js`). The SPEC §2 doesn't enumerate a JS bundler explicitly but allows "vanilla JS + Alpine.js + htmx where it simplifies forms"; the Tailwind-only build was insufficient since Alpine and htmx are imported modules. The Dockerfile build now runs both `tailwindcss` and `esbuild`.
- `jumbojett/openid-connect-php` was already in SPEC §2 / `composer.json`; it's just being USED for the first time in M08.
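
The `$` doubling gotcha recorded under the M08 operational gotchas can be illustrated with a one-line escape. This is a sketch only: `escapeForComposeEnv` is a hypothetical helper, and the hash below is a made-up, hash-shaped placeholder rather than real Argon2id output.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: escape a value for a Compose-interpolated .env entry
 * by doubling every dollar sign, as the notes describe.
 */
function escapeForComposeEnv(string $value): string
{
    return str_replace('$', '$$', $value);
}

// Made-up placeholder in the general shape of an Argon2id hash (not a real hash).
// Single quotes keep PHP itself from interpolating the dollar signs.
$hash = '$argon2id$v=19$m=65536,t=4,p=1$c2FsdHNhbHQ$aGFzaGhhc2g';

echo escapeForComposeEnv($hash), PHP_EOL;
// prints: $$argon2id$$v=19$$m=65536,t=4,p=1$$c2FsdHNhbHQ$$aGFzaGhhc2g
```

The escaped form is what would be pasted into the `.env` file; Compose then collapses each `$$` back to a single `$` before the container sees the value.
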