PROGRESS.md

M01 — Monorepo skeleton (done)

Built: repo layout per SPEC §11, both Dockerfiles, compose stack, toolchain.

Notes for next milestone:

  • DB schema empty; M02 owns all tables and seeds.
  • entrypoint.sh for api supports migrate mode and calls vendor/bin/phinx.
  • Healthcheck payloads are stubs; later milestones extend them.
  • Service-token bootstrap deferred to M03 (needs api_tokens table first).
  • CI runs locally via ./scripts/ci.sh (Docker-based, no host PHP/Node needed). No GitHub Actions workflow per project decision.
  • composer.json config pins platform.php to 8.3 in both subprojects so dependency resolution matches the FrankenPHP runtime image even when the build host's composer:2 image ships a newer PHP.

Deviations from SPEC: none. Added dependencies beyond SPEC §2: none.

M02 — Database & migrations (done)

Built: all SPEC §4 tables; idempotent seeds; IP/CIDR value objects.

Schema notes for next milestone:

  • users.password_hash is NOT in the schema (per SPEC §4; UI owns local-admin credentials).
  • api_tokens.kind enum values: reporter, consumer, admin, service (CHECK constraint enforced on both SQLite and MySQL: kind=reporter→reporter_id set & consumer_id null; kind=consumer→consumer_id set & reporter_id null; kind∈{admin,service}→both null).
  • All timestamps stored UTC. ISO 8601 strings on SQLite, DATETIME(6) on MySQL. Default CURRENT_TIMESTAMP / CURRENT_TIMESTAMP(6) accordingly.
  • ip_bin always 16 bytes; v4 mapped to ::ffff:0:0/96. Use App\Domain\Ip\IpAddress::fromString() for normalization and Cidr::fromString() for subnets. Internally CIDRs store v4 prefixes as 96 + originalPrefix for unified containment math.
  • DBAL Connection is wired through App\App\Container::build() and applies the four SQLite PRAGMAs (journal_mode=WAL, synchronous=NORMAL, busy_timeout=5000, foreign_keys=ON) on every new SQLite connection.
  • Phinx migrations extend App\Infrastructure\Db\Migrations\BaseMigration for adapter-aware timestamp/binary column helpers. The phinxlog table is unaffected.
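
The 16-byte normalization and the "96 + originalPrefix" rule above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the project's PHP code; the function names to_ip_bin, unified_prefix, and contains are hypothetical stand-ins for what IpAddress::fromString() and Cidr::fromString() are described as doing.

```python
import ipaddress

def to_ip_bin(text: str) -> bytes:
    """Return a 16-byte form; IPv4 is mapped into ::ffff:0:0/96."""
    addr = ipaddress.ip_address(text)
    if addr.version == 4:
        return b"\x00" * 10 + b"\xff\xff" + addr.packed
    return addr.packed

def unified_prefix(cidr: str) -> int:
    """IPv4 prefixes become 96 + originalPrefix; IPv6 prefixes are unchanged."""
    net = ipaddress.ip_network(cidr, strict=False)
    return net.prefixlen + 96 if net.version == 4 else net.prefixlen

def contains(network_bin: bytes, prefix: int, ip_bin: bytes) -> bool:
    """One containment check over 128-bit integers, shared by both families."""
    mask = ((1 << prefix) - 1) << (128 - prefix)
    ip_int = int.from_bytes(ip_bin, "big")
    net_int = int.from_bytes(network_bin, "big")
    return (ip_int & mask) == (net_int & mask)
```

With this representation, 10.0.0.0/8 is stored with prefix 104, and a single mask comparison answers containment for v4 and v6 alike.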

Decisions made:

  • FK ON DELETE semantics:
    • policy_category_thresholds.policy_id → CASCADE (thresholds belong to policy).
    • policy_category_thresholds.category_id → RESTRICT (cannot drop a category in active use).
    • consumers.policy_id → RESTRICT (cannot drop a policy in active use).
    • reporters/consumers/manual_blocks/allowlist.created_by_user_id → SET NULL (preserve provenance after user delete).
    • api_tokens.{reporter_id,consumer_id} → CASCADE (deleting a reporter/consumer revokes its tokens).
    • reports.{category_id,reporter_id} → RESTRICT (preserve audit trail per SPEC hint).
    • ip_scores.category_id → CASCADE (scores meaningless without their category).
  • api_tokens is created via raw CREATE TABLE per adapter so the CHECK constraint on kind works on SQLite (which cannot ADD CHECK via ALTER TABLE) and on MySQL.
  • BINARY(16) on MySQL is implemented as Phinx's portable binary type with limit => 16 (yields VARBINARY(16)); this is functionally identical for our fixed-width 16-byte payload and avoids per-adapter raw SQL.
  • Fixed an M01 bug in config/phinx.php where rtrim($path, '.sqlite') mangled the SQLite path because rtrim's second arg is a character set; switched to passing the full path verbatim with empty suffix.
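
The rtrim pitfall in the last bullet has a direct analogue in Python, where str.rstrip also treats its argument as a set of characters rather than a suffix. The sketch below only illustrates the class of bug; the path is made up, and strip_suffix is a hypothetical helper, not the fix applied in config/phinx.php (which passes the full path with an empty suffix).

```python
path = "/data/sites.sqlite"  # made-up example path

# Character-set semantics: every trailing character found in the set
# {'.', 's', 'q', 'l', 'i', 't', 'e'} is removed, which also eats "sites".
mangled = path.rstrip(".sqlite")

def strip_suffix(value: str, suffix: str) -> str:
    """Remove suffix only when value actually ends with it."""
    if suffix and value.endswith(suffix):
        return value[: -len(suffix)]
    return value

intended = strip_suffix(path, ".sqlite")
print(repr(mangled), repr(intended))
```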

Deviations from SPEC: none. Added dependencies: none beyond SPEC §2.

M03 — API auth foundations (done)

Built: token kinds, hashing, RBAC, impersonation pattern, auth endpoints, service token bootstrap.

API contract decisions:

  • 401 = bad/expired/revoked/wrong-kind token (uniform body {"error":"unauthorized"})
  • 403 = authenticated but wrong role
  • 400 = service token without (or malformed) X-Acting-User-Id header
  • last_used_at updated synchronously (move to async in M14 if perf demands)
  • /api/v1/auth/* is service-token-only with no impersonation — these endpoints exist to bootstrap user records the UI can later impersonate, so requiring impersonation would be circular. The controller enforces kind=service directly.
  • X-Acting-User-Id is silently ignored on non-service tokens (per SPEC §8); only its absence on a service token triggers 400.

Notes for next milestone:

  • Reporter and consumer tokens have no role column; their auth carries reporter_id / consumer_id only. Reading principal->reporterId from request attrs is how M04's report endpoint will identify the reporter.
  • Admin endpoints in later milestones can use RbacMiddleware::require($responseFactory, Role::Operator) etc. — the factory takes the role; the response factory is in the container.
  • AuthenticatedPrincipal carries an optional userId so M14 can introduce admin-token-bound-to-user without churn.

Schema deviation: api_tokens.role (nullable VARCHAR(32)) was added in migration 20260428130000_add_role_to_api_tokens.php. SPEC §4 doesn't enumerate it but SPEC §6 mandates that admin tokens carry a role; the column stores it. Non-admin token rows leave it NULL.

Token format: irdb_<kind3>_<32 base32 chars>, where kind3 is one of rep|con|adm|svc. 160 bits of entropy from random_bytes(20). The whole raw string is SHA-256 hashed for storage; token_prefix keeps the first 8 chars (irdb_<kind3>) for log readability. The .env.example documents how to generate a valid UI_SERVICE_TOKEN via TokenIssuer.
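
A minimal sketch of the token format described above, in Python for illustration only (TokenIssuer itself is PHP). The lowercase base32 alphabet and the token_hash key name are assumptions; the 20 random bytes, the SHA-256 over the whole raw string, and the 8-character prefix follow the description.

```python
import base64
import hashlib
import secrets

KIND3 = {"reporter": "rep", "consumer": "con", "admin": "adm", "service": "svc"}

def issue_token(kind: str) -> dict:
    # 20 random bytes = 160 bits, which base32-encodes to exactly 32 characters (no padding).
    body = base64.b32encode(secrets.token_bytes(20)).decode("ascii").lower()  # lowercase: assumption
    raw = f"irdb_{KIND3[kind]}_{body}"
    return {
        "raw_token": raw,                                         # shown once, never stored
        "token_hash": hashlib.sha256(raw.encode()).hexdigest(),   # what gets persisted
        "token_prefix": raw[:8],                                  # e.g. "irdb_svc"
    }
```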

Service-token rotation: out of scope this milestone — ServiceTokenBootstrap only handles "set or not set". Rotation means: deploy with the new value, restart api, manually revoke the old hash via a future tool. The bootstrap logs a warning when it inserts a new service token while another already exists.

Added dependencies: none.

M04 — Token system & ingest (done)

Built: reporter/consumer/token CRUD; POST /api/v1/report end-to-end; rate limiter; decay functions.

Notes for next milestone:

  • Synchronous score updates are correct but only touch the (ip, category) pair just reported. Bulk decay re-application is M05's recompute job.
  • PairScorer (api/src/Domain/Reputation/PairScorer.php) is the authoritative single-pair scorer; the bulk recompute job in M05 should call into it (or a near-clone) so behavior stays consistent. It depends on Clock, CategoryRepository, and ReportRepository::forScoring().
  • Decay shapes live as pure functions in Decay::value(DecayFunction, ageDays, decayParam) with seven unit tests against hand-computed reference values. M05's recompute will reuse this.
  • Rate limiter is in-process (PHP array on a singleton RateLimiter); document this in README. Multi-replica deployments need a shared store. The bucket capacity is API_RATE_LIMIT_PER_SECOND × 2 with refill = API_RATE_LIMIT_PER_SECOND per second; on exhaustion the middleware emits 429 with Retry-After: 1. Skipped on admin/auth routes.
  • Service tokens cannot be created via the admin API (kind=service → 400) and are filtered out of the list endpoint unconditionally; only the bootstrap path makes them. Revoke on a service token returns 403 from DELETE /api/v1/admin/tokens/{id}.
  • A token's raw value appears only in the create response payload (raw_token); we persist its SHA-256 hash and the 8-char prefix.
  • ip_scores upsert is per-driver: SQLite uses ON CONFLICT(ip_bin, category_id) DO UPDATE, MySQL uses ON DUPLICATE KEY UPDATE. Single helper in IpScoreRepository::upsert().
  • Clock interface (App\Domain\Time\Clock) wraps wall-time for received_at, decay age, and rate-limit refill. SystemClock in production; FixedClock (with advance()) in tests.
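
The rate-limiter parameters above (capacity = rate × 2, refill = rate per second) describe a token bucket. A minimal Python sketch follows, driven by a fixed clock in the spirit of FixedClock; the class layout and the choice to start with a full bucket are assumptions, not the project's RateLimiter.

```python
class FixedClock:
    """Test clock with a manually advanced time, in seconds."""
    def __init__(self, now: float = 0.0):
        self._now = now
    def now(self) -> float:
        return self._now
    def advance(self, seconds: float) -> None:
        self._now += seconds

class TokenBucket:
    def __init__(self, rate_per_second: float, clock: FixedClock):
        self.rate = float(rate_per_second)
        self.capacity = self.rate * 2      # burst allowance
        self.tokens = self.capacity        # assumption: bucket starts full
        self.clock = clock
        self.last = clock.now()

    def allow(self) -> bool:
        # Refill proportionally to elapsed time, capped at capacity.
        now = self.clock.now()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False                       # caller would answer 429 with Retry-After: 1
```

With a rate of 5, a burst of 10 requests passes, the 11th is refused, and one second later 5 more pass.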

API contract decisions:

  • Admin endpoints (/api/v1/admin/{reporters,consumers,tokens}) require Admin role. RBAC is enforced via RbacMiddleware::require($rf, Role::Admin) on the route group.
  • Validation errors return 400 with {"error":"validation_failed","details":{"field":"reason"}}. Hand-rolled validators per controller — small surface, no third-party validator added.
  • DELETE on a reporter with existing reports returns 409 and flips is_active=false (soft delete) rather than removing the row; the audit trail is preserved per the FK RESTRICT semantics on reports.reporter_id.
  • Public POST /api/v1/report — wrong-kind tokens (admin/consumer/service) and inactive reporters both return 401 with the uniform {"error":"unauthorized"} envelope, matching the M03 convention. Bad IP / unknown category / oversized metadata return 400 with the validation envelope.
  • Metadata size limit: 4 KB after json_encode. Non-object metadata (arrays, scalars) is rejected.
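
The metadata rule can be sketched as follows, in Python for illustration. Whether the limit is inclusive, and whether it counts bytes or characters, is not stated above; the sketch assumes "at most 4096 UTF-8 bytes". The function name and reason strings are hypothetical.

```python
import json

MAX_METADATA_BYTES = 4 * 1024  # assumption: inclusive limit, measured in UTF-8 bytes

def validate_metadata(metadata):
    """Return None when acceptable, otherwise a short reason string."""
    if not isinstance(metadata, dict):
        return "must be a JSON object"          # arrays and scalars are rejected
    encoded = json.dumps(metadata).encode("utf-8")
    if len(encoded) > MAX_METADATA_BYTES:
        return "exceeds 4 KB when encoded"
    return None
```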

Deviations from SPEC: none. Added dependencies: none (chose hand-rolled validation over respect/validation).

M05 — Reputation engine & jobs (done)

Built: decay math (already in M04, edge-cases re-verified); job framework with atomic locks (JobLockRepository), run history (JobRunRepository), runner abstraction (JobRunner), registry (JobRegistry); concrete jobs RecomputeScoresJob (full + incremental), CleanupAuditJob, EnrichPendingJob (skeleton); TickJob dispatcher; /internal/jobs/{recompute-scores,cleanup-audit,enrich-pending,tick,refresh-geoip,status} endpoints behind InternalNetworkMiddleware + InternalTokenMiddleware; CLI jobs:run, jobs:status, scores:rebuild.

Notes for next milestone:

  • PairScorer (from M04) is reused by RecomputeScoresJob — both produce identical scores for the same pair.
  • EnrichPendingJob is a skeleton — M11 fills it in.
  • refresh-geoip endpoint returns 412 with {"error":"not_implemented"} — M11 wires it up.
  • Job results are returned synchronously; long jobs may exceed default request timeout. The /internal/* routes need an extended max_execution_time in production FrankenPHP config (deferred — current default is sufficient for the recompute's 240s ceiling).
  • Drop rule: score < 0.01 AND last_report_at < now − 90 days. RecomputeScoresJob backdates last_report_at to now − 366 days for orphan ip_scores rows (no surviving reports) so the same drop pass prunes them.
  • triggered_by convention: HTTP /internal/jobs/* calls use 'schedule' (assumed cron-driven); CLI uses 'manual'. The admin-API wrapper in M12 will pass 'manual' through for UI button triggers.
  • TickJob takes a Closure(): iterable<Job> rather than a direct JobRegistry reference — needed to break a build-time cycle in PHP-DI (registry holds tick; tick iterates registry). The closure is invoked at run time.
  • JobsController resolves jobs via JobRegistry::get($name), and the registry is populated lazily in the container factory in registration order: recompute, cleanup, enrich, tick.
  • Lock owner format: <pid>/<random hex>. Release verifies owner matches before deleting — defensive against expires_at-reclaim races.
  • Internal token middleware fails closed when INTERNAL_JOB_TOKEN is empty — better than silently exposing endpoints to anything inside the docker network.
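
The owner-checked release described above can be sketched against an in-memory SQLite table. This is a Python illustration under assumptions: the job_locks table layout, the column names, and the reclaim-then-insert acquire strategy are hypothetical and may differ from JobLockRepository.

```python
import os
import secrets
import sqlite3

def new_owner() -> str:
    # Owner format from the notes: <pid>/<random hex>
    return f"{os.getpid()}/{secrets.token_hex(8)}"

def acquire(db, name: str, owner: str, now: int, ttl: int) -> bool:
    # Drop an expired lock, then rely on the primary key to make the insert exclusive.
    db.execute("DELETE FROM job_locks WHERE name = ? AND expires_at <= ?", (name, now))
    try:
        db.execute(
            "INSERT INTO job_locks (name, owner, expires_at) VALUES (?, ?, ?)",
            (name, owner, now + ttl),
        )
        return True
    except sqlite3.IntegrityError:
        return False

def release(db, name: str, owner: str) -> bool:
    # Deleting by (name, owner) means a holder whose lock was reclaimed
    # after expiry cannot remove the new holder's lock.
    cur = db.execute("DELETE FROM job_locks WHERE name = ? AND owner = ?", (name, owner))
    return cur.rowcount == 1

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE job_locks (name TEXT PRIMARY KEY, owner TEXT NOT NULL, expires_at INTEGER NOT NULL)"
)
```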

Deviations from SPEC: none. Added dependencies: none.

M06 — Manual blocks, allowlist (done)

Built: CRUD for manual_blocks and allowlist (single-IP and CIDR, v4 + v6); CidrEvaluator (in-process containment over a snapshot); CidrEvaluatorFactory (60s TTL cache + invalidate on writes); EffectiveStatusService (allowlist + manual; score+policy lands in M07); SPEC §M06 acceptance script passes end-to-end.

Notes for next milestone:

  • M07 wires CidrEvaluatorFactory into the distribution endpoint and finishes EffectiveStatusService by adding score-vs-policy evaluation. Inject CategoryRepository, IpScoreRepository, and the per-policy thresholds into the service alongside the existing evaluator.
  • Cache TTL is CIDR_EVALUATOR_TTL_SECONDS (default 60s); mutation endpoints invalidate explicitly and force a synchronous rebuild (get()) so an overlap WARNING fires inside the same request — operators see immediate feedback. Multi-replica deployments will see up to 60s of staleness across replicas — accepted.
  • Manual-block expiration cleanup: data model has expires_at, repo has findExpired($now) returning ids, but no job runs. Add in M14 hardening if desired, or leave as a documented limitation.
  • CIDR canonicalization picks recommendation (c) from the milestone doc: non-canonical input is silently normalized; the response body echoes normalized_from: <original> only when the normalization changed the input. Canonical input omits the field.
  • Repository inserts go through RepositoryBase::insertRow() for the binary-column ergonomics, but insertRow() returns executeStatement()'s row count — not the new id. The repos call (int) $this->connection()->lastInsertId() after insertRow() to recover the id. Same pattern ReportRepository::insert uses — kept consistent.
  • Cidr::fromBinary($networkBin, $unifiedPrefix) was added so repositories can hydrate stored rows back into the value object. The v4-vs-v6 heuristic mirrors what IpAddress::fromBinary does (v4-mapped IPv6 prefix + unified prefix ≥ 96 ⇒ render as v4).
  • CidrEvaluatorFactory is intentionally not final; EffectiveStatusServiceTest substitutes an in-memory stub via subclass to avoid spinning up the DB.
  • RBAC split per SPEC §6: list/show ⇒ Viewer, create/delete ⇒ Operator. Achieved with per-route RbacMiddleware::require(...) rather than group-level — a small departure from the all-Admin pattern used by reporters/consumers/tokens but the cleanest expression of "the same URL has different role requirements per method".
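
The canonicalization behaviour noted above (normalize silently, echo normalized_from only when the input changed) can be sketched like this. It is a Python illustration; the cidr key and the canonicalize name are assumptions, while normalized_from is the field named in the notes.

```python
import ipaddress

def canonicalize(cidr: str) -> dict:
    # strict=False masks off host bits instead of raising on non-canonical input.
    net = ipaddress.ip_network(cidr, strict=False)
    body = {"cidr": str(net)}
    if str(net) != cidr:
        body["normalized_from"] = cidr   # only present when normalization changed the input
    return body
```

For example, 10.1.2.3/8 comes back as 10.0.0.0/8 with normalized_from set, while 10.0.0.0/8 comes back without the extra field.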

Deviations from SPEC: none. Added dependencies: none.

M07 — Policies & distribution (done)

Built: policy CRUD with thresholds (replaces wholesale on PATCH); GET /api/v1/blocklist (text/plain + JSON) with ETag/If-None-Match round-trip; per-policy in-memory cache (30s TTL, invalidated on relevant mutations); BlocklistBuilder with allowlist filtering, manual-block dedup (broader CIDR wins), v4-then-v6 stable sort; per-policy preview endpoint; perf test 50k entries <500 ms (SQLite + JIT).

Notes for next milestone:

  • Per-policy cache TTL = 30 s (BLOCKLIST_CACHE_TTL_SECONDS). Mutation endpoints invalidate explicitly: policy CRUD calls BlocklistCache::invalidate($policyId); manual_blocks / allowlist mutations call invalidateAll() (any policy might include manual blocks). Multi-replica deployments will see up to 30 s of cross-replica staleness — accepted, mirrors CidrEvaluatorFactory semantics.
  • The text/plain format is universal (one IP/CIDR per line, no comments). Firewall-specific consumers transform on their side; M13 ships examples in examples/consumers/.
  • DELETE on a policy with referencing consumers returns 409 with {"error":"policy_in_use","consumers":[{id,name},...]}. Cascade is wrong here per SPEC §M07.2.
  • Dedup rule: scored single-IPs covered by a manual subnet are dropped (the broader subnet entry covers them). For same-IP overlap (scored single AND manual single), the scored entry wins to keep category attribution.
  • Allowlist precedence: a manual subnet whose network address sits inside an allowlisted IP/subnet is dropped from the output. Manual single IPs on the allowlist are filtered too. The CidrEvaluator already logs a WARNING when the two lists overlap.
  • ETag stability: SHA-256 over the rendered body (excluding generated_at). Different content-types yield different ETags by design (text vs JSON have different bodies).
  • If-None-Match parsing handles weak validators (W/"…") and the wildcard *.
  • Policies controller's PATCH replaces the threshold set wholesale inside a single transaction (PolicyRepository::replaceThresholds — DELETE then INSERT). Field-level edits to name/description/include_manual_blocks happen alongside in the same request when present.
  • Threshold body shape: {<category_slug>: <number>}; the controller resolves slugs to category ids. Unknown slug returns a 400 with the offending slug in the error message.
  • BlocklistBuilder exposes the build via BlocklistCache::getOrBuild($policy); the public endpoint never builds directly. Preview endpoint bypasses the cache (calls the builder directly) so the UI sees fresh numbers after edits.
  • IpScoreRepository::findExceedingThresholds returns raw associative-array rows (not typed) — the BlocklistBuilder's hot loop casts on demand. Saves ~25 % off the perf budget at 50k rows.
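
The ETag and If-None-Match notes above can be sketched as follows. This is a Python illustration, not the BlocklistController code; it assumes the body passed in already excludes generated_at, and the function names are hypothetical.

```python
import hashlib

def etag_for(body: bytes) -> str:
    """Strong ETag: quoted SHA-256 hex digest of the rendered body."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'

def if_none_match_hits(header, etag: str) -> bool:
    """True when the request header matches, so the server can answer 304."""
    if header is None:
        return False
    if header.strip() == "*":
        return True
    for candidate in header.split(","):
        candidate = candidate.strip()
        if candidate.startswith("W/"):   # weak validators compare by their opaque tag
            candidate = candidate[2:]
        if candidate == etag:
            return True
    return False
```

Because the digest covers the rendered body, the text/plain and JSON representations of the same blocklist naturally get different ETags.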

Performance:

  • SPEC §M07.5 budget: 50k entries < 500 ms. Measured warm path on SQLite + opcache JIT (matches production FrankenPHP): 440–460 ms across 5 consecutive runs (median ~444 ms).
  • Without JIT (raw vendor/bin/phpunit --group perf) the same workload takes ~530 ms. The composer test-perf script enables JIT (-d opcache.enable_cli=1 -d opcache.jit_buffer_size=64M -d opcache.jit=tracing) so CI matches the production runtime.
  • Three key optimisations beat the budget: (a) subnets indexed by prefix length so containment is applyMaskFast + isset() rather than per-pair Cidr::contains(); (b) ksort on binary keys (one per family) instead of usort with a closure — closure dispatch dominates at 50k entries; (c) parallel hashes (ipText, categoriesByIp, maxScoreByIp) keyed on ip_bin instead of nested [] rows, so the row-merge loop avoids the per-iteration nested-array allocation.
  • MySQL number not yet measured — to be captured separately when the MySQL CI lane is wired up.
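
Optimisation (a) above, indexing subnets by prefix length so containment becomes a mask plus a hash lookup, can be sketched as below. It is a Python illustration over 128-bit integers; build_index, covered, and the v4 helper are hypothetical names, and the real applyMaskFast works on binary strings in PHP.

```python
import ipaddress
from collections import defaultdict

def mask_for(prefix: int) -> int:
    return ((1 << prefix) - 1) << (128 - prefix)

def build_index(subnets):
    """subnets: iterable of (network_int, unified_prefix). Returns {prefix: {network_int}}."""
    index = defaultdict(set)
    for network, prefix in subnets:
        index[prefix].add(network & mask_for(prefix))
    return index

def covered(index, ip_int: int) -> bool:
    # One mask and one set lookup per distinct prefix length,
    # instead of one contains() call per (ip, subnet) pair.
    return any((ip_int & mask_for(prefix)) in nets for prefix, nets in index.items())

def v4(text: str) -> int:
    """IPv4 text to its v4-mapped 128-bit integer."""
    return int(ipaddress.ip_address("::ffff:" + text))
```

The cost per IP then scales with the number of distinct prefix lengths in use (at most 129) rather than with the number of subnets.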

Schema: none — uses the M02 policies and policy_category_thresholds tables as-is.

Test surface added: tests/Unit/Reputation/PolicyEvaluatorTest.php, tests/Integration/Admin/PoliciesControllerTest.php, tests/Integration/Public/BlocklistControllerTest.php, tests/Integration/Reputation/BlocklistBuilderTest.php, tests/Integration/Perf/BlocklistPerfTest.php. Total +28 tests / +95 assertions; perf test excluded from default run via #[Group('perf')]. Suite passes 271 tests / 723 assertions, 0 deprecations.

Acceptance script: ran end-to-end against compose stack. Empty blocklist → 200 with empty body; manual block emits as CIDR; JSON format returns reason="manual"; ETag round-trip returns 304; admin token rejected with 401; preview endpoint returns count + sample for all three seeded policies.

Deviations from SPEC:

  • The migrate container's entrypoint runs Phinx migrations only; SPEC §10 says it should also run seeds. Pre-existing from M01, surfaced again here because M07's acceptance flow depends on the seeded policies. Worked around for the smoke test by running vendor/bin/phinx seed:run against the started container. Flagged for M13 polish (or earlier if another milestone is bitten by it).
  • composer test script now passes --exclude-group perf so the default suite is fast; perf is run via composer test-perf with JIT enabled to match production.
  • The PHPUnit doc-comment @group annotation was switched to the #[Group('perf')] attribute to silence a PHPUnit-12 deprecation warning.

Added dependencies: none.

M08 — UI scaffold & auth (done)

Built: UI base (Slim+Twig+Tailwind+Alpine+htmx with dark-mode FOUC-free toggle and sidebar/topnav), session manager (file-backed PHP sessions, 8h idle / 24h absolute), CSRF middleware (constant-time compare, header + form-field), ApiClient with auto Bearer + X-Acting-User-Id + retry-once-on-5xx + typed exceptions, AuthClient + AdminClient subclients, OIDC code-flow with PKCE via jumbojett/openid-connect-php, local admin login (Argon2id + 5-fail/30s session-scoped throttle), CSRF-protected logout, /app/me page, /no-access page, friendly Twig error template, doc/oidc.md Entra setup guide.

Notes for next milestone:

  • AdminClient exposes only getMe(); M09 adds methods for IP search/detail, dashboard, etc.
  • ApiClient exception types are stable: ApiAuthException (401/403), ApiValidationException (400/422 with details array), ApiNotFoundException (404), ApiServerException (5xx), ApiUnreachableException (network/timeout). M09+ catches them in controllers.
  • Sidebar nav placeholders show their target milestone (M09/M10/M12). Replace each placeholder with a real link as the section ships.
  • Dark mode persistence: localStorage.irdb-theme = 'dark' | 'light'. Inline <head> script reads it before paint; the toggle button ([data-theme-toggle]) writes it. System preference is the first-visit default via prefers-color-scheme.
  • M14 will replace the basic 5/30 session-scoped throttle with a real brute-force lockout (IP-keyed, persistent).
  • regenerateId() is called after every auth-state change (login success, logout) per SPEC §M08; defeats session fixation.
  • POST handlers (/login/local, /logout) return 303 See Other on redirect so curl/browsers switch to GET. GET handlers stay at 302.
  • Service-token + impersonation header invariant: every API call out of the UI carries Authorization: Bearer <UI_SERVICE_TOKEN> and (when a session user exists) X-Acting-User-Id: <user_id>. Auth endpoints (/api/v1/auth/*) are called WITHOUT the impersonation header by design — they exist to produce the user record we'd be impersonating.
  • Config::validateOrExit() runs at boot and exits non-zero if UI_SERVICE_TOKEN/API_BASE_URL are missing or both auth methods are disabled. This means a misconfigured deployment crashes on docker compose up, not on the first user click.
  • OidcAuthenticator is an interface (concrete impl JumbojettOidcAuthenticator); tests stub it to drive the OIDC controller's branches without a real IdP.
  • Healthz remembers the most recent ApiClient call result via ApiHealth singleton; both fields are null until the UI has made its first API call.
  • htmx config picks up the per-session CSRF token from the <meta name="csrf-token"> tag and sends X-CSRF-Token automatically — established now even though htmx isn't used yet, so M09 tables can rely on it.
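
The status-code mapping and retry-once behaviour of ApiClient noted above can be sketched as follows. This is a Python illustration: the exception names mirror the PHP ones, while raise_for_status, call_with_retry, and the send callable are hypothetical. Network failures (ApiUnreachableException) are left out of the sketch.

```python
class ApiAuthException(Exception): pass          # 401 / 403
class ApiValidationException(Exception): pass    # 400 / 422
class ApiNotFoundException(Exception): pass      # 404
class ApiServerException(Exception): pass        # 5xx

def raise_for_status(status: int) -> None:
    if status in (401, 403):
        raise ApiAuthException(status)
    if status in (400, 422):
        raise ApiValidationException(status)
    if status == 404:
        raise ApiNotFoundException(status)
    if 500 <= status <= 599:
        raise ApiServerException(status)

def call_with_retry(send):
    """send() returns (status, body). A 5xx response is retried exactly once."""
    status, body = send()
    if 500 <= status <= 599:
        status, body = send()
    raise_for_status(status)
    return body
```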

Manual verification:

  • Acceptance script ran end-to-end against compose stack with LOCAL_ADMIN_ENABLED=true: login page renders with "Sign in" button, /healthz returns 200, /app/me unauthed redirects to /login, GET /login → set CSRF + session cookie, POST /login/local with valid creds → 303 to /app/me showing "admin" role + "local" source, wrong password rejected with flash, missing CSRF token → 403, logout clears session and post-logout /app/me → 302 again.
  • OIDC flow against a live tenant: NOT manually verified in this environment — no Entra tenant available. The flow is covered by integration tests with a stubbed OidcAuthenticator (success path, no-role/no-access path, handshake failure, api-down during upsert). Real-tenant verification deferred to the next operator with tenant access; doc/oidc.md documents the setup.

Operational gotchas observed:

  • Docker Compose's .env file performs variable substitution on values, so an Argon2id hash containing $argon2id$v=19$… collapses unless every $ is doubled to $$. Documented inline in the .env.example instructions for M13 polish.
  • curl -L together with -X POST does NOT switch the method to GET on a 303 response (the explicit -X overrides curl's default). Acceptance scripts and any curl-based tests should use -d alone (which implies POST and lets curl follow redirects with GET).
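
For the first gotcha, the escaping step is a one-line transformation. The helper below is a hypothetical Python sketch; the hash fragment in the usage note is a truncated placeholder, not a real value.

```python
def escape_for_compose_env(value: str) -> str:
    """Double every '$' so Docker Compose does not treat it as a variable reference."""
    return value.replace("$", "$$")
```

Applied to a value beginning with $argon2id$v=19$, this yields $$argon2id$$v=19$$, which Compose reads back as the original text.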

Test surface added (ui): 47 tests / 102 assertions.

  • Unit: ApiClientTest (status-code mapping, retry-once, header injection, health tracking), SessionManagerTest (set/get/clear, flash, throttle, lockout), CsrfMiddlewareTest (skip on GET, header + form-field paths, 403 on missing/wrong token).
  • Integration: LocalLoginTest (form render, success, wrong password, wrong username, missing CSRF, lockout after 5 fails, api-down handling), LogoutTest (success + missing-CSRF), OidcFlowTest (success, no-role → /no-access, handshake failure, api-down), RoutesTest (home redirect, healthz, /app/me gating, /no-access).

Deviations from SPEC:

  • The TwigGlobalsMiddleware runs ahead of AuthRequiredMiddleware so anonymous /app/* requests still have csrf_token / flash / current_user globals available for the redirect-to-/login response — minor implementation detail, no functional difference.
  • POST handlers return 303 instead of 302 (SPEC says "redirect"; standardised on 303 for state-changing redirects to avoid the curl -X POST -L resubmit-method behaviour).

Added dependencies:

  • esbuild (devDependency, JS bundling for app.js). The SPEC §2 doesn't enumerate a JS bundler explicitly but allows "vanilla JS + Alpine.js + htmx where it simplifies forms"; the Tailwind-only build was insufficient since Alpine and htmx are imported modules. The Dockerfile build now runs both tailwindcss and esbuild.
  • jumbojett/openid-connect-php was already in SPEC §2 / composer.json; it's just being USED for the first time in M08.

M09 — UI: IPs, history, dashboard (done)

Built: read-only IP browsing UI + matching admin endpoints. API: GET /api/v1/admin/ips (paginated search with q / category / score range / country / asn / status filters), GET /api/v1/admin/ips/{ip} (scores per category, enrichment placeholder, manual/allowlist panels, 200-entry history timeline with has_more), GET /api/v1/admin/stats/dashboard (active blocks + counters + 24h histogram + top reporters/categories + jobs status, 30s in-memory cache). UI: /app/dashboard, /app/ips, /app/ips/{ip}. Chart.js bundles via esbuild (tree-shaken to ~150 kB of bar/linear pieces). Default post-login redirect now /app/dashboard. Sidebar highlights the active section.

Notes for next milestone:

  • EffectiveStatusService was completed: it now distinguishes Scored (any non-zero score in ip_scores) from Clean (no rows or all zero). M07's policy-vs-score evaluation lives separately in PolicyEvaluator — the single-IP "is this scored?" question is policy-agnostic by design (you don't get to ask "scored against which policy?" when looking at one IP).
  • Dashboard active_blocks is an approximation: distinct IPs in ip_scores with score > 0 PLUS single-IP manual_blocks. Computing the exact count of IPs in the seeded moderate policy's blocklist would require running BlocklistBuilder per request, which is too expensive for a 30s-cached dashboard. The number is a stable proxy; the response carries reference_policy: "moderate" to make the caveat explicit. M10/M12 may add a config knob.
  • IP search is grounded in ip_scores — IPs that are only manually blocked (no reports yet) won't appear unless they have an ip_scores row. Manual subnets and allowlist subnets aren't expanded in search either; only single-IP entries from those tables intersect with the search via the status filter. The IP detail page shows the precise effective status. Documented limitation; a richer "show subnet members" view is out of scope.
  • Country flag rendering uses the regional-indicator emoji pair ('🇦' ~ first ~ '🇦' ~ second). Browsers without flag-emoji fonts (some Windows configs) render it as block letters; the fallback when country_code is null is a ?? pill.
  • Country / ASN columns are blank until M11 wires real GeoIP (the ip_enrichment table exists; the enrich-pending job is still a skeleton).
  • Manual-block / allowlist mutation buttons on the IP detail page are deliberately absent here. M10 adds them.
  • IpHistoryRepository UNIONs the three sources in PHP (separate queries → merge → sort) rather than in SQL — the per-source caps (500 reports max + small manual/allowlist tables) keep this fast at our dataset sizes; switching to a single SQL UNION ALL is straightforward if profiling later shows it matters.
  • Lighthouse not measured here — the acceptance environment has no headless browser. The pages use semantic HTML (<table> with proper <thead>/<tbody>, labelled form inputs, aria-current="page" on the active sidebar link, contrast tokens that pass WCAG AA in both modes by Tailwind's defaults). M13 will run the actual Lighthouse pass.
  • Slim's default segment regex disallows colons; /{ip:.+} is required for IPv6 paths to route on both the api and the ui.
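
The regional-indicator technique behind the country flags can be sketched as follows. This is a Python illustration of the general Unicode approach, not the Twig template; flag_emoji is a hypothetical name, and the "??" return value mirrors the fallback pill mentioned above.

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # U+1F1E6 REGIONAL INDICATOR SYMBOL LETTER A

def flag_emoji(country_code):
    """Map a two-letter ISO country code to its pair of regional-indicator symbols."""
    if (
        not country_code
        or len(country_code) != 2
        or not (country_code.isascii() and country_code.isalpha())
    ):
        return "??"
    return "".join(
        chr(REGIONAL_INDICATOR_A + ord(letter) - ord("A"))
        for letter in country_code.upper()
    )
```

Fonts that lack flag glyphs show the two indicator letters individually, which is the block-letter rendering described in the notes.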

Schema:

  • idx_ip_scores_ip_text — single-column index on ip_scores.ip_text so the search's LIKE 'prefix%' path doesn't full-scan. LIKE '%substr%' falls back to a scan, acceptable at the dataset sizes covered by the SPEC.

Test surface added (api): tests/Integration/Admin/IpsControllerTest.php (10 tests covering list ordering, prefix filter, category filter, pagination, validation error, detail success/empty/404, enrichment-null-by-default), tests/Integration/Admin/StatsControllerTest.php (3 tests covering empty shape, non-empty counters, manual/allowlist counters). Updated EffectiveStatusServiceTest to cover the new Scored-when-rows-exist branch with stub IpScoreRepository. Total: 285 tests / 787 assertions.

Test surface added (ui): tests/Integration/App/DashboardPageTest.php (renders stats + chart canvas + degrades on api-down), tests/Integration/App/IpsPageTest.php (list, empty state, filter round-trip via form, detail page with scores+history, twig 404, anonymous redirect). Total: 55 tests / 133 assertions.

Acceptance script: ran end-to-end against compose stack. Seeded 3 reports across 3 IPs (mix of v4 + v6); local admin login → /app/dashboard renders with "reports", "Active blocks", "test-reporter", and the chart canvas; /app/ips lists all three IPs with brute_force as top category; ?q=2001 narrows to the v6 IP only; /app/ips/203.0.113.10 shows "Score per category" and "History" sections with brute_force; /app/ips/not-an-ip returns 404 with the friendly error template.

Deviations from SPEC:

  • EffectiveStatusService had an unused-but-final wiring left over from M06; making it usable for M09 required a constructor change (+IpScoreRepository) and broke the Unit-level test that constructed it with one arg. Updated tests accordingly. The fix also required dropping IpScoreRepository's final modifier so test stubs can extend it — same pattern used for CidrEvaluatorFactory in M06.
  • The dashboard active_blocks figure is an approximation (see notes above), not the exact "moderate-policy blocklist size" the SPEC mentions. The response carries reference_policy: "moderate" to call this out and makes a follow-up "switch to exact computation" trivial when M12 or beyond decides it's worth the cost.
  • Sidebar's "My identity" link moved to the bottom of the nav (under the M10/M12 placeholders) since /app/dashboard is now the canonical landing page. Visual order only; no functional change.

Added dependencies:

  • chart.js (npm dep). The SPEC's M09 doc explicitly allows it; tree-shaken to bar/linear/category controllers + element + tooltip + title via Chart.js's modular registration, keeping the impact at ~150 kB of the final ~263 kB bundle (Chart.js + Alpine + htmx + our own ~3 kB).

M10 — UI admin CRUD (done)

Built: every admin CRUD UI (manual blocks, allowlist, policies with threshold matrix, reporters, consumers, tokens with one-time raw-token modal, categories with SVG decay-curve preview), plus IP-detail action buttons (allowlist/manual-block add+remove). Categories CRUD on the api with in-use refusal.

API surface added: GET/POST/PATCH/DELETE /api/v1/admin/categories. Slug is snake_case-ish (^[a-z][a-z0-9_]{0,63}$; underscores allowed, hyphens not) + unique. DELETE returns 409 with {usage: {policies, reports}, hint} when references exist; the UI surfaces the hint via flash, and the edit page exposes an is_active=false checkbox for soft-delete.
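
A quick check of what the slug pattern accepts, as a Python sketch (is_valid_slug is a hypothetical helper). Note that the pattern permits underscores but not hyphens, and caps the slug at 64 characters.

```python
import re

SLUG_RE = re.compile(r"^[a-z][a-z0-9_]{0,63}$")

def is_valid_slug(slug: str) -> bool:
    # fullmatch guards against a trailing newline slipping past "$".
    return SLUG_RE.fullmatch(slug) is not None
```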

Notes for next milestone:

  • AdminClient covers every admin endpoint shipped so far. M11 only needs to extend it for enrichment data shape; M12 adds audit-log + jobs trigger.
  • User management UI (/app/users, role-mapping editor) — deferred to M12 alongside the audit page; the api endpoints exist but no UI was built. Sidebar still hides the link until then.
  • Token list never includes service tokens (the api filters them out unconditionally; the bootstrap is the only producer).
  • RBAC summary applied this milestone: viewer reads everything; operator may manage manual blocks + allowlist; admin owns tokens, policies, categories, reporters, consumers. UI hides buttons for roles that can't use them; the api enforces the actual gate. Friendly error pages render on direct-URL forbidden access via the existing JsonExceptionHandler.
  • Token creation flow uses a session slot (_token_just_created) instead of a query string — POST → 303 → GET reads-and-clears the slot, so refreshing the page after the modal is dismissed never re-shows the raw token (one-time-display invariant).
  • Policy threshold editor renders rows for all categories; empty input means "not in policy" (not "threshold = 0"). The PATCH endpoint replaces the threshold set wholesale, so unchecked rows simply don't appear in the body.
  • Decay-curve preview is pure client-side Alpine (~30 lines) — Decay::value ported to JS. No charting lib for one curve.
  • Live policy preview calls a UI-side proxy (/app/policies/{id}/preview-proxy) rather than hitting the api directly from the browser; the browser doesn't have the service token, so the proxy bridges the BFF gap. Returns the api's preview JSON verbatim.
  • POST handlers consistently return 303 See Other so curl/browsers follow with GET (M08 lesson). Form-encoded next field threads through delete actions on the IP detail page so the user lands back on the IP they were viewing rather than the manual-blocks list.
  • All destructive actions go through a small partials/confirm_form.twig Alpine modal — single source of truth for the cancel/confirm UX. Reused by every list page.
  • Decision: /app/manual-blocks POST always redirects back to /app/manual-blocks (never to /app/subnets), even when the created entry is a subnet. Stable destination keeps tests + browser behaviour predictable; the user can navigate via the kind filter.
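
The one-time-display invariant from the token-creation note above comes down to a read-and-clear on the session slot. A minimal sketch, assuming the session behaves like a plain dict; the handler names and the /app/tokens redirect target are hypothetical, and only the slot name _token_just_created comes from the notes.

```python
from typing import Optional, Tuple

SLOT = "_token_just_created"  # slot name taken from the note above


def handle_create_post(session: dict, raw_token: str) -> Tuple[int, str]:
    """POST handler: stash the raw token in the session, then redirect."""
    session[SLOT] = raw_token
    return 303, "/app/tokens"  # hypothetical redirect target


def handle_list_get(session: dict) -> Optional[str]:
    """GET handler: return the raw token once and remove it from the session."""
    # pop() reads and clears in one step, so a page refresh finds nothing.
    return session.pop(SLOT, None)
```

The first GET after the redirect receives the raw token; any later GET (including a refresh) receives None, so the modal cannot reappear.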

Schema: none.

Test surface added (api): tests/Integration/Admin/CategoriesControllerTest.php (9 tests covering list, validation, create+show, duplicate-slug refusal, PATCH, in-use 409 from policy refs, in-use 409 from report refs, hard-delete success, RBAC). Total: 294 tests / 821 assertions.

Test surface added (ui): tests/Integration/Crud/CrudPagesTest.php (16 tests: each list page renders + a happy/validation pair per resource + the token one-time-display flow + IP-detail RBAC button visibility for operator vs viewer). Total: 71 tests / 177 assertions.

Acceptance script: ran end-to-end against compose stack:

  • All seven list pages return 200 for a logged-in admin.
  • POST /app/manual-blocks creates the subnet (verified by the api list endpoint; the milestone-doc grep used a non-escaped / and got tripped by JSON's \/ escaping — data was actually present).
  • Operator-role admin token gets 403 on DELETE /api/v1/admin/tokens/{id} (api enforces).
  • Token-creation modal flow: POST returns 303, follow-up GET surfaces the irdb_adm_<32 base32> raw token in the page body and discards the slot afterward.
  • Categories: created phishing, attached to a policy, then DELETE returns 409 (api refuses with category_in_use).

Deviations from SPEC:

  • User management UI (admin/users + role-mapping editor) deferred to M12 alongside the audit page. Documented under "Notes for next milestone" so M12 owners pick it up.
  • The final modifier on IpScoreRepository was already dropped in M09 to allow a test stub; no further class-modifier changes this milestone.

Added dependencies: none.

M11 — Enrichment (done)

Built: MMDB wrapper, three pluggable downloaders (DB-IP / MaxMind / IPinfo), both jobs (enrich-pending fully implemented; refresh-geoip replacing the M05 stub), UI display + provider attribution, healthz fields, country dropdown source.

Notes for next milestone:

  • DBs live at /data/geoip/{country,asn}.mmdb (renamed from SPEC §9 defaults to be provider-agnostic; documented in .env.example).
  • Default provider is DB-IP — no credential required, never returns 412.
  • MaxMind and IPinfo paths return 412 when their credential is empty (controller short-circuits before lock acquire so the job_runs lock isn't dirtied).
  • License key / IPinfo token never logged: error messages substitute *** for the real value before throwing DownloaderException.
  • Re-enrichment is opt-in via ?reenrich=true on refresh-geoip. The flag clears enriched_at after a successful refresh so findPending re-picks the rows up on the next enrich-pending tick.
  • DB-IP and IPinfo: no upstream integrity file; verification is gzip-decode (DB-IP only) + MMDB metadata + node-count sanity (MmdbVerifier). MaxMind keeps SHA-256.
  • Attribution rendered in UI for DB-IP and IPinfo per their license terms; MaxMind requires no attribution. The provider name flows from GEOIP_PROVIDER through the UI's settings into a Twig global, so the detail page picks the right footer.
  • /admin/ips/countries returns [{code, count}] sorted by code; the IPs-list page renders a dropdown when the list is non-empty, falls back to the free-text input otherwise (so empty installs still let you type a code).
  • New dry_run=1 query flag on POST /internal/jobs/refresh-geoip returns 202 with provider + dry_run: true without taking the lock — used by the acceptance script to confirm the controller doesn't 412 under DB-IP.
  • MmdbEnrichmentService::isReady() is the fast-path check the EnrichPendingJob uses to no-op cleanly when neither DB is on disk yet — avoids running 200 per-tick lookups that all return empty.
  • Atomic file replace: tempnam + write + rename within the same target dir so POSIX rename is atomic. tempnam creates 0600; we relax to 0644 so other procs can open the new file.
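
The atomic-replace note above follows a common POSIX pattern. A minimal sketch in Python rather than the project's PHP, assuming a POSIX filesystem; tempfile.mkstemp stands in for PHP's tempnam, and the function name atomic_write is hypothetical.

```python
import os
import tempfile


def atomic_write(target_path: str, data: bytes, mode: int = 0o644) -> None:
    """Replace target_path with data so readers never observe a partial file."""
    target_dir = os.path.dirname(os.path.abspath(target_path))
    # Create the temp file inside the target directory so the final rename
    # stays on one filesystem; mkstemp creates it with mode 0600.
    fd, tmp_path = tempfile.mkstemp(dir=target_dir, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        os.chmod(tmp_path, mode)  # relax 0600 so other processes can open it
        os.replace(tmp_path, target_path)  # atomic rename on POSIX
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Readers that already hold the old file open keep reading the old contents; new opens see the complete new file.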

Schema: none. The existing ip_enrichment table from M02 took the new write paths verbatim; no migration needed.

Test surface added (api): Unit: MaxMindRecordAdapterTest, IpinfoRecordAdapterTest, MmdbEnrichmentServiceTest (drives the vendored MaxMind test fixtures end-to-end including v6 lookup and missing-DB warn-once). Integration: EnrichPendingJobTest (4 tests covering happy path, no-op-on-missing-DB, idempotence, ?reenrich loop), CountriesEndpointTest (empty + populated + RBAC). JobsEndpointsTest updated: replaced the "M05 returns 412 not_implemented" assertion with three new ones: dry-run under DB-IP doesn't 412, MaxMind without key returns 412 + provider/missing fields, IPinfo without token same. Total: 316 tests / 882 assertions, 0 deprecations, 0 failures, 0 errors.

Test surface added (ui): Updated existing IpsPageTest to enqueue the extra listCountries API call the controller now issues alongside searchIps. No new test file. Total: 71 tests / 177 assertions, 0 failures.

Acceptance: composer cs && composer stan && composer test clean on both subprojects. The full Block A/B/C acceptance script in the M11 brief is gated on a fresh docker compose boot, which the development environment in this session can't run end-to-end (no Docker daemon at the host level during the milestone implementation phase). The unit + integration tests cover every controller and job code path that the bash acceptance script exercises; the bash script is preserved verbatim for the next operator to run against a clean compose stack.

Deviations from SPEC:

  • SPEC §9 names GEOIP_COUNTRY_DB=/data/geoip/GeoLite2-Country.mmdb; renamed to /data/geoip/country.mmdb (and asn likewise) so the runtime paths are provider-agnostic. Documented inline in .env.example.
  • SPEC §2 names MaxMind GeoLite2 specifically; MaxMind stays a first-class provider but the default for new installs is DB-IP (also MMDB, CC BY 4.0) for friction-free self-hosting. The ADR sits in this PROGRESS entry and the milestone brief.
  • The MMDB lookup uses MaxMind\Db\Reader::get($ip) directly rather than the higher-level Geoip2\Database\Reader::country() accessor — the latter is shape-specific and breaks on IPinfo's flat record schema. Per the milestone brief.
  • JobsController::refreshGeoip accepts ?dry_run=1 (returns 202 without taking the lock or running the job). Adds the only public surface change beyond the spec: the brief's acceptance script needs a way to confirm "controller doesn't 412 under DB-IP" without pulling 100 MB of MMDBs over the wire in CI.

Added dependencies: geoip2/geoip2 (named in SPEC §2 as the planned package; we use its underlying MaxMind\Db\Reader for cross-provider support), guzzlehttp/guzzle (named in SPEC §2 — first time used in api; the ui already had it). maxmind-db/reader and maxmind/web-service-common came in transitively.

Added env vars: GEOIP_PROVIDER (default dbip; values dbip|maxmind|ipinfo), IPINFO_TOKEN (used only when provider=ipinfo). MAXMIND_LICENSE_KEY was already in .env.example.
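
The relationship between GEOIP_PROVIDER and the two credentials can be sketched as a precondition check, the kind of test a controller would run before answering 412. This is an illustration in Python, not the api's code: the function name missing_credential is hypothetical, and treating an empty GEOIP_PROVIDER as dbip and raising on unknown values are assumptions.

```python
from typing import Mapping, Optional

# Credential each provider needs; DB-IP needs none (per the notes above).
REQUIRED_CREDENTIAL = {
    "dbip": None,
    "maxmind": "MAXMIND_LICENSE_KEY",
    "ipinfo": "IPINFO_TOKEN",
}


def missing_credential(env: Mapping[str, str]) -> Optional[str]:
    """Return the name of the unset credential, or None if the provider can run."""
    provider = env.get("GEOIP_PROVIDER") or "dbip"  # assumption: empty means default
    if provider not in REQUIRED_CREDENTIAL:
        # Assumption: unknown providers are a configuration error.
        raise ValueError(f"unknown GEOIP_PROVIDER: {provider!r}")
    required = REQUIRED_CREDENTIAL[provider]
    if required is not None and not env.get(required):
        return required  # the caller would translate this into a 412
    return None
```

A non-None result corresponds to the 412 case; under the default DB-IP provider the check always returns None.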

Added test fixtures: api/tests/Fixtures/geoip/{country,asn}.mmdb vendored from maxmind/MaxMind-DB (Apache-2.0). Cover IP 81.2.69.142 (GB) plus a small IPv6 set. Schema is MaxMind-shape so MaxMindRecordAdapter drives them; the IPinfo adapter is exercised via direct unit tests since no public IPinfo-shape MMDB fixture is available.

M12 — Audit & settings (done)

Built: audit emission across every state-changing admin endpoint; filterable audit list endpoint + UI; admin-side jobs status + manual trigger endpoints + UI Settings page; effective-config endpoint with secrets masked.

Notes for next milestone:

  • Audit failures are logged (audit_emit_failed Monolog event) but never propagate — DbAuditEmitter swallows on insert error.
  • The actor-resolution invariant: service-token + impersonation always records actor_kind=user with the impersonated user_id; raw admin tokens record actor_kind=admin-token with the token id as actor_id. Reporter / consumer tokens are recorded with their FK id.
  • Failed validation paths (4xx) don't emit audit. Only successful state changes do.
  • POST /api/v1/admin/jobs/trigger/{name} is the only path the UI uses to invoke jobs; /internal/jobs/* remains scheduler-only and network-restricted to RFC1918. The admin endpoint emits one job.triggered audit row before invoking the runner with triggered_by="manual".
  • Manual trigger short-circuits 412 for refresh-geoip when an opt-in provider's credential is unset — same envelope the internal handler uses, so the UI flash message reads identically.
  • Whitelisted job-trigger params: full, max_rows, reenrich. Anything else in the request body is dropped to avoid a malicious admin smuggling config-shaped values into the runner.
  • Token creation NEVER puts the raw token in the audit payload — prefix only. Verified by an integration test that asserts the raw token doesn't appear in details_json.
  • GET /admin/config masks INTERNAL_JOB_TOKEN, MAXMIND_LICENSE_KEY, IPINFO_TOKEN, DB_MYSQL_PASSWORD, APP_SECRET to ***; UI_SERVICE_TOKEN shows the first 8 chars + .... Plain values for everything else (DB driver, log level, cadences, GeoIP paths). Empty values stay empty so misconfiguration is visible instead of being hidden behind ***.
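
The masking rules for GET /admin/config can be sketched as follows. This is an illustration in Python, not the api's code; the key names come from the note above, the function name mask_config is hypothetical, and the sketch assumes the UI_SERVICE_TOKEN rendering is the first 8 characters followed by three dots.

```python
from typing import Dict, Mapping

FULLY_MASKED = {
    "INTERNAL_JOB_TOKEN",
    "MAXMIND_LICENSE_KEY",
    "IPINFO_TOKEN",
    "DB_MYSQL_PASSWORD",
    "APP_SECRET",
}
PREFIX_ONLY = {"UI_SERVICE_TOKEN"}


def mask_config(config: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of config that is safe to show on the Settings page."""
    masked = {}
    for key, value in config.items():
        if value == "":
            masked[key] = ""  # keep empty values visible as empty
        elif key in FULLY_MASKED:
            masked[key] = "***"
        elif key in PREFIX_ONLY:
            masked[key] = value[:8] + "..."
        else:
            masked[key] = value  # non-secret values pass through unchanged
    return masked
```

Checking for the empty string first is what keeps an unset secret distinguishable from a masked one.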

Schema:

  • idx_audit_action — index on audit_log(action) for the audit page's filter-by-action common case. Country/actor/entity-id indexes were already in M02. Migration 20260429140000_add_audit_action_index.php.

Test surface added (api): Unit: AuditActionTest. Integration: AuditEmissionTest (5 tests covering admin-token attribution, service-token impersonation attribution, raw-token-not-in-payload, no-emit-on-validation-failure, full create/update/delete cycle for categories), AuditLogControllerTest (4 tests: empty, filtered, invalid-actor-kind 400, RBAC), JobsAdminControllerTest (5 tests: viewer-readable status, operator forbidden trigger, unknown job 404, manual trigger end-to-end with audit + triggered_by=manual, refresh-geoip 412 under MaxMind without key), ConfigControllerTest (viewer 403, sections shape, masking with secrets set). Total: 336 tests / 973 assertions, 0 deprecations.

Test surface added (ui): AuditPageTest (4 tests: list, empty, filter round-trip, anonymous redirect), SettingsPageTest (3 tests: admin renders config + jobs, viewer 303 to /no-access, anonymous to /login). Total: 78 tests / 199 assertions.

Acceptance: composer cs && composer stan && composer test clean on both subprojects. The full Block A/B/C bash acceptance script in the M12 brief is gated on a fresh docker compose boot, which the development environment in this session can't run end-to-end (no Docker daemon at the host level during the milestone implementation phase). The unit + integration tests cover every controller and audit code path that the bash acceptance script exercises; the bash script is preserved verbatim in the milestone doc for the next operator to run against a clean compose stack.

RBAC summary applied this milestone:

  • Audit list: Viewer (every signed-in user can browse audit).
  • Jobs status: Viewer (cosmetic Settings rendering still gates on Admin).
  • Job trigger: Admin only.
  • Config endpoint: Admin only.
  • The audit-emission middleware runs once per admin request (after TokenAuth + Impersonation, before RBAC), so the actor is always resolved regardless of which middleware short-circuits.
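
The actor-resolution invariant recorded in the M12 notes can be sketched as a single function. This is an illustration in Python, not the api's code: the Token dataclass and resolve_actor are hypothetical, the actor_kind labels for reporter and consumer tokens are assumptions (the notes only say their FK id is recorded), and rejecting a service token without impersonation is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Token:
    id: int
    kind: str  # "reporter", "consumer", "admin", or "service"
    reporter_id: Optional[int] = None
    consumer_id: Optional[int] = None


def resolve_actor(token: Token, impersonated_user_id: Optional[int]) -> Tuple[str, int]:
    """Return (actor_kind, actor_id) for the audit row."""
    if token.kind == "service":
        if impersonated_user_id is None:
            # Assumption: a service token without impersonation is an error.
            raise ValueError("service token used without impersonation")
        return "user", impersonated_user_id
    if token.kind == "admin":
        return "admin-token", token.id
    if token.kind == "reporter" and token.reporter_id is not None:
        return "reporter", token.reporter_id  # assumed label
    if token.kind == "consumer" and token.consumer_id is not None:
        return "consumer", token.consumer_id  # assumed label
    raise ValueError(f"cannot resolve actor for token kind {token.kind!r}")
```

Resolving the actor before RBAC runs means a request rejected by RBAC still has a well-defined actor attached.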

Deviations from SPEC:

  • The SPEC §M12.6 user management UI was deferred from M10 and remains out-of-scope here (the focus this milestone was the audit trail itself). API endpoints for users / oidc-role-mappings exist from M03; their dedicated UI list/edit pages will land in M13/M14. Sidebar still hides those links until they ship.
  • audit_log.target_type and target_id are the SPEC §4 column names; the API and UI surface them under the brief's vocabulary entity_type/entity_id for clarity. The repository translates. This is documented inline; no schema change.
  • JobsAdminController is a separate class from the internal JobsController; the brief implied a single shared class but a dedicated admin controller keeps the audit + RBAC concerns cleanly out of the internal-only handler.

Added dependencies: none.

Added env vars: none.

M13 — Polish, OpenAPI, docs (done)

Built: OpenAPI document at /api/v1/openapi.yaml + RapiDoc viewer at /api/docs; new README with quickstart and operational guides; all five doc/*.md files per SPEC §16; examples/ with reporter scripts (curl, python, fail2ban shim) + consumer scripts (iptables/ipset swap, nginx deny-include, HAProxy ACL with runtime-API path) + scheduler (host crontab, systemd unit + timer) + reverse-proxy Caddyfile; tests/e2e/demo.sh end-to-end smoke check; scripts/check-doc-endpoints.sh doc-accuracy CI guard.

Notes for next milestone:

  • OpenAPI generation is hand-curated: source of truth is api/openapi.php, generated YAML lives at api/public/openapi.yaml, build via composer openapi:build. Trade-off recorded inline at the top of api/openapi.php. The recommended approach if drift becomes a problem is to switch to zircote/swagger-php annotations on the controllers — adding a dep and ~100 LOC of attributes, but turning the spec into a test of the controllers.
  • Doc-accuracy CI guard: ./scripts/check-doc-endpoints.sh. Run after editing either side. Templatizes literal IDs, IPs, and category slugs so /api/v1/admin/ips/{ip} in the spec covers /api/v1/admin/ips/203.0.113.42 in prose. Allowlist for known doc-only paths (e.g. /api/v1/openapi.yaml, the /api/v1/auth/oauth/* future-flow sketch) is at the top of the script.
  • examples/ scripts use the IRDB_URL and IRDB_TOKEN env vars uniformly; this is the convention for the operator-facing tooling. All shell scripts are shellcheck-clean.
  • The "future user-token flow" in doc/auth-flows.md (§ "Future: direct user tokens") is the recommended extension point for SPA / native / mobile UIs that don't want a BFF. Marked NOT IMPLEMENTED; not in OpenAPI; allowlisted in the doc-endpoint checker.
  • doc/oidc.md (created in M08) was deleted — its content is now in doc/auth-flows.md § "Entra setup walkthrough" with troubleshooting.
  • The tests/e2e/demo.sh runtime is gated on Docker availability; it's documented as the operator-driven smoke check, not run as part of composer test. CI integration is left for M14.
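
The templatizing step of the doc-accuracy guard can be illustrated with a short sketch. The real guard is a shell script; this Python version only shows the idea, the function name templatize and the {id} placeholder are hypothetical, and category-slug handling is omitted because it would need the list of known slugs.

```python
import ipaddress


def templatize(path: str) -> str:
    """Replace literal IPs and numeric IDs in a URL path with placeholders."""
    out = []
    for segment in path.split("/"):
        if segment.isdigit():
            out.append("{id}")  # hypothetical placeholder name
            continue
        try:
            ipaddress.ip_address(segment)  # accepts IPv4 and IPv6 literals
        except ValueError:
            out.append(segment)  # ordinary path segment, keep as-is
        else:
            out.append("{ip}")
    return "/".join(out)
```

With this, /api/v1/admin/ips/203.0.113.42 in prose maps to /api/v1/admin/ips/{ip}, which can then be compared against the paths declared in the OpenAPI document.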

RapiDoc choice: the viewer at /api/docs is RapiDoc 9.3.4 loaded from cdn.jsdelivr.net (pinned to that exact version). It is smaller than Stoplight Elements, supports try-it-now and authentication, and its dark theme matches the rest of the UI palette. The CDN is the single point of failure: in an offline deployment the viewer will not load, but the YAML itself remains available at /api/v1/openapi.yaml for local tools.

OpenAPI scope:

  • All Public endpoints (/api/v1/report, /api/v1/blocklist).
  • All Admin endpoints (/api/v1/admin/*), including audit, jobs status/trigger, and config.
  • Auth endpoints (/api/v1/auth/*) marked x-internal: true with a "UI BFF only" description.
  • /internal/jobs/* deliberately omitted per SPEC §M13.1 — private contract, scheduler-only.

The committed api/public/openapi.yaml validates clean against redocly/cli lint: 0 errors, 89 warnings (all stylistic — missing example blocks on schema fields, no info.license.url, etc.). The warnings are deferred to a future polish pass.

Schema: none. The doc-accuracy guard added a new shell script under scripts/; no DB changes.

Tests: 336 api / 78 ui pass; cs + stan clean on both. tests/e2e/demo.sh is shell-tested but not exercised end-to-end here (no Docker daemon at the host level during the milestone implementation phase). It's preserved for the next operator with a clean compose stack.

Deviations from SPEC:

  • M13.7 mentions a CI job for openapi validation and doc-endpoint accuracy. The scripts/check-doc-endpoints.sh script is in place; the openapi-lint and demo.sh aren't yet wired into scripts/ci.sh or a GitHub Actions workflow. CI integration deferred to M14 hardening.
  • The oidc-role-mappings admin REST endpoints aren't built (M03 owns the schema/repository; the admin-UI for editing the table is the M14 hardening item). Documentation no longer references the REST path; both the README and doc/auth-flows.md describe the SQL path with a note that the UI is forthcoming.

Added dependencies: none.

Added env vars: none.