# M07 — Policies & Distribution API

> Fresh Claude Code agent prompt. M06 must be complete and committed.
> Estimated effort: medium.

## Mission

Implement policy CRUD, the policy-vs-score evaluator, the public `GET /api/v1/blocklist` endpoint with caching/ETag/text-and-JSON formats, and a per-policy preview endpoint for the UI. By the end, three different policies produce three different blocklists from identical underlying data, and the endpoint serves 50k entries in <500 ms.

## Before you start

1. Verify M06:
   ```bash
   git log --oneline -6
   cd api && composer test && composer stan && cd ..
   ```
2. Read `SPEC.md` §4 (`policies`, `policy_category_thresholds`), §5 (output rule for an IP appearing on a policy's blocklist), §6 (Public API: `/api/v1/blocklist`; Admin API: policies + preview).
3. Confirm the seed policies from M02 exist with sensible thresholds.

## Tasks

### 1. Policy domain

In `api/src/Domain/Policy/`:

- `Policy.php` — value object: `id`, `name`, `description`, `includeManualBlocks`, `thresholds: array` (categoryId => threshold).
- `PolicyEvaluator.php`:
  - Constructor takes a `Policy` and the current `CidrEvaluator` from M06.
  - `evaluate(IpAddress $ip, array $scoresByCategory): EvaluationResult` — returns one of: `EXCLUDED_BY_ALLOWLIST`, `INCLUDED_BY_MANUAL_BLOCK`, `INCLUDED_BY_SCORE` (with the matching categories), or `EXCLUDED`.
  - The score-side rule: an IP is included if **any** category in the policy meets its threshold. `policy_category_thresholds` rows define inclusion; absent rows mean "this category is ignored by this policy."

In `api/src/Infrastructure/Db/PolicyRepository.php`:

- CRUD over `policies` and `policy_category_thresholds` (the join is small; load thresholds eagerly with each policy).
- `byName(string): ?Policy`, `byId(int): ?Policy`.
- Concurrent threshold updates: replace all thresholds for a policy in a single transaction.

### 2. Admin endpoints

In `api/src/Application/Admin/PoliciesController.php`:

- `GET /api/v1/admin/policies`
- `GET /api/v1/admin/policies/{id}` — includes thresholds.
- `POST /api/v1/admin/policies` — body `{name, description, include_manual_blocks, thresholds: {<category>: <threshold>}}`.
- `PATCH /api/v1/admin/policies/{id}` — same body shape; replaces thresholds wholesale.
- `DELETE /api/v1/admin/policies/{id}` — refuse if any consumer references this policy (409 with `{"error":"policy_in_use","consumers":[...]}`); cascade is wrong here.
- `GET /api/v1/admin/policies/{id}/preview` — returns `{count: int, sample: [string], generated_at}`. Sample = first 50 entries. Same calculation as the distribution endpoint.

RBAC: `Admin` for write, `Viewer` for read.

### 3. Distribution endpoint

In `api/src/Application/Public/BlocklistController.php`:

- `GET /api/v1/blocklist` — token must be `kind=consumer`. Resolves the consumer's policy, evaluates, returns the blocklist.
- Output formats:
  - Default: `text/plain`. One entry per line. No comments. Lines are bare IPs (`203.0.113.42`, `2001:db8::1`) or CIDRs (`203.0.113.0/24`, `2001:db8::/32`).
  - `?format=json`: JSON array of `{ip_or_cidr, categories: [string], score: number|null, reason: "scored"|"manual"}`.
  - Allowlisted IPs never appear in either format.
- Headers (both formats):
  - `ETag`: SHA-256 hex of the response body, sent as a quoted string. Honor `If-None-Match` → `304` with empty body.
  - `X-Blocklist-Generated-At`: ISO 8601.
  - `X-Blocklist-Entries`: count.
  - `X-Blocklist-Policy`: policy name.
- Caching: 30-second per-policy in-memory cache (key: `policyId`). Cache invalidation triggers: any mutation to `policies`, `policy_category_thresholds`, `manual_blocks`, `allowlist`, or a manual flag from M12's "rebuild scores" trigger. For simplicity now, just TTL — invalidation hooks into mutations come for free if you respect the same `CidrEvaluator` invalidation pattern from M06.

### 4. Blocklist computation

In `api/src/Domain/Reputation/BlocklistBuilder.php`:

- `build(Policy $policy): Blocklist` — returns a list of entries with metadata.
- Algorithm:
  1. Read all `ip_scores` rows joined to categories where the score column meets at least one threshold for this policy. Use either a single SQL query with a UNION across category thresholds, or a simpler "select all, filter in PHP" if the policy has few categories. Pick whichever is faster on a 50k-row dataset; benchmark.
  2. Filter out IPs in the allowlist (`CidrEvaluator::isAllowlisted`).
  3. If `include_manual_blocks`, append all manual block entries (single IPs and CIDRs), filtering allowlisted ones.
  4. Deduplicate (an IP might be both scored and manually blocked).
  5. Sort: IPv4 first, then IPv6; lexical within each. Stable order so the ETag is stable.
- Returns entries with the exact representation needed for both formats.

`Blocklist` value object: a list of `BlocklistEntry { ipOrCidr, isCidr, categories?, score?, reason }`.

### 5. Performance

Add a perf test in `api/tests/Integration/Perf/BlocklistPerfTest.php`:

- Seed 50k `ip_scores` rows (mixed v4 and v6, varied scores) plus 100 manual subnet blocks.
- Time the blocklist build for the `paranoid` policy.
- Assert <500 ms wall-clock.
- Skip in default test runs (mark `@group perf`); run in CI as a separate job.

If you can't hit 500 ms, the bottleneck is almost certainly the SQL query. Options:

- Add a covering index on `ip_scores(category_id, score DESC)` so threshold-filter scans are cheap.
- Pre-aggregate per-IP "max score across all categories" into a derived column in `ip_scores` (mild denormalization). Out of scope unless 500 ms is unreachable; document if you take this route.

## Implementation notes

- **Cache vs eviction**: per-policy 30s cache keyed by `policy_id`. Memory bound: if a deployment has 100 policies × 50k entries × ~50 bytes each, that's ~250 MB. Acceptable for default; flag in PROGRESS.md as a known footprint.
- **JSON format**: keep it small.
  Don't include audit/timestamp fields per entry; that's what the admin API is for.
- **Empty blocklist**: 200 with empty body in text mode, `[]` in JSON. Still emit ETag.
- **ETag stability**: the ETag must depend only on the data, not on time. Don't include `generated_at` in the body.
- **`If-None-Match`**: parse standard format including weak validators (`W/"..."`). Strict comparison on the strong hash is fine.
- **Deduplication subtlety**: if an IP is in `ip_scores` AND inside a manually blocked /24, you have two ways to include it (single + subnet). Prefer the broader one (the /24 subnet entry covers the IP); drop the single entry to keep the list compact.
- **Subnet expansion**: never expand a /16 to 65k entries. Emit as CIDR.

## Out of scope (DO NOT)

- UI changes — M08 onward.
- Audit emission — M12.
- Format generators for specific firewalls (iptables, nginx, HAProxy). The `text/plain` output is universal; per-firewall transformation is a client-side concern, with examples shipped in M13's `examples/consumers/`.
- Compression (gzip) — let FrankenPHP/Caddy handle it via standard headers if needed; don't roll your own.
- Streaming responses — buffered text response is fine at 50k entries.
- New dependencies.

## Acceptance

```bash
cd api && composer cs && composer stan && composer test && cd ..
cd api && vendor/bin/phpunit --group perf && cd ..
docker compose down -v
cp .env.example .env
docker compose up -d
sleep 15

ADMIN_TOKEN=$(docker compose exec -T api php bin/console auth:create-token --kind=admin --role=admin --quiet)

# Create a consumer + token (requires a policy_id; use the seeded "moderate")
POLICY_ID=$(curl -s -H "Authorization: Bearer $ADMIN_TOKEN" \
  http://localhost:8081/api/v1/admin/policies \
  | php -r '$j=json_decode(stream_get_contents(STDIN),true); foreach($j["items"] as $p){if($p["name"]==="moderate"){echo $p["id"];break;}}')
CONSUMER=$(curl -s -X POST -H "Authorization: Bearer $ADMIN_TOKEN" -H "Content-Type: application/json" \
  -d "{\"name\":\"firewall-1\",\"description\":\"edge\",\"policy_id\":$POLICY_ID}" \
  http://localhost:8081/api/v1/admin/consumers)
CONSUMER_ID=$(echo "$CONSUMER" | php -r 'echo json_decode(stream_get_contents(STDIN),true)["id"];')
TOKEN_RESP=$(curl -s -X POST -H "Authorization: Bearer $ADMIN_TOKEN" -H "Content-Type: application/json" \
  -d "{\"kind\":\"consumer\",\"consumer_id\":$CONSUMER_ID}" \
  http://localhost:8081/api/v1/admin/tokens)
CONSUMER_TOKEN=$(echo "$TOKEN_RESP" | php -r 'echo json_decode(stream_get_contents(STDIN),true)["raw_token"];')

# Empty blocklist initially
curl -s -H "Authorization: Bearer $CONSUMER_TOKEN" http://localhost:8081/api/v1/blocklist
# -> empty body, 200

# Insert a manual block; blocklist now contains it
curl -s -X POST -H "Authorization: Bearer $ADMIN_TOKEN" -H "Content-Type: application/json" \
  -d '{"kind":"subnet","cidr":"198.51.100.0/24","reason":"x"}' \
  http://localhost:8081/api/v1/admin/manual-blocks > /dev/null
sleep 1
curl -s -H "Authorization: Bearer $CONSUMER_TOKEN" http://localhost:8081/api/v1/blocklist | grep -q "198.51.100.0/24"

# JSON format
curl -s -H "Authorization: Bearer $CONSUMER_TOKEN" \
  "http://localhost:8081/api/v1/blocklist?format=json" | grep -q '"reason":"manual"'

# ETag round-trip
ETAG=$(curl -s -D - -H "Authorization: Bearer $CONSUMER_TOKEN" \
  http://localhost:8081/api/v1/blocklist -o /dev/null | grep -i '^etag:' | cut -d' ' -f2 | tr -d '\r')
test "$(curl -s -o /dev/null -w '%{http_code}' -H "Authorization: Bearer $CONSUMER_TOKEN" \
  -H "If-None-Match: $ETAG" http://localhost:8081/api/v1/blocklist)" = "304"

# Three policies, three different counts after seeding scored data
# (Seed at least one IP with a high enough score that paranoid catches it but strict doesn't.)
# Detailed seeding handled by an integration test; here just verify the preview endpoint differs:
for P in strict moderate paranoid; do
  PID=$(curl -s -H "Authorization: Bearer $ADMIN_TOKEN" http://localhost:8081/api/v1/admin/policies \
    | php -r "\$j=json_decode(stream_get_contents(STDIN),true); foreach(\$j['items'] as \$p){if(\$p['name']==='$P'){echo \$p['id'];break;}}")
  curl -s -H "Authorization: Bearer $ADMIN_TOKEN" \
    http://localhost:8081/api/v1/admin/policies/$PID/preview
  echo
done

# Token wrong kind: admin can't pull blocklist
test "$(curl -s -o /dev/null -w '%{http_code}' -H "Authorization: Bearer $ADMIN_TOKEN" \
  http://localhost:8081/api/v1/blocklist)" = "401"

docker compose down -v
```

## Handoff

1. Commit:
   ```
   feat(M07): policies, blocklist distribution endpoint

   - policy CRUD with thresholds (replaces wholesale on PATCH)
   - GET /api/v1/blocklist (text + json), ETag with If-None-Match round-trip
   - per-policy 30s cache, invalidated on relevant mutations
   - BlocklistBuilder with allowlist filtering and manual-block dedup
   - perf test: 50k entries < 500ms (sqlite)
   ```
2. Append to `PROGRESS.md`:
   ```markdown
   ## M07 — Policies & distribution (done)

   **Built:** policy CRUD, blocklist endpoint, preview endpoint, ETag, perf-tested at 50k entries.

   **Notes for next milestone:**
   - Per-policy cache TTL = 30s. Mutation endpoints invalidate the cache for affected policies.
   - The text/plain format is universal; firewall-specific consumers transform on their side. Examples land in M13.
   - DELETE on a policy with consumers returns 409 with the consumer list.
   - Performance: SQLite hits the 500ms target with [add measured number]. MySQL [add measured number].

   **Deviations from SPEC:** [list any, e.g. additional index added]

   **Added dependencies:** none.
   ```
3. **Stop.** Do not start M08.
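
Non-normative sketch for the score-side rule in Task 1 ("included if any category in the policy meets its threshold"). It assumes that a higher score is worse and that "meets" means `>=`; the function name `matchingCategories` and the sample thresholds are hypothetical and not taken from SPEC.md.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper illustrating the "any category meets its threshold" rule.
 *
 * @param array<int, float> $thresholds       categoryId => threshold (one per policy_category_thresholds row)
 * @param array<int, float> $scoresByCategory categoryId => current score for a single IP
 * @return list<int> category ids whose score meets the policy threshold
 */
function matchingCategories(array $thresholds, array $scoresByCategory): array
{
    $matches = [];
    // Only categories with a threshold row are iterated, so absent rows are ignored.
    foreach ($thresholds as $categoryId => $threshold) {
        if (isset($scoresByCategory[$categoryId]) && $scoresByCategory[$categoryId] >= $threshold) {
            $matches[] = $categoryId;
        }
    }
    return $matches;
}

// Made-up numbers: the same scores give different results under different policies.
$strict   = [1 => 90.0];
$paranoid = [1 => 40.0, 2 => 40.0];
$scores   = [1 => 55.0, 2 => 10.0, 3 => 99.0];

var_dump(matchingCategories($strict, $scores) === []);    // bool(true): 55 < 90
var_dump(matchingCategories($paranoid, $scores) === [1]); // bool(true): category 3 has no threshold row
```

A non-empty result would correspond to `INCLUDED_BY_SCORE` with the matching categories; the allowlist and manual-block checks are assumed to run before this step.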
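
Non-normative sketch for the ETag notes above (stable hash of the body, `If-None-Match` with weak validators). The function names `blocklistEtag` and `ifNoneMatchMatches` are hypothetical; the sketch assumes the ETag is emitted as a quoted SHA-256 hex string and that splitting on commas is safe because the tags are plain hex.

```php
<?php
declare(strict_types=1);

/** Quoted SHA-256 hex of the body, so the tag depends only on the data. */
function blocklistEtag(string $body): string
{
    return '"' . hash('sha256', $body) . '"';
}

/**
 * True when an If-None-Match header value matches the current ETag.
 * Accepts "*", comma-separated lists, and weak validators (W/"...").
 */
function ifNoneMatchMatches(?string $header, string $currentEtag): bool
{
    if ($header === null || trim($header) === '') {
        return false;
    }
    if (trim($header) === '*') {
        return true;
    }
    foreach (explode(',', $header) as $candidate) {
        $candidate = trim($candidate);
        // Drop the weak prefix, then compare the remaining quoted tag exactly.
        if (str_starts_with($candidate, 'W/')) {
            $candidate = substr($candidate, 2);
        }
        if ($candidate === $currentEtag) {
            return true;
        }
    }
    return false;
}

$body = "198.51.100.0/24\n203.0.113.42\n";
$etag = blocklistEtag($body);

var_dump(ifNoneMatchMatches($etag, $etag));              // bool(true)
var_dump(ifNoneMatchMatches('W/' . $etag, $etag));       // bool(true)
var_dump(ifNoneMatchMatches('"stale", ' . $etag, $etag)); // bool(true)
var_dump(ifNoneMatchMatches('"stale"', $etag));          // bool(false)
```

When the helper returns true, the controller would answer `304` with an empty body; an empty blocklist still gets a tag because the hash of an empty string is well defined.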
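
Non-normative sketch for the deduplication and sort rules (drop a single IP that a manual subnet already covers; IPv4 before IPv6, lexical within each). `cidrContains` and `compactAndSort` are hypothetical names, and entries are plain strings here rather than `BlocklistEntry` objects. The nested scan is O(IPs × subnets) and is only meant to show the rule; at the 50k target it would probably need a pre-parsed or indexed lookup, or M06's `CidrEvaluator`.

```php
<?php
declare(strict_types=1);

/** True when $ip lies inside $cidr (IPv4 or IPv6); false on address-family mismatch. */
function cidrContains(string $cidr, string $ip): bool
{
    [$network, $prefix] = explode('/', $cidr, 2);
    $net  = inet_pton($network);
    $addr = inet_pton($ip);
    if ($net === false || $addr === false || strlen($net) !== strlen($addr)) {
        return false;
    }
    $bits      = (int) $prefix;
    $fullBytes = intdiv($bits, 8);
    if (strncmp($net, $addr, $fullBytes) !== 0) {
        return false; // a whole byte of the prefix differs
    }
    $rest = $bits % 8;
    if ($rest === 0) {
        return true;
    }
    $mask = (0xFF << (8 - $rest)) & 0xFF; // leading bits of the partial byte
    return (ord($net[$fullBytes]) & $mask) === (ord($addr[$fullBytes]) & $mask);
}

/**
 * @param list<string> $scoredIps     single IPs that met a threshold
 * @param list<string> $manualEntries manual blocks: single IPs or CIDRs
 * @return list<string> deduplicated entries in a stable order
 */
function compactAndSort(array $scoredIps, array $manualEntries): array
{
    $cidrs   = array_values(array_filter($manualEntries, fn (string $e): bool => str_contains($e, '/')));
    $singles = array_filter(
        array_merge($scoredIps, array_diff($manualEntries, $cidrs)),
        function (string $ip) use ($cidrs): bool {
            foreach ($cidrs as $cidr) {
                if (cidrContains($cidr, $ip)) {
                    return false; // the broader subnet entry already covers this IP
                }
            }
            return true;
        }
    );
    $entries = array_values(array_unique(array_merge($cidrs, $singles)));
    usort($entries, function (string $a, string $b): int {
        $aV6 = str_contains($a, ':');
        $bV6 = str_contains($b, ':');
        // IPv4 (false) sorts before IPv6 (true); plain byte order within a family.
        return $aV6 === $bV6 ? strcmp($a, $b) : ($aV6 <=> $bV6);
    });
    return $entries;
}

print_r(compactAndSort(
    ['203.0.113.42', '198.51.100.7', '2001:db8::1'],
    ['198.51.100.0/24', '203.0.113.42']
));
// Entries, in order: 198.51.100.0/24, 203.0.113.42, 2001:db8::1
```

Because the order depends only on the entry strings, rebuilding the list from the same data yields the same body and therefore the same ETag.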