# M14 — Security Hardening

> Fresh Claude Code agent prompt. M13 must be complete and committed.
> Estimated effort: medium.

## Mission

Harden both containers: security headers, full brute-force lockout for local admin, an audit of secret scrubbing in logs, token entropy verification, backup guidance verification, and expired-manual-block cleanup. By the end, a security review checklist passes.

## Before you start

1. Verify M13:
   ```bash
   git log --oneline -13
   ```
2. Read `SPEC.md` §8 (auth, especially CSRF/sessions), §10 (backup notes), §12 M14 (the hardening milestone, your reference).
3. The OWASP Top 10 is a useful mental model for this milestone. Don't take it as a checklist; do treat it as "did I think about each of these?"

## Tasks

### 1. Security headers

In both the `api` and `ui` Caddy configs, add a header bundle on every response:

- `Strict-Transport-Security: max-age=31536000; includeSubDomains`, only when `APP_ENV=production`. Don't send HSTS in dev or you'll lock yourself out of localhost.
- `X-Content-Type-Options: nosniff`
- `X-Frame-Options: DENY` (UI) / `X-Frame-Options: SAMEORIGIN` (api)
- `Referrer-Policy: strict-origin-when-cross-origin`
- `Permissions-Policy: geolocation=(), microphone=(), camera=()`
- **CSP for the UI**:
  - `default-src 'self'`
  - `script-src 'self' 'wasm-unsafe-eval'` (Alpine's CSP build doesn't need `'unsafe-eval'`, though its standard build does; only allow `'unsafe-eval'` if a build dep demands it)
  - `style-src 'self' 'unsafe-inline'` (Tailwind is compiled, but inline styles are used for dynamic things like score bars)
  - `img-src 'self' data:` (`data:` for tiny inline icons)
  - `connect-src 'self'` (needed if the UI ever makes direct browser→api calls; it doesn't today, but htmx might add one)
  - `frame-ancestors 'none'`
  - `base-uri 'self'`
  - `form-action 'self'`
  - Test that the UI doesn't violate its own CSP. Run a browser, check the console, and fix any violations by either tightening the page's HTML or relaxing the CSP minimally, with a comment justifying each relaxation.
- **CSP for the api**: very restrictive (`default-src 'none'; frame-ancestors 'none'`), since the api serves only JSON, the OpenAPI viewer, and YAML. The `/api/docs` page does need styles and scripts for RapiDoc/Elements; relax the CSP only on that route.

### 2. Local admin brute-force lockout (full)

Replace M08's basic 5/30s throttle with a progressive lockout held in memory:

- Track failed attempts per `(LOCAL_ADMIN_USERNAME, source_ip)` pair in a small in-memory store (singleton service in the ui container) plus the session.
- Failure progression: 1–4 failures fast retry; 5 failures → 1-minute lockout; 10 → 5-minute lockout; 15+ → 30-minute lockout. Reset the counter on a successful login.
- Key the lock on the username and the source IP together, not on the username alone, so an attacker can't lock the legitimate admin out from another IP.
- Log every failure at WARN and every lockout at ERROR, with the source IP. Don't log the attempted password.
- Document in `doc/auth-flows.md` (update from M13), including how to clear a lockout (restart the ui container, since the state is in memory; the lockout is intentionally short enough that this is rarely needed).

### 3. Token entropy verification

In `api/tests/Unit/Auth/`:

- `TokenEntropyTest.php` generates 1000 tokens and asserts that each carries ≥160 bits of random payload (32 base32 chars × 5 bits = 160 bits) and that all 1000 are distinct.
- Verifies the format `irdb_<3>_<32 base32 chars>`.
- Confirms `random_bytes` (CSPRNG) is the source.

### 4. Logs scrubbed of secrets

- Audit all log output paths. Search the codebase for places that might log:
  - Bearer tokens (any `Authorization` header content).
  - `LOCAL_ADMIN_PASSWORD_HASH`.
  - `OIDC_CLIENT_SECRET`.
  - `MAXMIND_LICENSE_KEY`.
  - Database passwords.
- Add a Monolog processor that scrubs known-sensitive keys from the context array before formatting. Pattern:
  ```
  ['authorization' => 'Bearer abc...']  →  ['authorization' => 'Bearer ***']
  ```
- Add a test that constructs a log record with a Bearer token in context and asserts the formatted output is scrubbed.

### 5. Expired manual block cleanup

A small loose end from M06: manual blocks have `expires_at` but nothing prunes expired ones. Two approaches:

- **Filter at read time**: every read of `manual_blocks` ignores rows with `expires_at < now`. The CidrEvaluator may already do this; verify, and fix if not. Pros: zero new infrastructure. Cons: rows accumulate.
- **Add a cleanup job**: register `CleanupExpiredManualBlocksJob` that deletes them daily.

Recommended: do both. Filter at read for correctness, prune in a daily job for tidiness.

If adding a job: register it, add an audit entry per delete, and verify with a test.

### 6. Rate limiting beyond the public API

- The current rate limiter applies only to public API endpoints. Add a soft limit to login attempts on the UI (covered by task 2 above).
- Consider whether admin endpoints need a limit. Real abuse on admin endpoints is rare (Bearer-authed humans/UI). Leave admin endpoints without a rate limit unless you measure a problem.
- Document the rate-limit posture in `doc/api-overview.md` (update from M13).

### 7. Backups

Verify M13's README has clear instructions for:

- **SQLite + Docker volume**: `docker run --rm -v irdb-data:/data -v "$(pwd)":/backup alpine tar czf /backup/irdb-backup.tar.gz -C /data .`, plus a description of the equivalent restore.
- **MySQL**: a `mysqldump` example via `docker compose exec`.
- **Restore**: the inverse, with the api container stopped during restore.
- **What NOT to back up**: rotating tokens (they're recoverable), GeoIP DBs (re-downloadable).

Add to `doc/architecture.md` (update from M13): a "Disaster Recovery" subsection covering the same.

### 8. Secrets at rest verification

- Confirm tokens are never stored in plaintext (M03 work; verify with a manual SQL inspection).
- Confirm no secret values appear in `audit_log.payload`.
- Confirm `/api/v1/admin/config` masks all the secrets it should (M12).
- Add a regression test that scans the schema for any column literally named `password` or containing `_secret` and asserts that none stores unhashed values (a best-effort sanity check).

### 9. Dependency vulnerability scan

- Add a CI job: `composer audit` (PHP) and `npm audit --omit=dev` (UI). Fail on critical/high findings (for npm, `--audit-level=high` sets that threshold).
- Document the policy: when an audit fails, an admin reviews and either patches or accepts with a documented exception.

### 10. Final security review checklist

Add `doc/security.md` capturing the actual posture: authn, authz, transport, data at rest, secrets management, logging, rate limits, supply chain. Concrete, factual, ≤300 lines. Do **not** make claims you can't back up.

## Implementation notes

- **CSP iteration**: enable in "Report-Only" mode first if you want a faster cycle (`Content-Security-Policy-Report-Only`), check the browser console, then switch to enforcing.
- **HSTS gotcha**: HSTS is sticky in browsers. If you turn it on in dev with `localhost`, you may break local development for yourself. Gate strictly on `APP_ENV=production`.
- **Brute-force lockout vs UX**: if the lockout is too aggressive, legitimate admins lock themselves out. The 1/5/30-minute progression is moderate. Don't go to "permanent ban": the local admin path is a recovery channel, not a daily-use channel.
- **Auditing the auditor**: changes to `audit_log` config (retention, etc.) should themselves be audited. Verify the M12 emitter wraps any settings endpoint that touches audit retention.
- **Don't introduce new attack surface in the name of "hardening"**: e.g., don't add a "lockout-clear" endpoint reachable from the API. Reset is via container restart; that's safer.

## Out of scope (DO NOT)

- WAF rules, IPS integration, fail2ban for the admin UI itself.
- 2FA on local admin. Use OIDC for that.
- mTLS between containers. The Docker network isolation is the trust boundary; documenting that is enough.
- Penetration test report. The agent is not a pentester.
- Encryption at rest of the SQLite file. The volume's host-level disk encryption is the right layer.
- Audit log signing / tamper-evidence. Future work.

## Acceptance

```bash
set -e  # abort on the first failed check

cd api && composer cs && composer stan && composer test && cd ..
cd ui && composer cs && composer stan && composer test && cd ..

# composer + npm audit
cd api && composer audit && cd ..
cd ui && npm ci && npm audit --omit=dev && cd ..

docker compose down -v
cp .env.example .env
docker compose up -d
sleep 15

# Security headers present on UI
HEADERS=$(curl -sI http://localhost:8080/login)
echo "$HEADERS" | grep -qi "X-Content-Type-Options: nosniff"
echo "$HEADERS" | grep -qi "X-Frame-Options: DENY"
echo "$HEADERS" | grep -qi "Content-Security-Policy:"
echo "$HEADERS" | grep -qi "Referrer-Policy:"

# Headers on API
HEADERS=$(curl -sI http://localhost:8081/healthz)
echo "$HEADERS" | grep -qi "X-Content-Type-Options: nosniff"
echo "$HEADERS" | grep -qi "X-Frame-Options:"

# In production mode, HSTS appears (skip if not testing prod)
# HEADERS=$(APP_ENV=production curl -sI ...)  # manual

# Local admin lockout: 5 fails should trigger lockout
COOKIE=$(mktemp)
for i in 1 2 3 4 5; do
  CSRF=$(curl -s -c "$COOKIE" http://localhost:8080/login | grep -oE 'name="csrf_token" value="[^"]+"' | cut -d'"' -f4)
  curl -s -b "$COOKIE" -c "$COOKIE" -X POST \
    -d "csrf_token=$CSRF&username=admin&password=WRONG" \
    http://localhost:8080/login/local > /dev/null
done
# Even the correct password must now be rejected with a lockout message
CSRF=$(curl -s -c "$COOKIE" http://localhost:8080/login | grep -oE 'name="csrf_token" value="[^"]+"' | cut -d'"' -f4)
RESP=$(curl -s -b "$COOKIE" -c "$COOKIE" -X POST \
  -d "csrf_token=$CSRF&username=admin&password=test1234" \
  http://localhost:8080/login/local -L)
echo "$RESP" | grep -Eqi "locked|too many|wait"

# Bearer tokens never appear unmasked in logs
docker compose logs 2>&1 | grep -E "Bearer irdb_(rep|con|adm|svc)_[A-Z2-7]+" && \
  { echo "TOKEN LEAKED IN LOGS"; exit 1; } || true

# Token entropy test passes
cd api && vendor/bin/phpunit --filter TokenEntropyTest && cd ..

# Expired manual block test (insert one with a past expires_at, run cleanup, verify it's gone or filtered)
ADMIN_TOKEN=$(docker compose exec -T api php bin/console auth:create-token --kind=admin --role=admin --quiet)
INTERNAL_TOKEN=$(grep ^INTERNAL_JOB_TOKEN= .env | cut -d= -f2-)  # for triggering the cleanup job, if you added one
curl -s -X POST -H "Authorization: Bearer $ADMIN_TOKEN" -H "Content-Type: application/json" \
  -d '{"kind":"ip","ip":"203.0.113.250","reason":"expired test","expires_at":"2020-01-01T00:00:00Z"}' \
  http://localhost:8081/api/v1/admin/manual-blocks > /dev/null
# Run cleanup if you added a job; otherwise just verify the read-time filter.
# The expired block must not be listed:
if curl -s -H "Authorization: Bearer $ADMIN_TOKEN" \
  http://localhost:8081/api/v1/admin/manual-blocks | grep -q "203.0.113.250"; then
  echo "EXPIRED BLOCK STILL LISTED"; exit 1
fi

# Quick CSP smoke test: load the UI in headless chrome (manual or via puppeteer in CI), no CSP violations
# (omit if no headless browser available; rely on developer manual verification)

docker compose down -v
```

## Handoff

1. Commit:
   ```
   feat(M14): security hardening

   - CSP, HSTS (prod), X-Content-Type-Options, X-Frame-Options, Referrer-Policy
   - local admin brute-force lockout (1/5/30 progression, by user+ip)
   - log scrubbing of Bearer tokens and known secrets via Monolog processor
   - token entropy regression test
   - expired manual block read-time filter + daily cleanup job
   - composer audit + npm audit in CI
   - doc/security.md describing posture; backup/restore in README and architecture.md
   ```

2. Append to `PROGRESS.md`:
   ```markdown
   ## M14 — Hardening (done)

   **Built:** security headers, lockout, log scrubbing, audits, doc/security.md.

   **Production checklist (run before exposing to internet):**
   - APP_ENV=production
   - Real OIDC tenant configured
   - Strong LOCAL_ADMIN_PASSWORD_HASH or LOCAL_ADMIN_ENABLED=false
   - Reverse proxy with TLS in front
   - Backups configured
   - composer audit / npm audit clean
   - Logs piped to your aggregator
   - MAXMIND_LICENSE_KEY set so refresh-geoip works
   - Scheduler running (host cron / systemd / sidecar)

   **Known limitations:**
   - In-process rate limiter and lockout state are per-replica.
   - Audit log is append-only but not tamper-evident; sign+chain is future work.
   - No 2FA on local admin (use OIDC instead).

   **Build complete.** All 14 milestones executed.
   ```

3. **Stop.** Final milestone reached.
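
For reference while reviewing task 2, the failure progression can be sketched as a threshold lookup. This is an illustrative shell sketch only: the real logic belongs in the ui container's PHP service, `lockout_seconds` is a hypothetical helper name, and it assumes that failure counts between the named thresholds (for example 6 to 9) inherit the lower tier's duration, which the task text does not state explicitly.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a consecutive-failure count to a lockout
# duration in seconds, following the 1/5/30-minute tiers from task 2.
# Assumption: counts between thresholds keep the lower tier's duration.
lockout_seconds() {
  local failures=$1
  if (( failures >= 15 )); then
    echo 1800   # 30-minute lockout
  elif (( failures >= 10 )); then
    echo 300    # 5-minute lockout
  elif (( failures >= 5 )); then
    echo 60     # 1-minute lockout
  else
    echo 0      # 1-4 failures: immediate retry allowed
  fi
}

# Print the tier for a few representative failure counts.
for n in 1 4 5 9 10 14 15 20; do
  printf '%2d failures -> %4ds lockout\n' "$n" "$(lockout_seconds "$n")"
done
```

A table like this is also a convenient source of cases for the lockout unit test: one assertion per boundary (4, 5, 9, 10, 14, 15).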