M14 — Security Hardening

Fresh Claude Code agent prompt. M13 must be complete and committed. Estimated effort: medium.

Mission

Harden both containers: security headers, full brute-force lockout for local admin, audit secret-scrubbing in logs, token entropy verification, backup guidance verification, expired-manual-block cleanup. By the end, a security review checklist passes.

Before you start

Verify M13:
```
git log --oneline -13
```
Read SPEC.md §8 (auth, especially CSRF/sessions), §10 (backup notes), §12 M14 (the hardening milestone — your reference).
The OWASP top 10 is a useful mental model for this milestone. Don't take it as a checklist; do treat it as "did I think about each of these?"

Tasks

1. Security headers

In both api and ui Caddy configs, add a header bundle on every response:

Strict-Transport-Security: max-age=31536000; includeSubDomains — only when APP_ENV=production. Don't HSTS in dev or you'll lock yourself out of localhost.
X-Content-Type-Options: nosniff
X-Frame-Options: DENY (UI) / X-Frame-Options: SAMEORIGIN (api)
Referrer-Policy: strict-origin-when-cross-origin
Permissions-Policy: geolocation=(), microphone=(), camera=()
CSP for the UI:
- default-src 'self'
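- script-src 'self' 'wasm-unsafe-eval' (note that Alpine's standard build evaluates attribute expressions with the Function constructor and needs 'unsafe-eval'; prefer the CSP-friendly build, @alpinejs/csp, and allow 'unsafe-eval' only if a build dep demands it)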
- style-src 'self' 'unsafe-inline' (Tailwind compiled, but inline styles for dynamic things like score bars)
- img-src 'self' data: (data: for tiny inline icons)
- connect-src 'self'; add <API_BASE_URL> only if the UI ever makes direct browser→api calls (it doesn't today, but htmx might add one)
- frame-ancestors 'none'
- base-uri 'self'
- form-action 'self'
- Test that the UI doesn't violate its own CSP. Run a browser, check the console, fix any violations by either tightening the page's HTML or relaxing CSP minimally with comment justification.
CSP for the api: very restrictive (default-src 'none'; frame-ancestors 'none') since the api serves only JSON, the OpenAPI viewer, and YAML. The /api/docs page does need styles+scripts for RapiDoc/Elements; relax CSP only on that route.
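
As a sketch, the UI directives above can be joined into a single header value before pasting it into the Caddy config. The helper name csp_ui is hypothetical, and connect-src is left at 'self' on the assumption that no direct browser→api calls exist yet.

```shell
# Hypothetical helper: prints the UI Content-Security-Policy value built from
# the directives listed above, separated by "; ".
csp_ui() {
  printf '%s; ' \
    "default-src 'self'" \
    "script-src 'self' 'wasm-unsafe-eval'" \
    "style-src 'self' 'unsafe-inline'" \
    "img-src 'self' data:" \
    "connect-src 'self'" \
    "frame-ancestors 'none'" \
    "base-uri 'self'" \
    "form-action 'self'" | sed 's/; $//'   # drop the trailing separator
}

echo "Content-Security-Policy: $(csp_ui)"
```

Keeping one directive per line makes each later relaxation a one-line diff, which is where the comment justification can live.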

2. Local admin brute-force lockout (full)

Replace M08's basic 5/30s throttle with a progressive lockout that persists across requests (but not across restarts, since the store is in-memory):

Track failed attempts per (LOCAL_ADMIN_USERNAME, source_ip) pair in a small in-memory store (singleton service in the ui container) plus the session.
Failure progression: 1–4 failures fast retry; 5 failures → 1-minute lockout; 10 → 5-minute lockout; 15+ → 30-minute lockout. Reset the counter on a successful login.
Key lockouts on the (username, IP) pair, not on the username alone, so an attacker can't lock the legitimate admin out from another IP.
Log every failure at WARN, every lockout at ERROR, with the source IP. Don't log the attempted password.
Document in doc/auth-flows.md (update from M13) — including how to clear a lockout (restart the ui container, since this is in-memory; the lockout is intentionally short enough that this is rarely needed).
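
The failure progression can be stated as a small lookup, sketched here in shell to pin down the thresholds (the real implementation lives in the ui container's PHP service). The function name lockout_seconds is hypothetical. It assumes every failure at or past a threshold triggers that tier's lockout, so failures 5 through 9 each cost one minute; the text above does not say whether intermediate failures re-lock.

```shell
# Hypothetical helper: maps a failure count for one (username, IP) pair
# to a lockout duration in seconds. 0 means "fast retry".
lockout_seconds() {
  failures=$1
  if [ "$failures" -ge 15 ]; then
    echo 1800   # 30-minute lockout
  elif [ "$failures" -ge 10 ]; then
    echo 300    # 5-minute lockout
  elif [ "$failures" -ge 5 ]; then
    echo 60     # 1-minute lockout
  else
    echo 0      # 1-4 failures: no lockout
  fi
}

lockout_seconds 12   # prints 300
```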

3. Token entropy verification

In api/tests/Unit/Auth/:

TokenEntropyTest.php: generates 1000 tokens and asserts they are all distinct. A sample cannot measure entropy directly, so the ≥160-bit requirement is checked structurally: 32 base32 characters at 5 bits each.
Verifies the format irdb_<3>_<32 base32 chars>.
Confirms random_bytes (CSPRNG) is the source.
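
For a quick check outside PHPUnit, the format rule can be sketched as a shell predicate. The helper name is_valid_token is hypothetical, and the kind list (rep, con, adm, svc) and the uppercase A-Z2-7 alphabet are assumptions taken from the log-leak grep under Acceptance.

```shell
# Hypothetical helper: succeeds when $1 matches irdb_<kind>_<32 base32 chars>.
# LC_ALL=C keeps [A-Z] from matching lowercase letters in some locales.
is_valid_token() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^irdb_(rep|con|adm|svc)_[A-Z2-7]{32}$'
}

# Each base32 character encodes 5 bits, so the random part carries 32 * 5 bits.
echo "random bits per token: $((32 * 5))"
```

The predicate only checks shape. Confirming that random_bytes is the source still needs the unit test.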

4. Logs scrubbed of secrets

Audit all log output paths. Search the codebase for places that might log:
- Bearer tokens (any Authorization header content).
- LOCAL_ADMIN_PASSWORD_HASH.
- OIDC_CLIENT_SECRET.
- MAXMIND_LICENSE_KEY.
- Database passwords.
Add a Monolog processor that scrubs known-sensitive keys from the context array before formatting. Pattern:
```
['authorization' => 'Bearer abc...'] → ['authorization' => 'Bearer ***']
```
Add a test that constructs a log record with a Bearer token in context and asserts the formatted output is scrubbed.
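
The substitution itself is small. As a sketch, here is the same masking as a shell filter, handy for spot-checking captured log output. The name scrub_bearer is hypothetical, and the character class is an assumption covering common token characters. The Monolog processor would apply the equivalent replacement to context values rather than to formatted text.

```shell
# Hypothetical filter: masks whatever follows "Bearer " on each input line.
# The class stops at quotes and whitespace, so JSON-formatted logs stay valid.
scrub_bearer() {
  sed -E 's#(Bearer )[A-Za-z0-9._~+/=-]+#\1***#g'
}

printf '%s\n' 'authorization: Bearer irdb_adm_EXAMPLE' | scrub_bearer
```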

5. Expired manual block cleanup

A small loose end from M06: manual blocks have expires_at but nothing prunes expired ones. Two approaches:

Filter at read time: every read of manual_blocks ignores rows with expires_at < now. The CidrEvaluator may already do this; verify, and fix it if not. Pros: zero new infrastructure. Cons: rows accumulate.
Add a cleanup job: register CleanupExpiredManualBlocksJob that deletes them daily.

Recommended: do both. Filter at read for correctness, prune in a daily job for tidiness.

If adding a job: register it, add an audit entry per delete, verify with a test.

6. Rate limiting beyond the public API

The current rate limiter applies only to public API endpoints. Add a soft limit to login attempts on the UI (covered by Task 2 above).
Consider whether admin endpoints need a limit. Real abuse on admin endpoints is rare (Bearer-authed humans/UI). Leave admin endpoints without a rate limit unless you measure a problem.
Document the rate-limit posture in doc/api-overview.md (update from M13).

7. Backups

Verify M13's README has clear instructions for:

SQLite + Docker volume: docker run --rm -v irdb-data:/data -v $(pwd):/backup alpine tar czf /backup/irdb-backup.tar.gz -C /data . — describe the equivalent restore.
MySQL: mysqldump example via docker compose exec.
Restore: the inverse, with the api container stopped during restore.
What NOT to back up: rotating tokens (they're recoverable), GeoIP DBs (re-downloadable).

Add to doc/architecture.md (update from M13): a "Disaster Recovery" subsection covering the same.
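
The restore is the same tar call with x in place of c. A minimal sketch of the two invocations, using hypothetical helper names, with a plausible Docker wrapping shown in comments (the volume and archive names are the ones from the backup command above):

```shell
# Hypothetical helpers showing only the tar invocations.
backup_dir() {   # usage: backup_dir <data_dir> <archive>
  tar czf "$2" -C "$1" .
}
restore_dir() {  # usage: restore_dir <archive> <data_dir>
  mkdir -p "$2" && tar xzf "$1" -C "$2"
}

# Inside Docker, the restore would plausibly mirror the backup command,
# with the api container stopped first:
#   docker compose stop api
#   docker run --rm -v irdb-data:/data -v "$(pwd)":/backup alpine \
#     tar xzf /backup/irdb-backup.tar.gz -C /data
#   docker compose start api
```

Extracting over a non-empty volume leaves files that are absent from the archive in place, so the README should say whether to empty the volume first.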

8. Secrets at rest verification

Confirm tokens are never stored in plaintext (M03 work; verify with a manual SQL inspection).
Confirm no secret values appear in audit_log.payload.
Confirm /api/v1/admin/config masks all the secrets it should (M12).
Add a regression test that scans the schema for any column literally named password or containing _secret and asserts none store unhashed values (best-effort sanity check).

9. Dependency vulnerability scan

Add a CI job: composer audit (PHP) and npm audit --omit=dev (UI). Fail on critical/high; for npm, --audit-level=high makes the command exit non-zero only at that severity or above.
Document the policy: when an audit fails, an admin reviews and either patches or accepts with a documented exception.

10. Final security review checklist

Add doc/security.md capturing the actual posture: authn, authz, transport, data at rest, secrets management, logging, rate limits, supply chain. Concrete, factual, ≤300 lines. Do not make claims you can't back up.

Implementation notes

CSP iteration: enable in "Report-Only" mode first if you want a faster cycle (Content-Security-Policy-Report-Only), check the browser console, then switch to enforcing.
HSTS gotcha: HSTS is sticky in browsers. If you turn it on in dev with localhost, you may break local development for yourself. Gate strictly on APP_ENV=production.
Brute-force lockout vs UX: too aggressive = legit admins lock themselves out. The 1/5/30 progression is moderate. Don't go to "permanent ban" — the local admin path is a recovery channel, not a daily-use channel.
Auditing the auditor: changes to audit_log config (retention, etc.) should themselves be audited. Verify the M12 emitter wraps any settings endpoint that touches audit retention.
Don't introduce new attack surface in the name of "hardening": e.g., don't add a "lockout-clear" endpoint reachable from the API. Reset is via container restart; that's safer.

Out of scope (DO NOT)

WAF rules, IPS integration, fail2ban for the admin UI itself. Out of scope.
2FA on local admin. Use OIDC for that.
mTLS between containers. The Docker network isolation is the trust boundary; documenting that is enough.
Penetration test report. The agent is not a pentester.
Encryption at rest of the SQLite file. The volume's host-level disk encryption is the right layer.
Audit log signing / tamper-evidence. Future work.

Acceptance

cd api && composer cs && composer stan && composer test && cd ..
cd ui  && composer cs && composer stan && composer test && cd ..

# composer + npm audit
cd api && composer audit && cd ..
cd ui  && npm ci && npm audit --omit=dev && cd ..

docker compose down -v
cp .env.example .env
docker compose up -d
sleep 15

# Security headers present on UI
HEADERS=$(curl -sI http://localhost:8080/login)
echo "$HEADERS" | grep -qi "X-Content-Type-Options: nosniff"
echo "$HEADERS" | grep -qi "X-Frame-Options: DENY"
echo "$HEADERS" | grep -qi "Content-Security-Policy:"
echo "$HEADERS" | grep -qi "Referrer-Policy:"

# Headers on API
HEADERS=$(curl -sI http://localhost:8081/healthz)
echo "$HEADERS" | grep -qi "X-Content-Type-Options: nosniff"
echo "$HEADERS" | grep -qi "X-Frame-Options:"

# In production mode, HSTS appears (skip if not testing prod). APP_ENV=production
# must be set for the containers (e.g. in .env), not for curl; verify manually:
# HEADERS=$(curl -sI ...)

# Local admin lockout: 5 fails should trigger lockout
COOKIE=$(mktemp)
for i in 1 2 3 4 5; do
  CSRF=$(curl -s -c $COOKIE http://localhost:8080/login | grep -oE 'name="csrf_token" value="[^"]+"' | cut -d'"' -f4)
  curl -s -b $COOKIE -c $COOKIE -X POST \
    -d "csrf_token=$CSRF&username=admin&password=WRONG" \
    http://localhost:8080/login/local > /dev/null
done
CSRF=$(curl -s -c $COOKIE http://localhost:8080/login | grep -oE 'name="csrf_token" value="[^"]+"' | cut -d'"' -f4)
RESP=$(curl -s -b $COOKIE -c $COOKIE -X POST \
  -d "csrf_token=$CSRF&username=admin&password=test1234" \
  http://localhost:8080/login/local -L)
echo "$RESP" | grep -qi "locked\|too many\|wait"

# Bearer tokens never appear unmasked in logs
docker compose logs 2>&1 | grep -E "Bearer irdb_(rep|con|adm|svc)_[A-Z2-7]+" && \
  { echo "TOKEN LEAKED IN LOGS"; exit 1; } || true

# Token entropy test passes
cd api && vendor/bin/phpunit --filter TokenEntropyTest && cd ..

# Expired manual block test (insert one with a past expires_at, run cleanup, verify it's gone or filtered)
ADMIN_TOKEN=$(docker compose exec -T api php bin/console auth:create-token --kind=admin --role=admin --quiet)
INTERNAL_TOKEN=$(grep ^INTERNAL_JOB_TOKEN= .env | cut -d= -f2)
curl -s -X POST -H "Authorization: Bearer $ADMIN_TOKEN" -H "Content-Type: application/json" \
  -d '{"kind":"ip","ip":"203.0.113.250","reason":"expired test","expires_at":"2020-01-01T00:00:00Z"}' \
  http://localhost:8081/api/v1/admin/manual-blocks > /dev/null
# Run cleanup if you added a job; otherwise just verify the read-time filter.
# Fail if the expired block is still listed.
if curl -s -H "Authorization: Bearer $ADMIN_TOKEN" \
  http://localhost:8081/api/v1/admin/manual-blocks | grep -qF "203.0.113.250"; then
  echo "EXPIRED BLOCK STILL LISTED"; exit 1
fi

# Quick CSP smoke test: load the UI in headless chrome (manual or via puppeteer in CI), no CSP violations
# (omit if no headless browser available; rely on developer manual verification)

docker compose down -v

Handoff

Commit:

feat(M14): security hardening

- CSP, HSTS (prod), X-Content-Type-Options, X-Frame-Options, Referrer-Policy
- local admin brute-force lockout (1/5/30 progression, by user+ip)
- log scrubbing of Bearer tokens and known secrets via Monolog processor
- token entropy regression test
- expired manual block read-time filter + daily cleanup job
- composer audit + npm audit in CI
- doc/security.md describing posture; backup/restore in README and architecture.md

Append to PROGRESS.md:

## M14 — Hardening (done)

**Built:** security headers, lockout, log scrubbing, audits, doc/security.md.

**Production checklist (run before exposing to internet):**
- APP_ENV=production
- Real OIDC tenant configured
- Strong LOCAL_ADMIN_PASSWORD_HASH or LOCAL_ADMIN_ENABLED=false
- Reverse proxy with TLS in front
- Backups configured
- composer audit / npm audit clean
- Logs piped to your aggregator
- MAXMIND_LICENSE_KEY set so refresh-geoip works
- Scheduler running (host cron / systemd / sidecar)

**Known limitations:**
- In-process rate limiter and lockout state are per-replica.
- Audit log is append-only but not tamper-evident; sign+chain is future work.
- No 2FA on local admin (use OIDC instead).

**Build complete.** All 14 milestones executed.

Stop. Final milestone reached.
