M04 — Token Management & Ingest API

Fresh Claude Code agent prompt. M03 must be complete and committed. Estimated effort: medium.

Mission

Implement reporter and consumer CRUD plus token issuance via admin endpoints, the public POST /api/v1/report endpoint with synchronous ip_scores updates, and a per-token rate limiter. After this milestone, machine clients can report IPs and rate limits actually bite.

Before you start

  1. Verify M03:

    git log --oneline -3
    cd api && composer test && composer stan && cd ..
    
  2. Read SPEC.md §4 (reporters, consumers, api_tokens, reports, ip_scores tables), §5 (Reputation Engine — scoring formula; you'll write the synchronous-update piece, but the bulk recompute is M05), §6 (API Contracts — Public API and the relevant Admin endpoints).

  3. Confirm clean tree.

Tasks

1. Reporter & Consumer admin CRUD

In api/src/Application/Admin/:

  • ReportersController.php:
    • GET /api/v1/admin/reporters — list, paginated.
    • GET /api/v1/admin/reporters/{id} — detail.
    • POST /api/v1/admin/reporters with body {name, description, trust_weight}. Returns the created record.
    • PATCH /api/v1/admin/reporters/{id} — partial update.
    • DELETE /api/v1/admin/reporters/{id} — soft delete (set is_active=false). Hard delete refused if reports exist (409).
  • ConsumersController.php — analogous, with policy_id instead of trust_weight. (Policy CRUD is M07; for now the FK is required and the UI will pass an existing policy id; in tests, you may seed a policy directly.)

RBAC: all reporter/consumer endpoints require Admin role.

2. Token issuance & management

In api/src/Application/Admin/:

  • TokensController.php:
    • GET /api/v1/admin/tokens — list. Never include service-kind tokens. Return prefix and metadata; never the raw token (it's not stored).
    • POST /api/v1/admin/tokens — body {kind: "reporter"|"consumer"|"admin", reporter_id?, consumer_id?, role?, expires_at?}. Validate constraints:
      • kind=reporter → reporter_id required, no role, no consumer_id.
      • kind=consumer → consumer_id required, no role, no reporter_id.
      • kind=admin → role required, no FKs.
      • kind=service → 400 always (service tokens cannot be created via API).
      • Returns {id, kind, prefix, raw_token, ...}. raw_token appears only in this response; document this in OpenAPI later.
    • DELETE /api/v1/admin/tokens/{id} — sets revoked_at = now(). Refuse on service tokens.

RBAC: Admin role. Audit emission deferred to M12.
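The per-kind constraints on POST /api/v1/admin/tokens can be checked in one pure function before anything touches the database. The following is a minimal sketch, not the required implementation: the function name validateTokenRequest and the field => reason return shape are assumptions made for illustration, and it assumes PHP 8.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns a field => reason map; an empty map means valid.
 *
 * @param array<string, mixed> $body decoded JSON request body
 * @return array<string, string>
 */
function validateTokenRequest(array $body): array
{
    // The one field each kind requires; the other two fields are forbidden.
    $required = ['reporter' => 'reporter_id', 'consumer' => 'consumer_id', 'admin' => 'role'];
    $exclusive = ['reporter_id', 'consumer_id', 'role'];

    $kind = $body['kind'] ?? null;
    if ($kind === 'service') {
        return ['kind' => 'service tokens cannot be created via the API'];
    }
    if (!is_string($kind) || !isset($required[$kind])) {
        return ['kind' => 'must be one of reporter, consumer, admin'];
    }

    $errors = [];
    foreach ($exclusive as $field) {
        $present = isset($body[$field]);
        if ($field === $required[$kind] && !$present) {
            $errors[$field] = "required for kind=$kind";
        } elseif ($field !== $required[$kind] && $present) {
            $errors[$field] = "not allowed for kind=$kind";
        }
    }
    return $errors;
}
```

A non-empty result maps directly onto a 400 response; expires_at is left out here because the task list states no constraint for it.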

3. Public ingest: POST /api/v1/report

In api/src/Application/Public/ReportController.php:

  • Auth: TokenKind::Reporter only. Reject all other kinds with 401 (wrong kind = generic unauthorized per M03 convention).
  • Body validation: ip (parse via IpAddress::fromString, 400 on failure), category (slug; lookup by categories.slug, 400 if unknown or is_active=false), metadata (optional, must be a JSON object ≤4 KB after re-encoding).
  • Insert a row into reports:
    • weight_at_report = current trust_weight of the reporter (snapshot).
    • received_at = current UTC time via injected Clock.
  • Update ip_scores for the affected (ip_bin, category_id) pair synchronously:
    • Compute the new score by re-running the formula (Σ weight × decay over reports for this ip+category, hard cutoff 365 days). The bulk recompute service lands in M05 — but for the synchronous-on-ingest path you need a small helper now. Place it at api/src/Domain/Reputation/PairScorer.php so M05 can build on it.
    • UPSERT the score row.
  • Return 202 with {report_id, ip, received_at}.
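The metadata rule (optional, a JSON object, at most 4 KB after re-encoding) is easy to get subtly wrong in PHP, because decoding with the associative flag turns both {...} and [...] into arrays. A rough sketch of one way to check it, assuming PHP 8, reading "4 KB" as 4096 bytes, and using a hypothetical helper name normalizeMetadata:

```php
<?php
declare(strict_types=1);

const MAX_METADATA_BYTES = 4096; // assumption: "4 KB" means 4096 bytes

/**
 * Hypothetical helper: returns the JSON string to store, or null if metadata was omitted.
 *
 * @throws InvalidArgumentException when metadata is not a JSON object or is too large
 */
function normalizeMetadata(mixed $metadata): ?string
{
    if ($metadata === null) {
        return null; // the field is optional
    }
    // Decoding without the assoc flag yields stdClass for {...} and array for [...],
    // which is what lets us tell a JSON object from a JSON list.
    if (!$metadata instanceof stdClass) {
        throw new InvalidArgumentException('metadata must be a JSON object');
    }
    $encoded = json_encode($metadata, JSON_THROW_ON_ERROR | JSON_UNESCAPED_SLASHES);
    if (strlen($encoded) > MAX_METADATA_BYTES) { // strlen counts bytes, not characters
        throw new InvalidArgumentException('metadata exceeds 4 KB after re-encoding');
    }
    return $encoded;
}

// Usage: decode the request body as objects, then normalize the optional field.
$body = json_decode(
    '{"ip":"203.0.113.42","category":"brute_force","metadata":{"url":"/wp-login"}}',
    false,
    512,
    JSON_THROW_ON_ERROR
);
echo normalizeMetadata($body->metadata ?? null), "\n";
```

Measuring the re-encoded string rather than the raw request bytes means insignificant whitespace in the client's JSON does not count against the limit.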

4. Rate limiter

In api/src/Infrastructure/Http/Middleware/RateLimitMiddleware.php:

  • Token-bucket per token id. In-process state (PHP array attached to a singleton service); good enough for single-replica deployments and dev.
  • Bucket: capacity = API_RATE_LIMIT_PER_SECOND × 2, refill rate = API_RATE_LIMIT_PER_SECOND per second. Configurable.
  • On exhaustion: return 429 with Retry-After: 1 (seconds, integer).
  • Apply to public endpoints (/api/v1/report, future /api/v1/blocklist). Skip for admin endpoints (admins are humans/UI; not a DDoS vector).
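  • Tests (limit=60, so capacity is 120): 60 requests in <1s → all 200/202. A burst of 120 still fits in a full bucket, so to see 429s either freeze the Clock and send 121 requests (the 121st is rejected), or send more than 180 in <1s (capacity plus at most one second of refill).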

Note for self: in-process means each replica has its own bucket. Document this in PROGRESS.md as a known limitation; multi-replica rate limiting needs a shared store and is out of scope.
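The bucket arithmetic described above (capacity of twice the rate, continuous refill at the configured rate) can be sketched as a small class. This is an illustration under assumptions, not the required design: the class name TokenBucketLimiter and the choice to pass the current time in as a float are hypothetical, standing in for whatever the injected Clock provides. It assumes PHP 8.

```php
<?php
declare(strict_types=1);

/** Hypothetical in-process token bucket keyed by token id; time is passed in so tests can fast-forward. */
final class TokenBucketLimiter
{
    /** @var array<int, array{tokens: float, at: float}> */
    private array $buckets = [];

    public function __construct(private float $ratePerSecond)
    {
    }

    /** Returns true if the request may proceed, false if the caller should answer 429. */
    public function allow(int $tokenId, float $nowSeconds): bool
    {
        $capacity = $this->ratePerSecond * 2;
        // A token id seen for the first time starts with a full bucket.
        $bucket = $this->buckets[$tokenId] ?? ['tokens' => $capacity, 'at' => $nowSeconds];

        // Refill in proportion to elapsed time, capped at capacity.
        $elapsed = max(0.0, $nowSeconds - $bucket['at']);
        $tokens = min($capacity, $bucket['tokens'] + $elapsed * $this->ratePerSecond);

        $allowed = $tokens >= 1.0;
        if ($allowed) {
            $tokens -= 1.0; // each admitted request spends one token
        }
        $this->buckets[$tokenId] = ['tokens' => $tokens, 'at' => $nowSeconds];
        return $allowed;
    }
}
```

With a rate of 2, a fresh bucket admits four requests at the same instant and rejects the fifth; half a second later exactly one more token has been refilled. Because the array lives in the PHP process, this only limits anything if the process survives between requests, which is the same single-replica caveat noted above.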

5. Validation framework

You'll need consistent request validation across this and future milestones. Two acceptable approaches:

  • Hand-rolled in each controller (acceptable for this scale).
  • Lightweight library like respect/validation (allowed; document in PROGRESS.md if added).

Either way, validation errors must produce a uniform response:

{"error":"validation_failed","details":{"field":"reason"}}

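HTTP status 400 for malformed input. 422 is also acceptable if used consistently, but the ingest task and the Acceptance script expect 400 for a bad IP or an unknown category, so those checks would have to change with it.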
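Whichever validation approach is chosen, funnelling every failure through one builder keeps the response shape uniform. A framework-free sketch, assuming a hypothetical helper name validationError that returns a status/body pair for the controller to wrap in its own response object:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: builds the uniform validation error as a status/body pair.
 *
 * @param array<string, string> $details field => reason
 * @return array{status: int, body: string}
 */
function validationError(array $details, int $status = 400): array
{
    return [
        'status' => $status,
        // Cast to object so an empty map encodes as {} rather than [].
        'body' => json_encode(
            ['error' => 'validation_failed', 'details' => (object) $details],
            JSON_THROW_ON_ERROR
        ),
    ];
}

$response = validationError(['ip' => 'not a valid IPv4 or IPv6 address']);
echo $response['status'], ' ', $response['body'], "\n";
```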

Implementation notes

  • PairScorer signature: score(string $ipBin, int $categoryId, DateTimeImmutable $now): float. Reads from reports, applies category-specific decay (linear or exponential per categories.decay_function and decay_param). Hard cutoff at SCORE_REPORT_HARD_CUTOFF_DAYS (default 365). Returns the float score.
  • Decay functions: defined in SPEC §5. Linear: max(0, 1 - age_days/decay_param). Exponential: 0.5 ^ (age_days/decay_param). Implement them in api/src/Domain/Reputation/Decay.php as pure functions with unit tests.
  • ip_scores upsert: use the DBAL adapter's UPSERT-equivalent. SQLite: INSERT ... ON CONFLICT(ip_bin, category_id) DO UPDATE. MySQL: INSERT ... ON DUPLICATE KEY UPDATE. Wrap in RepositoryBase.
  • Report metadata size: enforce ≤4 KB after json_encode of the parsed object. Reject larger with 400.
  • IPv6 metadata: just store the metadata as JSON — no special handling.
  • Rate limit in tests: inject the limiter so tests can either bypass it or fast-forward time via the Clock. Use a ClockInterface (you already created it for received_at).
  • Reports are append-only: never UPDATE or DELETE rows in reports. The ingest endpoint just inserts.
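The two decay formulas and the Σ weight × decay sum translate almost literally into code. The sketch below is in-memory only and rests on assumptions: the function names, the array shape of a report row, and treating a report aged exactly at the cutoff as still counted are all hypothetical, and the real PairScorer reads from the reports table instead. It also assumes decay_param is greater than zero.

```php
<?php
declare(strict_types=1);

/** Linear decay: max(0, 1 - age_days / decay_param). */
function linearDecay(float $ageDays, float $decayParam): float
{
    return max(0.0, 1.0 - $ageDays / $decayParam);
}

/** Exponential decay: 0.5 ^ (age_days / decay_param), so decay_param acts as a half-life in days. */
function exponentialDecay(float $ageDays, float $decayParam): float
{
    return 0.5 ** ($ageDays / $decayParam);
}

/**
 * Hypothetical in-memory version of the sum over one (ip, category) pair.
 *
 * @param list<array{weight: float, age_days: float}> $reports
 * @param callable(float, float): float $decay one of the functions above
 */
function pairScore(array $reports, callable $decay, float $decayParam, float $cutoffDays = 365.0): float
{
    $score = 0.0;
    foreach ($reports as $report) {
        if ($report['age_days'] > $cutoffDays) {
            continue; // hard cutoff: older reports contribute nothing
        }
        $score += $report['weight'] * $decay($report['age_days'], $decayParam);
    }
    return $score;
}
```

As a worked case with exponential decay and decay_param = 30: a fresh report of weight 1.0 contributes 1.0, a 30-day-old report of weight 2.0 contributes 2.0 × 0.5 = 1.0, and a 400-day-old report is dropped by the cutoff, giving a score of 2.0.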

Out of scope (DO NOT)

  • Bulk recompute / decay of all stored scores. M05.
  • Internal job endpoints. M05.
  • Allowlist / manual blocks. M06.
  • Distribution endpoint (/api/v1/blocklist). M07.
  • Audit log emission. M12.
  • Any UI changes.
  • GeoIP / enrichment. M11.
  • New dependencies beyond what's already in api/composer.json (one optional: respect/validation if you go that route — record in PROGRESS.md).

Acceptance

cd api && composer cs && composer stan && composer test && cd ..

docker compose down -v
cp .env.example .env  # fill secrets if not already done
docker compose up -d
sleep 15

# Create a reporter and a token
ADMIN_TOKEN=$(docker compose exec -T api php bin/console auth:create-token --kind=admin --role=admin --quiet)

REPORTER=$(curl -s -X POST -H "Authorization: Bearer $ADMIN_TOKEN" -H "Content-Type: application/json" \
  -d '{"name":"web-prod-01","description":"prod webserver","trust_weight":1.0}' \
  http://localhost:8081/api/v1/admin/reporters)
REPORTER_ID=$(echo "$REPORTER" | php -r 'echo json_decode(stream_get_contents(STDIN), true)["id"];')

TOKEN_RESP=$(curl -s -X POST -H "Authorization: Bearer $ADMIN_TOKEN" -H "Content-Type: application/json" \
  -d "{\"kind\":\"reporter\",\"reporter_id\":$REPORTER_ID}" \
  http://localhost:8081/api/v1/admin/tokens)
RAW_TOKEN=$(echo "$TOKEN_RESP" | php -r 'echo json_decode(stream_get_contents(STDIN), true)["raw_token"];')
[ -n "$RAW_TOKEN" ]

# Submit a report
RESP=$(curl -s -X POST -H "Authorization: Bearer $RAW_TOKEN" -H "Content-Type: application/json" \
  -d '{"ip":"203.0.113.42","category":"brute_force","metadata":{"url":"/wp-login"}}' \
  http://localhost:8081/api/v1/report)
echo "$RESP" | grep -q '"report_id"'
echo "$RESP" | grep -q '"received_at"'

# ip_scores updated synchronously
docker compose exec -T api sqlite3 /data/irdb.sqlite \
  "SELECT ROUND(score, 4) FROM ip_scores WHERE ip_text='203.0.113.42';" | grep -q '^[0-9]'

# Wrong-kind token rejected (admin token can't report)
test "$(curl -s -o /dev/null -w '%{http_code}' \
  -X POST -H "Authorization: Bearer $ADMIN_TOKEN" -H "Content-Type: application/json" \
  -d '{"ip":"1.2.3.4","category":"spam"}' \
  http://localhost:8081/api/v1/report)" = "401"

# Bad IP rejected
test "$(curl -s -o /dev/null -w '%{http_code}' \
  -X POST -H "Authorization: Bearer $RAW_TOKEN" -H "Content-Type: application/json" \
  -d '{"ip":"not-an-ip","category":"spam"}' \
  http://localhost:8081/api/v1/report)" = "400"

# Unknown category rejected
test "$(curl -s -o /dev/null -w '%{http_code}' \
  -X POST -H "Authorization: Bearer $RAW_TOKEN" -H "Content-Type: application/json" \
  -d '{"ip":"1.2.3.4","category":"nonexistent"}' \
  http://localhost:8081/api/v1/report)" = "400"

# Rate limit kicks in (with low API_RATE_LIMIT_PER_SECOND)
docker compose down
echo "API_RATE_LIMIT_PER_SECOND=2" >> .env
docker compose up -d
sleep 10
HITS_429=0
for i in $(seq 1 20); do
  CODE=$(curl -s -o /dev/null -w '%{http_code}' -X POST \
    -H "Authorization: Bearer $RAW_TOKEN" -H "Content-Type: application/json" \
    -d '{"ip":"1.2.3.4","category":"spam"}' \
    http://localhost:8081/api/v1/report)
  [ "$CODE" = "429" ] && HITS_429=$((HITS_429+1))
done
[ "$HITS_429" -gt 0 ]

docker compose down -v

Handoff

  1. Commit:

    feat(M04): reporter/consumer CRUD, token issuance, ingest API, rate limiter
    
    - admin endpoints for reporters, consumers, tokens (raw token shown once)
    - POST /api/v1/report with synchronous ip_scores update via PairScorer
    - decay functions (linear + exponential) with unit tests
    - per-token in-process rate limiter on public endpoints
    
  2. Append to PROGRESS.md:

    ## M04 — Token system & ingest (done)
    
    **Built:** reporter/consumer/token CRUD; POST /api/v1/report end-to-end; rate limiter; decay functions.
    
    **Notes for next milestone:**
    - Synchronous score updates are correct but only touch the (ip, category) pair just reported. Bulk decay re-application is M05's recompute job.
    - PairScorer is the authoritative single-pair scorer; the bulk recompute job in M05 should call into it (or a near-clone) so behavior stays consistent.
    - Rate limiter is in-process; document this in README. Multi-replica deployments need a shared store.
    - Service tokens cannot be created via the admin API; only the bootstrap path makes them.
    
    **Deviations from SPEC:** none.
    **Added dependencies:** [list any, e.g. respect/validation, or "none"].
    
  3. Stop. Do not start M05.