M02-database-migrations.md 9.3 KB

M02 — Database & Migrations (api)

Fresh Claude Code agent prompt. M01 must be complete and committed. Estimated effort: medium.

Mission

Define every database table from SPEC.md §4 as Phinx migrations, write seeds for default categories and policies, build a robust IP/CIDR normalization helper with thorough tests, and verify migrations run cleanly on both SQLite and MySQL.

Before you start

  1. Verify M01 acceptance:

    git log --oneline -1            # last commit should reference M01
    docker compose build            # must succeed
    cat PROGRESS.md                 # M01 entry present
    
  2. Read SPEC.md §2 (tech stack), §4 (Data Model — every table), §5 (Reputation Engine, for context on what ip_scores will store), §14 (Coding Conventions).

  3. Confirm working tree is clean: git status.

Tasks

1. DBAL setup

In api/src/Infrastructure/Db/:

  • ConnectionFactory.php — builds a Doctrine\DBAL\Connection from settings. Selects driver from DB_DRIVER env. For SQLite, executes the four PRAGMA statements from SPEC §10 on each connection.
  • RepositoryBase.php — abstract base with Connection injection and helpers for inserting + fetching binary IP columns.
  • Wire the connection into the DI container.

2. Migrations

One migration file per table. File names: YYYYMMDDHHMMSS_create_<table>.php. Order matters because of foreign keys.

Required tables (all from SPEC §4 — read it carefully, do not omit columns):

  1. users
  2. oidc_role_mappings
  3. reporters
  4. consumers
  5. policies
  6. policy_category_thresholds — composite PK (policy_id, category_id)
  7. categories (seed comes later)
  8. api_tokens — note the constraint that exactly one of reporter_id/consumer_id is set, matching kind. On SQLite use a CHECK constraint; on MySQL the same. Phinx supports both.
  9. reports — index (ip_bin, category_id, received_at DESC).
  10. ip_scores — composite PK (ip_bin, category_id).
  11. ip_enrichment — PK ip_bin.
  12. manual_blocks
  13. allowlist
  14. audit_log
  15. job_locks — PK job_name.
  16. job_runs — index (job_name, started_at DESC).

Cross-cutting requirements:

  • Timestamps: SQLite uses TEXT (ISO 8601 strings, default CURRENT_TIMESTAMP); MySQL uses DATETIME(6). Phinx supports both via 'timestamp' / 'datetime' types — use the right one per adapter, or use a custom helper that picks based on the adapter.
  • IP columns: ip_bin = BINARY(16) on MySQL, BLOB on SQLite. ip_text = VARCHAR(45).
  • Subnets: network_bin = BINARY(16)/BLOB; prefix_length = SMALLINT.
  • Foreign keys: declare them; ON DELETE semantics per common sense (api_tokens.reporter_id → cascade on reporter delete; reports.reporter_id → SET NULL or RESTRICT — choose RESTRICT to preserve audit trail).
  • Indexes: at minimum, every FK column is indexed. Add (ip_bin) indexes on reports, manual_blocks, allowlist.

3. Seeds

In api/db/seeds/:

  • DefaultCategoriesSeeder.php — five categories: brute_force, spam, scanner, malware_c2, web_attack. Sensible names and descriptions. Default decay: exponential, half-life 14 days.
  • DefaultPoliciesSeeder.php — three policies: strict, moderate, paranoid. Each with thresholds across all five categories. Pick numbers that produce visibly different blocklists (e.g. paranoid thresholds at 0.3, moderate at 1.0, strict at 2.5).

Seeders are idempotent: check if a row exists by slug/name before inserting.

4. IP normalization helper

In api/src/Domain/Ip/:

  • IpAddress.php — value object. Static fromString(string): self, throws InvalidIpException on garbage. Stores: canonical text form (lowercase, no leading zeros for v6), 16-byte binary (v4 mapped into ::ffff:0:0/96), and a flag indicating whether the input was originally v4. Provides binary(): string, text(): string, isIpv4(): bool.
  • Cidr.php — value object. Static fromString(string): self. Stores network as 16-byte binary, prefix length 0–128. For v4 CIDRs the prefix is internally stored as 96 + originalPrefix so containment math is uniform across families. Provides contains(IpAddress): bool, network(): string, prefixLength(): int, text(): string.
  • InvalidIpException.php, InvalidCidrException.php.

Tests in api/tests/Unit/Ip/:

  • IpAddressTest.php — at least 30 cases: dotted-quad v4, full v6, zero-compressed v6, ::ffff:1.2.3.4, leading zeros (rejected per RFC), garbage strings, empty string, integers, addresses with whitespace.
  • CidrTest.php — containment edge cases: v4 in v4, v4 in v4-mapped-v6 CIDR, v6 in v6, v6 not in v4, prefix 0 (matches all of family), prefix 32/128 (matches single).

Coverage target ≥95% on src/Domain/Ip/.

5. CLI commands

Extend api/bin/console:

  • db:migrate (already exists) — runs Phinx migrations.
  • db:seed — runs all seeders idempotently.
  • db:rollback — Phinx rollback (for dev use).

Implementation notes

  • DBAL vs Phinx: Phinx owns schema; DBAL owns runtime queries. Don't query the DB inside migrations using DBAL — use Phinx's adapter API.
  • Binary columns on SQLite: PDO returns BLOB columns as PHP strings (octet sequences). Be explicit: when fetching ip_bin, treat the value as raw bytes; when binding, use \PDO::PARAM_LOB or pass-through string. Hide this in RepositoryBase.
  • MySQL STRICT_TRANS_TABLES: assume strict mode is on. Don't rely on lax type coercion.
  • Migration testability: write a tests/Integration/MigrationsTest.php that runs migrations against an in-memory SQLite, then introspects the schema (table names, key columns) to assert structure. Also run in CI against MySQL via a service container.
  • Seeders in tests: integration tests should call seeders to set up baseline state.
  • Don't seed users or tokens. That's M03's job — auth doesn't exist yet.

Out of scope (DO NOT)

  • No auth logic, no token model behavior, no RBAC. Migrations create the tables; population and behavior come in M03/M04.
  • No HTTP routes beyond what M01 already created.
  • No reputation calculation logic; ip_scores table exists but no service writes to it yet.
  • No GeoIP integration; ip_enrichment table exists but no enrichment service.
  • No ui changes whatsoever. UI is untouched in this milestone.
  • Do not add helpers, services, or domain classes outside src/Domain/Ip/ and src/Infrastructure/Db/.

Acceptance

# Lint and unit tests
cd api && composer cs && composer stan && composer test && cd ..

# Migrations against SQLite (in-memory)
cd api && DB_DRIVER=sqlite DB_SQLITE_PATH=:memory: ./bin/console db:migrate && cd ..

# Migrations against ephemeral MySQL via docker
docker run -d --rm --name irdb-mysql-test \
  -e MYSQL_ROOT_PASSWORD=root \
  -e MYSQL_DATABASE=irdb \
  -p 33306:3306 \
  mysql:8 --default-authentication-plugin=mysql_native_password
# wait for mysql ready
for i in {1..60}; do docker exec irdb-mysql-test mysqladmin -uroot -proot ping >/dev/null 2>&1 && break; sleep 1; done
cd api && DB_DRIVER=mysql DB_MYSQL_HOST=127.0.0.1 DB_MYSQL_PORT=33306 \
  DB_MYSQL_DATABASE=irdb DB_MYSQL_USERNAME=root DB_MYSQL_PASSWORD=root \
  ./bin/console db:migrate && \
  ./bin/console db:seed && cd ..
docker stop irdb-mysql-test

# Verify seed counts (rerun against a fresh sqlite file to avoid the in-memory db disappearing)
rm -f /tmp/irdb-test.sqlite
cd api && DB_DRIVER=sqlite DB_SQLITE_PATH=/tmp/irdb-test.sqlite ./bin/console db:migrate && \
  DB_DRIVER=sqlite DB_SQLITE_PATH=/tmp/irdb-test.sqlite ./bin/console db:seed && cd ..
sqlite3 /tmp/irdb-test.sqlite "SELECT COUNT(*) FROM categories;" | grep -q '^5$'
sqlite3 /tmp/irdb-test.sqlite "SELECT COUNT(*) FROM policies;"   | grep -q '^3$'
sqlite3 /tmp/irdb-test.sqlite "SELECT COUNT(*) FROM policy_category_thresholds;" | grep -q '^15$'

# Idempotent re-seed produces no errors and same counts
cd api && DB_DRIVER=sqlite DB_SQLITE_PATH=/tmp/irdb-test.sqlite ./bin/console db:seed && cd ..
sqlite3 /tmp/irdb-test.sqlite "SELECT COUNT(*) FROM categories;" | grep -q '^5$'

# Coverage check on Ip domain
cd api && vendor/bin/phpunit --coverage-text --filter Ip tests/Unit/Ip 2>&1 | grep -E "Domain.Ip" | awk '{print $NF}' | grep -E "9[5-9]\.|100\.00"

Handoff

  1. Commit:

    feat(M02): database schema, migrations, seeds, IP/CIDR helpers
    
    - phinx migrations for all SPEC §4 tables (sqlite + mysql)
    - default seeds: 5 categories, 3 policies (strict/moderate/paranoid)
    - IpAddress and Cidr value objects with ≥95% coverage
    - DBAL connection factory with SQLite WAL pragmas
    
  2. Append to PROGRESS.md:

    ## M02 — Database & migrations (done)
    
    **Built:** all SPEC §4 tables; idempotent seeds; IP/CIDR value objects.
    
    **Schema notes for next milestone:**
    - `users.password_hash` is NOT in the schema (per SPEC §4; UI owns local-admin credentials).
    - `api_tokens.kind` enum values: `reporter`, `consumer`, `admin`, `service` (constraint enforced).
    - All timestamps stored UTC. ISO 8601 strings on SQLite, DATETIME(6) on MySQL.
    - `ip_bin` always 16 bytes; v4 mapped to `::ffff:0:0/96`. Use `IpAddress::fromString()` for normalization.
    
    **Decisions made:**
    - [Document any FK ON DELETE choices the agent had to make.]
    
    **Deviations from SPEC:** none.
    **Added dependencies:** none beyond SPEC §2.
    
  3. Stop. Do not start M03.