IRDB Admin Manual

Audience: operators running the IRDB Compose stack on a host they own. Covers the day-to-day deployment lifecycle — updating, rebuilding, rolling back, troubleshooting, and disk hygiene. For the in-app admin workflows (creating reporters, tuning policies, reading the audit log) see user-manual.md. For architecture see architecture.md.

This document picks up where the README.md Quickstart leaves off: the stack is running, you have admin access, now you need to keep it running across upgrades, recover from breakage, and reason about the moving parts.


1. Update workflow

1.1 The standard loop

git pull
docker compose -f docker-compose.yml -f compose.scheduler.yml up --build -d
docker compose -f docker-compose.yml -f compose.scheduler.yml logs -f

That's the whole thing for the happy path. Ctrl-C out of the logs once you see migrate exit 0, api reach healthy, and ui reach healthy — -d keeps everything running.

What each piece does:

| Step | What it actually does |
|---|---|
| git pull | Updates the source tree on the host: Dockerfiles, compose file, app code, migrations, scheduler crontab. |
| --build | Rebuilds local images. Docker layer cache reuses everything that didn't change. App-code-only changes rebuild in seconds; composer.lock changes re-run composer install. |
| up -d | Compares each service's desired image+config against what's running and recreates only the containers whose image hash or config changed. |
| docker compose logs -f | Follows logs from all services. Useful while waiting for migrations to apply. |

The dependency chain in docker-compose.yml orders things correctly without any manual sequencing:

  1. migrate runs first (Phinx applies any new migrations, exits 0).
  2. api waits for migrate to complete successfully, then starts.
  3. ui waits for api's /healthz to pass, then starts.
  4. scheduler (overlay) waits for api to be healthy.

1.2 What you don't need to do

These are common confusions — none of them apply to a normal update:

  • docker compose pull — that's for fetching images from a registry. This stack builds locally from ./api and ./ui, so a pull does nothing useful.
  • docker compose restart — kicks the existing container without changing the image. Does not pick up new code. Don't use this for updates.
  • docker compose down then up — works, but unnecessary churn. up --build -d does selective recreation in one step.
  • docker compose down -v destroys the irdb-data volume and your SQLite database with it. Don't run this casually.
  • sudo docker ... — on Linux, your user should be in the docker group; on macOS / Docker Desktop, sudo isn't needed at all.

1.3 Verifying the new code is live

docker compose ps -a                # all should show "Up (healthy)" except migrate ("Exited (0)"); -a includes exited containers
docker compose images               # CREATED column should show your fresh build timestamp
curl -s http://localhost:8081/healthz | jq    # api responds, db reachable, jobs fresh
curl -s http://localhost:8080/healthz         # ui can reach api

If you want to spot-check a specific code change:

docker compose exec api cat /app/src/some/changed/File.php | head -20

(You can't git rev-parse HEAD inside the container — .dockerignore excludes the .git directory from the build context.)
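The health checks above can be wrapped in a small polling helper so a deploy script waits for both services instead of checking once. This is a minimal sketch, assuming curl is installed on the host and the ports used in this manual; wait_healthy is a hypothetical helper name, not something shipped with IRDB.

```shell
#!/bin/sh
# wait_healthy URL [ATTEMPTS] [DELAY]: poll URL until it returns an HTTP
# success status or the retry budget runs out. Hypothetical helper name.
wait_healthy() {
    url=$1
    attempts=${2:-30}   # number of tries before giving up
    delay=${3:-2}       # seconds to sleep between tries
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -f makes curl exit non-zero on HTTP 4xx/5xx responses
        if curl -fsS -o /dev/null "$url"; then
            echo "healthy: $url"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "timed out: $url" >&2
    return 1
}
```

Typical use after a deploy would be wait_healthy http://localhost:8081/healthz && wait_healthy http://localhost:8080/healthz, which returns non-zero if either endpoint never comes up.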

1.4 Rebuild scope: the whole stack vs one service

For most updates, rebuild the whole stack — it's only marginally slower because of layer caching, and it avoids "half-updated" states.

If you really only want one service, e.g., the UI:

docker compose up --build -d ui

But notice: if your change touched api (which ui depends on), ui won't see the new api until api is also recreated. Whole-stack rebuild dodges this.

1.5 Force rebuild (ignore cache)

Rare — usually only when debugging a build that produced obviously stale output despite a code change:

docker compose build --no-cache
docker compose up -d

This re-runs every Dockerfile step from scratch. On the api image that means re-fetching Alpine packages and re-running composer install; expect a few minutes.

1.6 Pulling with the scheduler overlay

If you deploy with the scheduler sidecar (the compose.scheduler.yml overlay), include it in every compose command. Otherwise the scheduler container ends up orphaned:

docker compose -f docker-compose.yml -f compose.scheduler.yml up --build -d
docker compose -f docker-compose.yml -f compose.scheduler.yml logs -f scheduler
docker compose -f docker-compose.yml -f compose.scheduler.yml down

Set COMPOSE_FILE=docker-compose.yml:compose.scheduler.yml in your shell to avoid retyping the -f flags:

export COMPOSE_FILE=docker-compose.yml:compose.scheduler.yml
docker compose up --build -d

2. Image and container lifecycle

2.1 Inspecting state

docker compose ps              # services in this project
docker compose images          # which image is each service running
docker compose top             # processes inside each container
docker compose port api 8081   # host port mapping
docker compose config          # effective merged compose file

2.2 Rebuilding a single service

docker compose build api       # rebuild api image without recreating containers
docker compose up -d api       # then recreate the api container

Equivalent shortcut:

docker compose up --build -d api

2.3 Recreating without rebuilding (rare)

When config changed but you don't want a rebuild — e.g., you only edited .env:

docker compose up -d           # detects env diff, recreates affected containers

2.4 Stopping vs killing

docker compose stop            # graceful: SIGTERM, then SIGKILL after grace period
docker compose kill            # immediate SIGKILL — only when stop hangs
docker compose start           # bring stopped containers back up (no recreate, no rebuild)

stop then start keeps the same container instance and writable layer. up -d on a stopped project recreates if the image or config changed since stop.


3. Volume management

3.1 Volumes you should know about

| Volume name | Mounted at | Contents |
|---|---|---|
| irdb_irdb-data | /data in migrate + api | SQLite db (irdb.sqlite), GeoIP MMDBs (geoip/*.mmdb), backups. |
| irdb_mysql-data | /var/lib/mysql in mysql | (Optional) MySQL data files. Only when DB_DRIVER=mysql. |

The volume name prefix (irdb_) is the Compose project name — it defaults to the directory name, so if you cloned into irdb/ you get irdb_irdb-data. If you cloned somewhere else, run docker volume ls | grep -i irdb-data to find it.
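When scripting against the volume, it can help to resolve its name once and fail early if the match is ambiguous. A minimal sketch, assuming the docker CLI is on the PATH; find_data_volume and require_single_volume are hypothetical helper names.

```shell
#!/bin/sh
# find_data_volume: print every volume whose name contains "irdb-data".
# The name filter does substring matching, so the project prefix is not needed.
find_data_volume() {
    docker volume ls -q --filter name=irdb-data
}

# require_single_volume: print the volume name, but only when exactly one
# volume matches, so later commands cannot touch the wrong one.
require_single_volume() {
    matches=$(find_data_volume)
    # grep -c . counts non-empty lines; "|| true" keeps a zero count from
    # being treated as a failure
    count=$(printf '%s\n' "$matches" | grep -c . || true)
    if [ "$count" -ne 1 ]; then
        echo "expected 1 irdb-data volume, found $count" >&2
        return 1
    fi
    printf '%s\n' "$matches"
}
```

For example, VOLUME=$(require_single_volume) || exit 1 at the top of a maintenance script gives the later docker run -v "$VOLUME:/data" commands a verified name.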

3.2 Inspecting a volume's contents

You can't cd into the volume from the host on macOS (it lives inside the Docker Desktop VM) and on Linux it requires root. Use a one-shot container instead:

docker run --rm -v irdb_irdb-data:/data alpine ls -la /data
docker run --rm -v irdb_irdb-data:/data alpine du -sh /data /data/geoip

3.3 Fixing ownership after a uid change

The api and migrate containers run as uid 1000 (the app user). If a volume was created when the containers ran as root — e.g., from a pre-SEC_REVIEW F18 deployment — the files inside are root-owned and the new uid 1000 process cannot write to them. The symptom is:

PDOException: SQLSTATE[HY000]: General error: 8 attempt to write a readonly database

Fix without losing data:

docker compose down                                                  # stop, keep volume
docker run --rm -u 0 -v irdb_irdb-data:/data alpine \
    chown -R 1000:1000 /data
docker compose up --build -d

Verify:

docker run --rm -v irdb_irdb-data:/data alpine stat -c '%u' /data    # should print 1000
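The check and the fix can be combined so the chown only runs when it is needed. A minimal sketch, assuming the stack is already stopped and that uid 1000 is the target as described above; fix_volume_owner is a hypothetical helper name.

```shell
#!/bin/sh
# fix_volume_owner VOLUME [UID]: chown the volume to UID (default 1000) only
# when its top-level directory is owned by a different uid.
fix_volume_owner() {
    volume=$1
    want=${2:-1000}
    # Read the numeric owner of /data from a throwaway container
    have=$(docker run --rm -v "$volume:/data" alpine stat -c '%u' /data) || return 1
    if [ "$have" = "$want" ]; then
        echo "ok: $volume already owned by uid $want"
        return 0
    fi
    echo "fixing: $volume owned by uid $have, changing to $want"
    # -u 0 runs the container as root so chown is permitted
    docker run --rm -u 0 -v "$volume:/data" alpine chown -R "$want:$want" /data
}
```

Note that this only inspects the top-level directory. If individual files inside the volume have mixed ownership, run the unconditional chown from above instead.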

If the volume's data is disposable (dev / fresh environment), the nuclear option is faster:

docker compose down -v && docker compose up --build -d

3.4 Backing up the volume (whole-tarball)

The SQLite-API-aware backup is documented in README.md § Backups. For a whole-volume tarball — useful before any risky volume operation:

docker compose stop api
docker run --rm -v irdb_irdb-data:/data -v "$(pwd):/backup" alpine \
    tar czf /backup/irdb-volume-$(date +%F).tar.gz -C /data .
docker compose start api

Restore:

docker compose down
docker run --rm -v irdb_irdb-data:/data -v "$(pwd):/backup" alpine \
    sh -c 'rm -rf /data/* /data/.[!.]* && tar xzf /backup/irdb-volume-2026-05-01.tar.gz -C /data'
docker compose up -d
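Because the restore wipes /data before extracting, it is worth confirming the archive is readable first. A minimal sketch, assuming tar with gzip support on the host; verify_tarball is a hypothetical helper name.

```shell
#!/bin/sh
# verify_tarball FILE: confirm FILE exists, is non-empty, and can be listed
# as a gzip-compressed tar archive. Nothing is extracted.
verify_tarball() {
    file=$1
    if [ ! -s "$file" ]; then
        echo "missing or empty: $file" >&2
        return 1
    fi
    # tar tzf lists entries without extracting and fails on a corrupt archive
    if ! listing=$(tar tzf "$file" 2>/dev/null); then
        echo "unreadable archive: $file" >&2
        return 1
    fi
    count=$(printf '%s\n' "$listing" | grep -c . || true)
    echo "$count entries in $file"
}
```

Running verify_tarball on the backup file before docker compose down means a bad archive is caught while the old data is still in place.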

3.5 Removing a volume explicitly

docker compose down                       # stop containers (don't use -v)
docker volume rm irdb_irdb-data           # then remove the volume by name

docker volume rm refuses to remove a volume that's still attached to a container — that's a safety feature, don't fight it. Stop the relevant container first.


4. Troubleshooting

4.1 "attempt to write a readonly database"

Cause: the SQLite file or its directory isn't writable by uid 1000. Most commonly a pre-F18 volume — see § 3.3 for the chown fix.

Other causes worth ruling out:

  • The host filesystem the volume lives on is full → df -h (Linux) or docker system df (any).
  • The host filesystem is read-only (e.g., emergency-mounted /) → mount | grep ' / '.
  • A previous migrate run left a SQLite rollback journal (irdb.sqlite-journal) owned by a different uid; the chown in § 3.3 fixes this too.

4.2 Migration fails

The migrate container exits non-zero, so api never starts: its service_completed_successfully gate isn't satisfied. Diagnosis:

docker compose logs migrate

Common failures:

  • Schema conflict — a migration tries to add a column that already exists, usually because the migration was edited after being applied. Phinx tracks applied migrations in phinxlog. Rolling back manually:

    docker compose run --rm migrate vendor/bin/phinx rollback \
      --configuration=config/phinx.php --target=<previous_version>
    
  • Permission error — see § 3.3.

  • Constraint violation during data migration — fix the data or the migration; rerun.

After fixing, run docker compose up --build -d again. Phinx resumes from the last successful version (it's idempotent on already-applied migrations).

4.3 Healthcheck never passes

api shows Up (unhealthy) or ui won't start because api isn't healthy. Diagnose:

docker compose exec api wget -qO- http://localhost:8081/healthz
docker compose logs api --tail 100

Likely culprits:

  • Missing or invalid env var (api validates required env on boot — look for the explicit error near the top of the log).
  • migrate did not actually finish before api started — happens if you bypassed compose with manual docker run.
  • Database unreachable (MySQL mode only — check mysql is up + healthy first).

4.4 Container won't start at all

docker compose ps -a            # see exit code
docker compose logs <service>

Common:

  • .env missing — Compose validates env_file: .env and refuses to start. cp .env.example .env and fill in secrets.
  • Port conflict — host port 8080 or 8081 in use by something else. Either stop the other process, or remap in docker-compose.yml.
  • Image build failed — read the build output above the start attempt.

4.5 Migrate looks stuck

Phinx output is buffered through PHP's stdout. If a single migration takes a while (large data backfill), it can look frozen. Check liveness:

docker compose top migrate

If you see a php or phinx process consuming CPU, it's working. If the process is gone but the container is still listed, check exit code with docker compose ps -a.

4.6 Disk fills up

Symptoms: builds fail with "no space left on device", or api crashes with SQLite "disk I/O error".

docker system df                 # what's using the docker disk
docker image prune -f            # remove dangling images (safe)
docker builder prune -f          # remove old build cache (safe)
docker volume ls -f dangling=true   # volumes with no associated container

Don't run docker system prune --volumes casually — it removes unused volumes including ones you might still want. Use the targeted commands above.

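For the irdb-data volume specifically, the audit log is the fastest-growing table. The cleanup-audit job prunes it according to JOB_AUDIT_RETENTION_DAYS (default 180 days). If the audit log has grown out of control, lower the retention (if you set it in .env, run docker compose up -d so api is recreated with the new value) and trigger the job once by hand. Note that ${INTERNAL_JOB_TOKEN} in the command below is expanded by your host shell, so it must be set there: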

docker compose exec api curl -s -X POST http://localhost:8081/internal/jobs/cleanup-audit \
    -H "Authorization: Bearer ${INTERNAL_JOB_TOKEN}"
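To catch a filling disk before builds start failing, usage can be checked against a threshold from cron or a deploy script. A minimal sketch, assuming a Linux host where Docker's data lives on the filesystem holding the path you pass in (/var/lib/docker is Docker's default data root); both function names are hypothetical.

```shell
#!/bin/sh
# disk_used_pct PATH: print the used percentage (integer, no % sign) of the
# filesystem holding PATH, using the portable df -P output format.
disk_used_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# disk_over_threshold PATH LIMIT: succeed when usage is at or above LIMIT percent.
disk_over_threshold() {
    used=$(disk_used_pct "$1")
    [ "$used" -ge "$2" ]
}
```

For example, disk_over_threshold /var/lib/docker 85 && docker image prune -f runs the safe prune only once the disk is at least 85% full. The 85 is an arbitrary illustration, not a recommended value.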

5. Rolling back

If a deploy breaks things and you need to get back to the prior version:

git log --oneline -10                                  # find the previous good commit
git checkout <prev-commit>
docker compose -f docker-compose.yml -f compose.scheduler.yml up --build -d

Caveat: migrations are not automatically reversed. If the broken deploy added a migration, the schema is still at the new version when you check out the old code. Two paths:

  1. Old code is forward-compatible with the new schema (most additive migrations — new columns, new tables — qualify). Just redeploy old code; it ignores the new shape.
  2. Old code can't run against the new schema (column rename, type change, dropped column). Roll the migration back too:

    docker compose run --rm migrate vendor/bin/phinx rollback \
       --configuration=config/phinx.php --target=<previous_version>
    

Then redeploy old code.

If the broken deploy corrupted data, restore from a backup (README.md § Backups — use the SQLite .backup file or the volume tarball, not a partial state).

After investigating and fixing on main, return to the branch tip:

git checkout main
git pull
docker compose -f docker-compose.yml -f compose.scheduler.yml up --build -d

6. Scheduler operations

The scheduler sidecar (compose.scheduler.yml overlay) runs busybox crond and posts to /internal/jobs/tick once a minute. The endpoint is bound to RFC1918 + loopback only — see api/docker/Caddyfile.

6.1 Verify the scheduler is firing

docker compose logs -f scheduler           # one POST per minute
docker compose exec api curl -s http://localhost:8081/healthz | jq .jobs

The jobs block in /healthz shows the most recent successful tick. If it is more than 5 minutes old while the scheduler container is running, investigate.
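The staleness rule can be expressed as a small comparison for use in monitoring. This is a sketch only: the JSON shape of the jobs block is not specified in this manual, so the helper takes the last tick time as a Unix epoch argument and leaves extracting it from /healthz to you. tick_is_stale is a hypothetical helper name.

```shell
#!/bin/sh
# tick_is_stale LAST_EPOCH [MAX_AGE] [NOW_EPOCH]: succeed when the last tick
# is older than MAX_AGE seconds (default 300, i.e. five minutes).
# NOW_EPOCH defaults to the current time and exists mainly for testing.
tick_is_stale() {
    last=$1
    max_age=${2:-300}
    now=${3:-$(date +%s)}
    [ $((now - last)) -gt "$max_age" ]
}
```

A monitoring script could call tick_is_stale "$last_tick" && echo "scheduler looks stuck", where last_tick is whatever epoch value you derive from the jobs block.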

6.2 Force a job run

Each job has its own endpoint under /internal/jobs/:

docker compose exec api curl -s -X POST \
    http://localhost:8081/internal/jobs/<job-name> \
    -H "Authorization: Bearer ${INTERNAL_JOB_TOKEN}"

Available jobs: recompute-scores, cleanup-audit, cleanup-expired-manual-blocks, enrich-pending, refresh-geoip, tick (dispatcher). GET /internal/jobs/status returns the latest run record per job.

Admins logged into the UI can also trigger a job from the Settings → Jobs screen, which posts to /api/v1/admin/jobs/trigger/{name} — that path uses an admin token, not INTERNAL_JOB_TOKEN.
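For repeated manual runs, the curl call can be wrapped so typos in the job name or a missing token are caught before anything is sent. A minimal sketch, assuming the job list given in this section and that INTERNAL_JOB_TOKEN is set in the calling shell; IRDB_JOBS, is_known_job, and run_job are hypothetical names.

```shell
#!/bin/sh
# Job names as listed in this section.
IRDB_JOBS="recompute-scores cleanup-audit cleanup-expired-manual-blocks enrich-pending refresh-geoip tick"

# is_known_job NAME: succeed when NAME is one of the documented jobs.
is_known_job() {
    for job in $IRDB_JOBS; do
        [ "$job" = "$1" ] && return 0
    done
    return 1
}

# run_job NAME: validate the name and token, then POST to the job endpoint.
# Returns 2 for an unknown job and 3 when the token is missing.
run_job() {
    if ! is_known_job "$1"; then
        echo "unknown job: $1" >&2
        return 2
    fi
    if [ -z "${INTERNAL_JOB_TOKEN:-}" ]; then
        echo "INTERNAL_JOB_TOKEN is not set" >&2
        return 3
    fi
    # -T disables TTY allocation so this also works from cron or CI
    docker compose exec -T api curl -s -X POST \
        "http://localhost:8081/internal/jobs/$1" \
        -H "Authorization: Bearer ${INTERNAL_JOB_TOKEN}"
}
```

With the functions sourced, run_job cleanup-audit behaves like the command above, while run_job clean-audit fails locally with "unknown job" instead of producing a confusing HTTP error.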

6.3 Switching scheduler styles

Three options — pick exactly one to avoid double-firing:

| Style | When to use | How |
|---|---|---|
| Sidecar | Default. Self-contained, no host setup. | Include compose.scheduler.yml overlay. |
| Host cron | You already manage cron centrally (e.g., Ansible). | Drop examples/scheduler/host.crontab into /etc/cron.d/. Don't include the overlay. |
| systemd timer | Modern Linux without crond, you want timer accuracy + journal logging. | Install examples/scheduler/irdb-tick.{service,timer} into /etc/systemd/system. Don't include the overlay. |

If you migrate from sidecar to host cron / systemd: stop the sidecar (docker compose stop scheduler && docker compose rm -f scheduler, and stop including compose.scheduler.yml on subsequent up invocations), enable the host driver, and verify exactly one tick per minute lands by tailing api logs.


7. Multi-host and scaling notes

The default deployment is single-host. A few caveats if you scale:

  • SQLite mode is single-host only. Move to MySQL (see README.md § MySQL) before adding a second api replica.
  • The migrate container must run exactly once per deploy, not once per replica. If you orchestrate manually, gate api startup on a single-shot migrate completion.
  • The UI session store is local file-backed (/tmp in the ui container). Multiple ui replicas need either sticky sessions at the load balancer, or a shared session store (Redis) — currently not configured out of the box.
  • The scheduler must run exactly once globally, not per host. If you run host cron on every node, you'll multi-fire jobs. Pick one node.

8. Security update workflow

The base images (dunglas/frankenphp:1-php8.3-alpine for api/ui, alpine:3.21@sha256:… digest-pinned for the scheduler) are referenced in their Dockerfiles. To pull security fixes:

docker compose build --pull         # forces re-pull of base images
docker compose up -d

--pull is the difference: without it, the build reuses the locally cached base image even when the registry has a newer image for the same tag.

For app-level security updates (Composer dependencies):

cd api
composer update --no-dev            # updates composer.lock on the host
cd ..
docker compose up --build -d        # rebuild picks up the new lock file

Review the changelog of any updated package before deploying. doc/SEC_REVIEW.md tracks security findings and their fixes per commit; git log --grep SEC_REVIEW lists the commits that reference it.


9. Operational quick reference

Daily operations:

docker compose ps                                # are all services up?
docker compose logs -f api ui                    # tail app logs
docker compose exec api sh                       # shell into api (read-only rootfs; /tmp is writable)

Deploys:

git pull && docker compose -f docker-compose.yml -f compose.scheduler.yml up --build -d

Backups (SQLite, online):

docker compose exec api sh -c \
    'sqlite3 /data/irdb.sqlite ".backup /data/irdb-backup.sqlite"'
docker compose cp api:/data/irdb-backup.sqlite ./irdb-backup-$(date +%F).sqlite

Health:

curl -s http://localhost:8081/healthz | jq
curl -s http://localhost:8080/healthz

Disk:

docker system df
docker image prune -f && docker builder prune -f

Volume rescue:

docker run --rm -u 0 -v irdb_irdb-data:/data alpine chown -R 1000:1000 /data

10. See also