
fix: build scheduler sidecar from pinned image (SEC_REVIEW F22)

The previous compose.scheduler.yml used `image: alpine:3` (floating
tag) and ran `apk add --no-cache curl tini` at every container start.
Each restart trusted the apk mirror, so a mirror compromise or
typosquat would have rooted the scheduler — which holds
INTERNAL_JOB_TOKEN and can call /internal/jobs/*.

Replace it with a real build context at scheduler/. The new Dockerfile
pins FROM alpine:3.21 by digest and installs curl, tini, and
ca-certificates with explicit versions at build time, so restarts do
no network fetch. The crontab is baked into the image (also fixing
the dangling ./docker/scheduler.crontab bind path the old compose
file referenced). Compose-side hardening: read_only rootfs, tmpfs
for /run and /tmp, no-new-privileges. cap_drop:[ALL] was tested and
rejected: busybox crond calls initgroups() before each exec and
needs CAP_SETGID even when the target uid is the same root it is
already running as.

Verified end-to-end: stack comes up healthy and the api logs
{"job":"tick","status":"success"} within one minute of the sidecar
starting.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
chiappa 4 days ago
parent
commit
d9006ebae7
5 changed files with 95 additions and 17 deletions
  1. SPEC.md (+19 -9)
  2. compose.scheduler.yml (+25 -8)
  3. scheduler/.dockerignore (+3 -0)
  4. scheduler/Dockerfile (+38 -0)
  5. scheduler/scheduler.crontab (+10 -0)

+ 19 - 9
SPEC.md

@@ -696,26 +696,32 @@ volumes:
 Provide `examples/scheduler/irdb-tick.service` and `examples/scheduler/irdb-tick.timer`. Documented in README.
 
 **Option C: Sidecar overlay (`compose.scheduler.yml`)**
+
+Built from `scheduler/Dockerfile` (digest-pinned alpine + pinned curl/tini
+versions installed at build time, schedule baked into the image — see
+SEC_REVIEW F22 for why the previous floating-tag + runtime-apk pattern
+was retired).
+
 ```yaml
 services:
   scheduler:
-    image: alpine:3
-    command: >
-      sh -c "
-      apk add --no-cache curl tini &&
-      exec tini -- crond -f -L /dev/stdout
-      "
-    volumes:
-      - ./docker/scheduler.crontab:/etc/crontabs/root:ro
+    image: irdb-scheduler:latest
+    build: { context: ./scheduler }
     environment:
       INTERNAL_JOB_TOKEN: ${INTERNAL_JOB_TOKEN}
+    read_only: true
+    tmpfs:
+      - /run:mode=0755
+      - /tmp:mode=1777
+    security_opt:
+      - no-new-privileges:true
     depends_on:
       api:
         condition: service_healthy
     restart: unless-stopped
 ```
 
-`docker/scheduler.crontab`:
+`scheduler/scheduler.crontab` (baked into the image):
 ```cron
 * * * * * curl -sf -m 280 -X POST -H "Authorization: Bearer $INTERNAL_JOB_TOKEN" http://api:8081/internal/jobs/tick > /dev/null
 ```
@@ -784,6 +790,10 @@ Monorepo. Each container has its own subdirectory with its own `composer.json`,
 │   └── reverse-proxy/
 │       └── Caddyfile
+├── scheduler/                       # ─────── scheduler sidecar ───────
+│   ├── Dockerfile                   # alpine + curl + busybox crond, all pinned at build time
+│   └── scheduler.crontab            # canonical schedule baked into the image
+│
 ├── api/                             # ─────── api container ───────
 │   ├── Dockerfile
 │   ├── composer.json

+ 25 - 8
compose.scheduler.yml

@@ -1,15 +1,32 @@
 services:
   scheduler:
-    image: alpine:3
-    command: >
-      sh -c "
-      apk add --no-cache curl tini &&
-      exec tini -- crond -f -L /dev/stdout
-      "
-    volumes:
-      - ./docker/scheduler.crontab:/etc/crontabs/root:ro
+    image: irdb-scheduler:latest
+    build: { context: ./scheduler }
     environment:
       INTERNAL_JOB_TOKEN: ${INTERNAL_JOB_TOKEN}
+    # SEC_REVIEW F22: dependencies (curl, tini, ca-certificates) are now
+    # baked into the image at build time with pinned versions, against a
+    # digest-pinned alpine base. The previous `image: alpine:3` +
+    # `apk add` at container start trusted the apk mirror on every
+    # restart and would have given a mirror compromise a foothold in the
+    # container that holds INTERNAL_JOB_TOKEN.
+    read_only: true
+    # busybox crond writes a tiny tempfile when the schedule fires; /run
+    # has to be writable for that. Everything else stays read-only.
+    tmpfs:
+      - /run:mode=0755
+      - /tmp:mode=1777
+    # busybox crond calls initgroups() before each exec, which needs
+    # CAP_SETGID even when the target user is the same root it is
+    # already running as — full cap_drop crashes it with
+    # "can't set groups: Operation not permitted". Hardening the
+    # process to non-root would mean shipping a custom cron binary;
+    # not worth the maintenance cost given the container has no
+    # persistent volume, no exposed port, and only INTERNAL_JOB_TOKEN
+    # in env. `no-new-privileges` is still useful: there is no setuid
+    # binary in the image and we want to keep it that way.
+    security_opt:
+      - no-new-privileges:true
     depends_on:
       api:
         condition: service_healthy
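
The CAP_SETGID remark in the comments above can be checked against a running container by decoding the `CapEff` mask in `/proc/<pid>/status`. The following is a minimal sketch, not part of this commit: `has_cap` is a hypothetical helper name, and it assumes a shell whose arithmetic accepts `0x` hex constants (bash, dash, and busybox ash do). CAP_SETGID is capability number 6 in `linux/capability.h`.

```shell
# has_cap MASK BIT: succeed if bit BIT is set in the hex capability mask MASK.
# Hypothetical helper for illustration; it is not shipped in the scheduler image.
has_cap() {
    # Shift the mask right by BIT positions and test the lowest bit.
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Capability number of CAP_SETGID (linux/capability.h).
CAP_SETGID=6
```

One way to use it: copy the hex value from the `CapEff:` line of `/proc/1/status` inside the scheduler container and run `has_cap <value> "$CAP_SETGID"`. A mask of all zeros, which is what a full `cap_drop` leaves, fails the check; that is the situation in which crond reports "can't set groups".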

+ 3 - 0
scheduler/.dockerignore

@@ -0,0 +1,3 @@
+*
+!Dockerfile
+!scheduler.crontab

+ 38 - 0
scheduler/Dockerfile

@@ -0,0 +1,38 @@
+# syntax=docker/dockerfile:1.7
+#
+# IRDB scheduler sidecar image (SEC_REVIEW F22).
+#
+# Replaces the previous `image: alpine:3` + `apk add curl tini` at
+# container start: the floating tag pulled whatever Alpine shipped on
+# any given restart, and the runtime apk fetch made every restart
+# trust-on-first-use against the apk mirror.  A package-mirror compromise
+# (or typosquat) would have given root in the scheduler — which holds
+# INTERNAL_JOB_TOKEN and can call /internal/jobs/* — on every restart.
+#
+# Now: pinned digest, pinned apk versions, all dependency installation
+# happens once at build time. Restart cost is just `docker run` of the
+# already-built local image.
+
+FROM alpine:3.21@sha256:48b0309ca019d89d40f670aa1bc06e426dc0931948452e8491e3d65087abc07d
+
+# Pinned via `apk policy` against the same digest above (alpine 3.21
+# main + community at build time of this Dockerfile). Bump these
+# explicitly when the base image is bumped — the build will fail loudly
+# if the version is no longer available, which is the desired signal.
+RUN apk add --no-cache \
+        curl=8.14.1-r2 \
+        tini=0.19.0-r3 \
+        ca-certificates=20260413-r0
+
+# Bake the canonical schedule into the image. Operators who want a
+# different cadence can still bind-mount their own crontab over
+# /etc/crontabs/root in compose.
+COPY scheduler.crontab /etc/crontabs/root
+
+# crond runs as root by default in busybox; that is fine — the
+# container has no persistent volume, no network exposure, and only
+# the INTERNAL_JOB_TOKEN env var as state. Dropping privileges here
+# would mean granting CAP_SETUID-or-similar so crond can read
+# /etc/crontabs/root as a non-root user, which is a worse trade.
+ENTRYPOINT ["/sbin/tini", "--"]
+CMD ["crond", "-f", "-L", "/dev/stdout"]
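
The pinning rules described in the Dockerfile comments (digest on `FROM`, explicit `=version` on every `apk add` package) could be enforced by a small lint step. This is a hedged sketch only: `check_pins` is a hypothetical function that does not exist in this commit, and it assumes a single-stage Dockerfile where `apk add` arguments are plain package names or options.

```shell
# check_pins FILE: return non-zero if FILE has a FROM line without an
# @sha256 digest, or an `apk add` package without an explicit =version.
# Hypothetical lint helper; names and messages are illustrative.
check_pins() {
    file=$1
    status=0

    # Any FROM line lacking a 64-hex-digit sha256 digest counts as unpinned.
    if grep -E '^FROM ' "$file" | grep -qvE '@sha256:[0-9a-f]{64}'; then
        echo "unpinned base image in $file" >&2
        status=1
    fi

    awk '
        /^[ \t]*#/ { next }                                # skip comment lines
        /\\$/      { sub(/\\$/, ""); buf = buf $0; next }  # join continuations
        {
            line = buf $0; buf = ""
            n = split(line, tok, /[ \t]+/)
            for (i = 1; i < n; i++) {
                if (tok[i] != "apk" || tok[i + 1] != "add") continue
                # Walk the arguments of this apk add up to a command separator.
                for (j = i + 2; j <= n; j++) {
                    if (tok[j] == "&&" || tok[j] == ";") break
                    if (tok[j] == "" || tok[j] ~ /^-/) continue   # options
                    if (tok[j] !~ /=/) {
                        print "unpinned package: " tok[j]
                        bad = 1
                    }
                }
            }
        }
        END { exit (bad ? 1 : 0) }
    ' "$file" >&2 || status=1

    return "$status"
}
```

Running `check_pins scheduler/Dockerfile` in CI would then fail on a floating tag or an unversioned package. Multi-stage builds that reference an earlier stage by name (`FROM build`) would need an exception, which this sketch does not handle.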

+ 10 - 0
scheduler/scheduler.crontab

@@ -0,0 +1,10 @@
+# IRDB scheduler — busybox crond schedule.
+#
+# Drives /internal/jobs/tick once a minute. The api dispatches whichever
+# periodic jobs are due (recompute-scores, cleanup-audit, enrich-pending,
+# refresh-geoip). job_locks mediates between replicas so duplicate ticks
+# are correct but wasteful.
+#
+# -m 280 caps each request at 280 s, which exceeds the 1-minute cadence,
+# so a slow tick can overlap the next one; job_locks keeps that safe.
+* * * * * curl -sf -m 280 -X POST -H "Authorization: Bearer $INTERNAL_JOB_TOKEN" http://api:8081/internal/jobs/tick > /dev/null