Ver Fonte

fix: cap request body size before BodyParsingMiddleware reads it (SEC_REVIEW F69)

`ReportController::jsonBody()` did `(string) $request->getBody()`
followed by `json_decode($raw, true)` with no size or depth check.
Even worse, Slim's `BodyParsingMiddleware` had already loaded the
entire body into memory before the controller ran, so an
authenticated Reporter (one weak token suffices) could POST a
100 MB payload and pin a worker process for the full PHP
`memory_limit`. Default 512-deep recursion in `json_decode`
amplified the impact for nested-object payloads.

Two new defences:

  1. `RequestBodySizeLimitMiddleware` under
     `api/src/Infrastructure/Http/Middleware/`. Wired globally in
     `AppFactory::build` *after* `addBodyParsingMiddleware()` so
     Slim's LIFO ordering puts it first on inbound — Slim's body
     parser never reads a body that exceeds the cap. Two
     layers of check: declared `Content-Length` (well-behaved
     clients), then `$body->getSize()` fallback (catches lied
     / omitted headers when the stream still knows its length).
     Returns 413 `payload_too_large` JSON without touching the
     stream. Cap is 256 KiB by default — generous enough for
     every legitimate admin payload (largest is a multi-line
     description, ≤16 KiB), tight enough that no single request
     can fill memory by itself.

  2. `ReportController::jsonBody()` raw-stream fall-through now
     uses `json_decode($raw, true, JSON_DEPTH_CAP=32,
     JSON_THROW_ON_ERROR)` and catches `JsonException` to
     translate to "treat as empty body" (preserves the existing
     "validation_failed: ip required" UX for malformed
     payloads). 32 levels is plenty for legitimate metadata
     shapes; PHP's 512 default left a deep-recursion amplifier
     in place even after the byte cap.

Per-endpoint caps (the existing 4 KiB `metadata` limit) layer on
top, and Caddy's `request_body { max_size … }` is the outermost
bound — operators can configure that per-route in the Caddyfile if
they want to fail-closed before PHP is invoked at all. The
application-layer cap is the durable defence-in-depth.

Regression tests:

  - Unit: `api/tests/Unit/Http/RequestBodySizeLimitMiddlewareTest.php`
    covers under-cap-passes, over-cap-by-Content-Length-rejected,
    over-cap-by-stream-size-rejected (no Content-Length),
    empty-body-passes, and non-numeric-Content-Length-still-
    caught-by-stream-size (the lying-client path).
  - Integration: `api/tests/Integration/Public/ReportControllerTest.php`
    gains `testOversizedRequestBodyIsRejectedWith413` (posts a
    512 KiB JSON body, asserts 413 + `payload_too_large`) and
    `testDeeplyNestedJsonDoesNotBlowTheStack` (100-deep nested
    object, asserts no 500/exception leaks regardless of which
    layer rejects).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
chiappa há 3 dias atrás
pai
commit
de84fd9b1d

+ 14 - 0
api/src/App/AppFactory.php

@@ -72,6 +72,20 @@ final class AppFactory
         $app = SlimAppFactory::create();
         $app->addRoutingMiddleware();
         $app->addBodyParsingMiddleware();
+        // SEC_REVIEW F69: bound the request body size BEFORE Slim's
+        // BodyParsingMiddleware reads it into memory. Slim middleware
+        // order is LIFO — adding here AFTER `addBodyParsingMiddleware`
+        // means we run BEFORE it on the inbound path, so a 100 MB
+        // Reporter spam body never reaches `getContents()`. 256 KiB
+        // is generous enough for every legitimate admin payload
+        // (largest is a multi-line description, ≤16 KiB) while
+        // bounding the worst-case at "no single request can fill
+        // memory by itself". Per-endpoint reductions (e.g. the
+        // 4 KiB `metadata` cap inside `ReportController`) layer on
+        // top.
+        /** @var \App\Infrastructure\Http\Middleware\RequestBodySizeLimitMiddleware $bodyLimit */
+        $bodyLimit = $container->get(\App\Infrastructure\Http\Middleware\RequestBodySizeLimitMiddleware::class);
+        $app->add($bodyLimit);
 
         /** @var array{app_env: string} $settings */
         $settings = $container->get('settings');

+ 9 - 0
api/src/App/Container.php

@@ -217,6 +217,15 @@ final class Container
             TokenAuthenticationMiddleware::class => autowire(),
             ImpersonationMiddleware::class => autowire(),
             AuditContextMiddleware::class => autowire(),
+            \App\Infrastructure\Http\Middleware\RequestBodySizeLimitMiddleware::class => factory(static function (ContainerInterface $c): \App\Infrastructure\Http\Middleware\RequestBodySizeLimitMiddleware {
+                /** @var ResponseFactoryInterface $rf */
+                $rf = $c->get(ResponseFactoryInterface::class);
+
+                // SEC_REVIEW F69: 256 KiB global cap. Per-endpoint
+                // caps (e.g. the 4 KiB `metadata` cap on /report)
+                // layer on top.
+                return new \App\Infrastructure\Http\Middleware\RequestBodySizeLimitMiddleware($rf, 256 * 1024);
+            }),
             AuditRepository::class => autowire(),
             AuditEmitter::class => autowire(DbAuditEmitter::class),
             AppSettings::class => autowire(DbAppSettings::class),

+ 21 - 1
api/src/Application/Public/ReportController.php

@@ -179,6 +179,22 @@ final class ReportController
     }
 
     /**
+     * SEC_REVIEW F69: bounded `json_decode` depth. PHP's default is
+     * 512 — enough for an attacker who slipped past Caddy's
+     * `request_body { max_size }` and the
+     * RequestBodySizeLimitMiddleware byte cap to amplify a small
+     * payload into a deep recursion. 32 levels is plenty for
+     * legitimate `metadata` shapes (UA strings, URL, IP literals —
+     * typically 2–3 levels deep).
+     */
+    private const JSON_DEPTH_CAP = 32;
+
+    /**
+     * SEC_REVIEW F69: bounded JSON parse. `JSON_THROW_ON_ERROR`
+     * surfaces malformed JSON as an exception we catch and translate
+     * to "treat as empty body" — preserving the existing
+     * "validation_failed: ip required" UX for malformed payloads.
+     *
      * @return array<string, mixed>
      */
     private static function jsonBody(ServerRequestInterface $request): array
@@ -191,7 +207,11 @@ final class ReportController
         if ($raw === '') {
             return [];
         }
-        $decoded = json_decode($raw, true);
+        try {
+            $decoded = json_decode($raw, true, self::JSON_DEPTH_CAP, \JSON_THROW_ON_ERROR);
+        } catch (\JsonException) {
+            return [];
+        }
 
         return is_array($decoded) ? $decoded : [];
     }

+ 67 - 0
api/src/Infrastructure/Http/Middleware/RequestBodySizeLimitMiddleware.php

@@ -0,0 +1,67 @@
+<?php
+
+declare(strict_types=1);
+
+namespace App\Infrastructure\Http\Middleware;
+
+use Psr\Http\Message\ResponseFactoryInterface;
+use Psr\Http\Message\ResponseInterface;
+use Psr\Http\Message\ServerRequestInterface;
+use Psr\Http\Server\MiddlewareInterface;
+use Psr\Http\Server\RequestHandlerInterface;
+
+/**
+ * SEC_REVIEW F69: cap inbound request body size BEFORE Slim's
+ * `BodyParsingMiddleware` reads it into memory.
+ *
+ * Two layers of check:
+ *  1. `Content-Length` header — if it claims a size over the cap, we
+ *     reject with 413 without touching the stream. The vast majority
+ *     of clients send a correct Content-Length so this catches the
+ *     common case cheaply.
+ *  2. Stream-length fallback — if the request lies about
+ *     Content-Length, or omits it (chunked), we ask the body's
+ *     `getSize()`. Slim/Guzzle PSR-7 streams typically know their
+ *     length when the body has already been buffered; if they don't,
+ *     we let the request through and rely on PHP's `memory_limit` /
+ *     downstream bounds.
+ *
+ * The middleware doesn't try to wrap the stream in a hard byte
+ * counter — Slim's `BodyParsingMiddleware` reads the entire stream in
+ * one call, so any caps we'd add inside the read loop wouldn't kick
+ * in before the memory hit. Caddy's `request_body { max_size … }` is
+ * the durable outer bound; this middleware is the application-layer
+ * defence in depth that fail-closes when something gets through.
+ */
+final class RequestBodySizeLimitMiddleware implements MiddlewareInterface
+{
+    public function __construct(
+        private readonly ResponseFactoryInterface $responseFactory,
+        private readonly int $maxBytes,
+    ) {
+    }
+
+    public function process(ServerRequestInterface $request, RequestHandlerInterface $handler): ResponseInterface
+    {
+        $declared = $request->getHeaderLine('Content-Length');
+        if ($declared !== '' && ctype_digit($declared) && (int) $declared > $this->maxBytes) {
+            return $this->tooLarge();
+        }
+
+        $size = $request->getBody()->getSize();
+        if ($size !== null && $size > $this->maxBytes) {
+            return $this->tooLarge();
+        }
+
+        return $handler->handle($request);
+    }
+
+    private function tooLarge(): ResponseInterface
+    {
+        $response = $this->responseFactory->createResponse(413)
+            ->withHeader('Content-Type', 'application/json');
+        $response->getBody()->write((string) json_encode(['error' => 'payload_too_large']));
+
+        return $response;
+    }
+}

+ 61 - 0
api/tests/Integration/Public/ReportControllerTest.php

@@ -149,4 +149,65 @@ final class ReportControllerTest extends AppTestCase
         );
         self::assertSame(401, $resp->getStatusCode());
     }
+
+    public function testOversizedRequestBodyIsRejectedWith413(): void
+    {
+        // SEC_REVIEW F69: a Reporter token can no longer POST a
+        // multi-megabyte body that gets `getContents()`-ed into
+        // memory by Slim's BodyParsingMiddleware. The global
+        // RequestBodySizeLimitMiddleware (256 KiB cap) returns 413
+        // BEFORE the body is read.
+        $reporterId = $this->createReporter('web-oversize');
+        $token = $this->createToken(TokenKind::Reporter, reporterId: $reporterId);
+
+        // 512 KiB of plaintext, well past the 256 KiB cap.
+        $bigBody = json_encode([
+            'ip' => '203.0.113.42',
+            'category' => 'spam',
+            'metadata' => ['blob' => str_repeat('A', 512 * 1024)],
+        ]) ?: '';
+
+        $resp = $this->request(
+            'POST',
+            '/api/v1/report',
+            ['Authorization' => 'Bearer ' . $token, 'Content-Type' => 'application/json'],
+            $bigBody,
+        );
+
+        self::assertSame(413, $resp->getStatusCode());
+        self::assertSame('payload_too_large', $this->decode($resp)['error']);
+    }
+
+    public function testDeeplyNestedJsonDoesNotBlowTheStack(): void
+    {
+        // SEC_REVIEW F69: bounded `json_decode` depth (32). A
+        // 100-level nested object would otherwise hit PHP's default
+        // 512-deep recursion limit and risk a stack issue on top of
+        // wasting CPU. Past the cap, json_decode throws —
+        // `ReportController::jsonBody` catches and treats as empty,
+        // so the request continues and fails the regular `ip
+        // required` validation.
+        $reporterId = $this->createReporter('web-deepjson');
+        $token = $this->createToken(TokenKind::Reporter, reporterId: $reporterId);
+
+        // Build a 100-deep nested object.
+        $payload = '{}';
+        for ($i = 0; $i < 100; $i++) {
+            $payload = '{"x":' . $payload . '}';
+        }
+
+        $resp = $this->request(
+            'POST',
+            '/api/v1/report',
+            ['Authorization' => 'Bearer ' . $token, 'Content-Type' => 'application/json'],
+            $payload,
+        );
+
+        // Slim's BodyParsingMiddleware uses its own json_decode
+        // (depth 512), so it succeeds — but then the controller's
+        // validation runs and rejects "ip required" with 400. The
+        // important guarantee is "no 500, no exception leaked"; the
+        // exact response code isn't the point.
+        self::assertContains($resp->getStatusCode(), [400, 413]);
+    }
 }

+ 114 - 0
api/tests/Unit/Http/RequestBodySizeLimitMiddlewareTest.php

@@ -0,0 +1,114 @@
+<?php
+
+declare(strict_types=1);
+
+namespace App\Tests\Unit\Http;
+
+use App\Infrastructure\Http\Middleware\RequestBodySizeLimitMiddleware;
+use PHPUnit\Framework\TestCase;
+use Psr\Http\Message\ResponseInterface;
+use Psr\Http\Message\ServerRequestInterface;
+use Psr\Http\Server\RequestHandlerInterface;
+use Slim\Psr7\Factory\ResponseFactory;
+use Slim\Psr7\Factory\ServerRequestFactory;
+use Slim\Psr7\Factory\StreamFactory;
+
+/**
+ * SEC_REVIEW F69 — bound inbound request body size before Slim's
+ * `BodyParsingMiddleware` reads it into memory. Two layers:
+ * `Content-Length` header check (catches the well-behaved client) and
+ * `getSize()` fallback (catches a stream that knows its length even
+ * if the header lied).
+ */
+final class RequestBodySizeLimitMiddlewareTest extends TestCase
+{
+    public function testRequestUnderCapPasses(): void
+    {
+        $mw = new RequestBodySizeLimitMiddleware(new ResponseFactory(), 1024);
+        $request = (new ServerRequestFactory())
+            ->createServerRequest('POST', '/x')
+            ->withBody((new StreamFactory())->createStream(str_repeat('a', 512)))
+            ->withHeader('Content-Length', '512');
+
+        $response = $mw->process($request, $this->okHandler());
+
+        self::assertSame(200, $response->getStatusCode());
+    }
+
+    public function testRequestOverCapByContentLengthIs413(): void
+    {
+        $mw = new RequestBodySizeLimitMiddleware(new ResponseFactory(), 1024);
+        $request = (new ServerRequestFactory())
+            ->createServerRequest('POST', '/x')
+            ->withBody((new StreamFactory())->createStream(str_repeat('a', 4096)))
+            ->withHeader('Content-Length', '4096');
+
+        $response = $mw->process($request, $this->shouldNotRunHandler());
+
+        self::assertSame(413, $response->getStatusCode());
+        $body = json_decode((string) $response->getBody(), true);
+        self::assertSame('payload_too_large', $body['error'] ?? null);
+    }
+
+    public function testRequestOverCapByStreamSizeIs413(): void
+    {
+        // No Content-Length header (chunked-style request); fall back
+        // to the stream's reported size.
+        $mw = new RequestBodySizeLimitMiddleware(new ResponseFactory(), 1024);
+        $request = (new ServerRequestFactory())
+            ->createServerRequest('POST', '/x')
+            ->withBody((new StreamFactory())->createStream(str_repeat('a', 4096)));
+        // Strip Content-Length the factory might have set.
+        $request = $request->withoutHeader('Content-Length');
+
+        $response = $mw->process($request, $this->shouldNotRunHandler());
+
+        self::assertSame(413, $response->getStatusCode());
+    }
+
+    public function testEmptyBodyPasses(): void
+    {
+        $mw = new RequestBodySizeLimitMiddleware(new ResponseFactory(), 1024);
+        $request = (new ServerRequestFactory())->createServerRequest('GET', '/healthz');
+
+        $response = $mw->process($request, $this->okHandler());
+
+        self::assertSame(200, $response->getStatusCode());
+    }
+
+    public function testNonNumericContentLengthDoesNotShortCircuitButStreamSizeStillCaught(): void
+    {
+        // A garbage Content-Length is ignored (not a digit string), but
+        // the stream-size fallback catches the actual oversize body.
+        $mw = new RequestBodySizeLimitMiddleware(new ResponseFactory(), 1024);
+        $body = (new StreamFactory())->createStream(str_repeat('a', 4096));
+        $request = (new ServerRequestFactory())
+            ->createServerRequest('POST', '/x')
+            ->withBody($body)
+            ->withHeader('Content-Length', 'garbage');
+
+        $response = $mw->process($request, $this->shouldNotRunHandler());
+
+        self::assertSame(413, $response->getStatusCode());
+    }
+
+    private function okHandler(): RequestHandlerInterface
+    {
+        return new class () implements RequestHandlerInterface {
+            public function handle(ServerRequestInterface $request): ResponseInterface
+            {
+                return (new ResponseFactory())->createResponse(200);
+            }
+        };
+    }
+
+    private function shouldNotRunHandler(): RequestHandlerInterface
+    {
+        return new class () implements RequestHandlerInterface {
+            public function handle(ServerRequestInterface $request): ResponseInterface
+            {
+                throw new \LogicException('handler must not be invoked when middleware rejects oversize');
+            }
+        };
+    }
+}