ソースを参照

fix: enforce uncompressed-size cap on MaxMind tarball extract (SEC_REVIEW F48)

`MaxMindDownloader::fetchEdition` called `PharData::extractTo` with
`overwrite=true` and no size cap. PHP 8's PharData rejects `..`
traversal, but a small `.tar.gz` whose entries declare multi-GB
uncompressed sizes (or whose contents stream out beyond the
filesystem's free space) could exhaust `/data` before MmdbVerifier
saw the file. Real GeoLite2 country/ASN MMDBs are ~6–7 MiB; anything
into the GiB range is a zip-bomb signal.

Add `MaxMindDownloader::assertSizeBudget` and call it BEFORE
`extractTo`. The helper walks the PharData via `RecursiveIteratorIterator`
(so it descends into the date-stamped `GeoLite2-Country_<date>/`
subdirectory the real tarball nests its files in), reads each entry's
uncompressed size via `SplFileInfo::getSize()`, and throws
`DownloaderException` if any single entry exceeds `MAX_ENTRY_BYTES`
(200 MiB) or the running total exceeds `MAX_TOTAL_BYTES` (400 MiB).
Generous against future MaxMind growth, tight against bomb cases.

The helper is `public static` with `@internal` so the unit test can
drive it directly with small caps without having to build a real
>200 MiB tarball; the production call site at line 91 uses the
defaults. Caps are also `public const` so they're discoverable
from outside if a future test or admin endpoint needs them.

The fetchEdition catch chain is restructured to let
`DownloaderException` propagate as-is (otherwise the new explicit
size-budget message would be wrapped in a generic "extract failed"
envelope and lose the cap details).

Regression tests in `api/tests/Unit/Enrichment/MaxMindDownloaderTest.php`:

  - `testNormalArchivePasses` — small two-file fixture passes with
    default caps.
  - `testEntryOverPerEntryCapIsRejected` — 4 KiB entry rejected
    under a 1 KiB per-entry cap; message includes the offending
    size.
  - `testTotalOverArchiveCapIsRejected` — three 1 KiB entries each
    individually fit a 4 KiB per-entry cap but the running total
    breaches a 2 KiB archive cap on entry 3.
  - `testNestedEntriesAreCounted` — recursive iteration walks into
    the `GeoLite2-Country_20260427/` subdirectory the real tarball
    nests, so a bomb hidden one level deep still trips the cap.

Test fixtures construct PharData via `addFromString` then
unset/re-open the handle (PharData iteration on the write handle
returns nothing until close), matching how production builds the
PharData from an already-on-disk tarball.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
chiappa 4 日 前
コミット
c380d126e9

+ 72 - 0
api/src/Infrastructure/Enrichment/Downloaders/MaxMindDownloader.php

@@ -26,6 +26,22 @@ final class MaxMindDownloader implements GeoIpDownloader
 {
     private const ENDPOINT = 'https://download.maxmind.com/app/geoip_download';
 
+    /**
+     * SEC_REVIEW F48: per-entry uncompressed-size cap. Real GeoLite2
+     * country/ASN MMDBs are ~6–7 MiB; 200 MiB is generous headroom
+     * against future growth while bounding the worst case at "no
+     * single entry can fill a small disk by itself".
+     */
+    private const MAX_ENTRY_BYTES = 200 * 1024 * 1024;
+
+    /**
+     * Per-archive cap on the sum of every entry's uncompressed size.
+     * The legitimate tarball is two MMDBs plus a couple of small text
+     * files (COPYRIGHT, LICENSE); 400 MiB sits well above that and
+     * well below "this could exhaust /data".
+     */
+    private const MAX_TOTAL_BYTES = 400 * 1024 * 1024;
+
     public function __construct(
         private readonly ClientInterface $http,
         private readonly string $licenseKey,
@@ -88,7 +104,17 @@ final class MaxMindDownloader implements GeoIpDownloader
         }
         try {
             $phar = new PharData($tarPath);
+            // SEC_REVIEW F48: walk the archive before extracting and
+            // enforce a per-entry + total uncompressed-size cap. A
+            // small .tar.gz that decompresses into multi-GB would
+            // otherwise exhaust disk before MmdbVerifier could see
+            // the file. PharData entries' `getSize()` reports the
+            // uncompressed size, so we can do the check without
+            // touching the filesystem.
+            self::assertSizeBudget($phar, $kind);
             $phar->extractTo($extractDir, null, true);
+        } catch (DownloaderException $e) {
+            throw $e;
         } catch (Throwable $e) {
             throw new DownloaderException(
                 sprintf('maxmind %s extract failed: %s', $kind, $e->getMessage()),
@@ -133,4 +159,50 @@ final class MaxMindDownloader implements GeoIpDownloader
 
         return null;
     }
+
+    /**
+     * SEC_REVIEW F48: zip-bomb guard for the MaxMind tarball. Walks
+     * every entry recursively, summing uncompressed sizes, and
+     * throws if any single entry exceeds `$maxEntryBytes` or the
+     * total exceeds `$maxTotalBytes`. The check runs *before*
+     * `extractTo`, so an attacker who serves a bomb tarball never
+     * gets a single decompressed byte on disk.
+     *
+     * Public-but-`@internal` so tests can drive it with small caps
+     * without having to build a real >200 MiB tarball.
+     *
+     * @internal
+     */
+    public static function assertSizeBudget(
+        PharData $phar,
+        string $kind,
+        int $maxEntryBytes = self::MAX_ENTRY_BYTES,
+        int $maxTotalBytes = self::MAX_TOTAL_BYTES,
+    ): void {
+        $total = 0;
+        $it = new RecursiveIteratorIterator($phar);
+        foreach ($it as $entry) {
+            if (!$entry->isFile()) {
+                continue;
+            }
+            $size = $entry->getSize();
+            if ($size > $maxEntryBytes) {
+                throw new DownloaderException(sprintf(
+                    'maxmind %s entry %s uncompressed size %d > cap %d',
+                    $kind,
+                    $entry->getFilename(),
+                    $size,
+                    $maxEntryBytes,
+                ));
+            }
+            $total += $size;
+            if ($total > $maxTotalBytes) {
+                throw new DownloaderException(sprintf(
+                    'maxmind %s archive uncompressed size > cap %d',
+                    $kind,
+                    $maxTotalBytes,
+                ));
+            }
+        }
+    }
 }

+ 121 - 0
api/tests/Unit/Enrichment/MaxMindDownloaderTest.php

@@ -0,0 +1,121 @@
+<?php
+
+declare(strict_types=1);
+
+namespace App\Tests\Unit\Enrichment;
+
+use App\Infrastructure\Enrichment\Downloaders\DownloaderException;
+use App\Infrastructure\Enrichment\Downloaders\MaxMindDownloader;
+use PharData;
+use PHPUnit\Framework\TestCase;
+
+/**
+ * SEC_REVIEW F48 — `MaxMindDownloader::assertSizeBudget` walks the
+ * tarball before extraction and rejects per-entry / total-size limit
+ * breaches so a small `.tar.gz` cannot decompress into multi-GB and
+ * exhaust disk before MmdbVerifier sees the file.
+ */
+final class MaxMindDownloaderTest extends TestCase
+{
+    private string $tmpDir = '';
+
+    protected function setUp(): void
+    {
+        $this->tmpDir = sys_get_temp_dir() . '/irdb-maxmind-' . bin2hex(random_bytes(6));
+        mkdir($this->tmpDir);
+    }
+
+    protected function tearDown(): void
+    {
+        // Clean up the tar file and any generated companions.
+        if ($this->tmpDir !== '' && is_dir($this->tmpDir)) {
+            foreach (glob($this->tmpDir . '/*') ?: [] as $f) {
+                @unlink($f);
+            }
+            @rmdir($this->tmpDir);
+        }
+    }
+
+    public function testNormalArchivePasses(): void
+    {
+        $phar = $this->buildTar([
+            'GeoLite2-Country.mmdb' => 'pretend-mmdb-bytes',
+            'COPYRIGHT.txt' => 'CC BY-SA 4.0',
+        ]);
+
+        // Default caps (200 MiB / 400 MiB) — small fixtures pass cleanly.
+        MaxMindDownloader::assertSizeBudget($phar, 'country');
+        $this->expectNotToPerformAssertions();
+    }
+
+    public function testEntryOverPerEntryCapIsRejected(): void
+    {
+        $phar = $this->buildTar([
+            'big.bin' => str_repeat('A', 4096),
+        ]);
+
+        $this->expectException(DownloaderException::class);
+        $this->expectExceptionMessage('uncompressed size 4096 > cap 1024');
+        // Force the per-entry cap below the entry size.
+        MaxMindDownloader::assertSizeBudget($phar, 'country', maxEntryBytes: 1024);
+    }
+
+    public function testTotalOverArchiveCapIsRejected(): void
+    {
+        // Three 1 KiB entries individually pass per-entry; the total
+        // breaches the archive cap at the third entry.
+        $phar = $this->buildTar([
+            'a.bin' => str_repeat('a', 1024),
+            'b.bin' => str_repeat('b', 1024),
+            'c.bin' => str_repeat('c', 1024),
+        ]);
+
+        $this->expectException(DownloaderException::class);
+        $this->expectExceptionMessage('archive uncompressed size > cap 2048');
+        MaxMindDownloader::assertSizeBudget(
+            $phar,
+            'country',
+            maxEntryBytes: 4096,
+            maxTotalBytes: 2048,
+        );
+    }
+
+    public function testNestedEntriesAreCounted(): void
+    {
+        // The real MaxMind tarball nests one directory like
+        // `GeoLite2-Country_20260427/GeoLite2-Country.mmdb`. Recursive
+        // iteration must walk into the directory and count nested
+        // entries against both caps.
+        $phar = $this->buildTar([
+            'GeoLite2-Country_20260427/GeoLite2-Country.mmdb' => str_repeat('m', 2048),
+            'GeoLite2-Country_20260427/COPYRIGHT.txt' => 'CC',
+        ]);
+
+        $this->expectException(DownloaderException::class);
+        $this->expectExceptionMessage('GeoLite2-Country.mmdb uncompressed size 2048 > cap 1024');
+        MaxMindDownloader::assertSizeBudget(
+            $phar,
+            'country',
+            maxEntryBytes: 1024,
+        );
+    }
+
+    /**
+     * @param array<string, string> $entries map of entry path → contents
+     */
+    private function buildTar(array $entries): PharData
+    {
+        $tarPath = $this->tmpDir . '/' . bin2hex(random_bytes(4)) . '.tar';
+        $writer = new PharData($tarPath);
+        foreach ($entries as $name => $contents) {
+            $writer->addFromString($name, $contents);
+        }
+        // PharData iteration on the *write* handle returns nothing until
+        // the file is closed and re-opened. Production builds the
+        // PharData from an already-on-disk tarball, so this matches the
+        // production path.
+        unset($writer);
+
+        return new PharData($tarPath);
+    }
+}