- <?php
- declare(strict_types=1);
- namespace App\Infrastructure\Logging;
- use Monolog\LogRecord;
- use Monolog\Processor\ProcessorInterface;
- /**
- * Scrubs sensitive values out of Monolog records before they hit a handler
- * (SPEC §M14.4).
- *
- * Two layers of defence:
- * 1. **Key-based redaction** in `context` and `extra`. Any key whose name
- * matches one of the sensitive-key patterns gets its value replaced
- * with `***`.
- * 2. **Pattern-based redaction** of the rendered `message` and any string
- * values left in `context` / `extra`. Catches Bearer tokens, raw JWTs,
- * and password hashes that slipped into the message via `sprintf` or
- * were embedded in a free-form string. The token kind prefix is
- * preserved so triage logs ("which kind of token failed?") stay useful.
- *
- * The processor is intentionally simple: any new sensitive-shaped data we
- * find in production should be added to either the key-list or the regex
- * list with a one-line PR.
- */
- final class SecretScrubbingProcessor implements ProcessorInterface
- {
- private const REDACTED = '***';
- /**
- * Lower-case substrings; we redact a key if its lower-cased name contains any of these.
- * Examples that get hit:
- * `authorization`, `Authorization`, `auth_token`,
- * `password`, `password_hash`, `LOCAL_ADMIN_PASSWORD_HASH`,
- * `oidc_client_secret`, `client_secret`,
- * `maxmind_license_key`, `ipinfo_token`,
- * `db_mysql_password`, `internal_job_token`,
- * `ui_service_token`, `bearer`, `cookie`, `set-cookie`.
- */
- private const SENSITIVE_KEY_NEEDLES = [
- 'password',
- 'authorization',
- 'auth_token',
- 'access_token',
- 'refresh_token',
- 'bearer',
- 'secret',
- 'license_key',
- 'license-key',
- 'license_token',
- 'ipinfo_token',
- 'service_token',
- 'job_token',
- 'cookie',
- ];
- /**
- * Pattern → replacement pairs used on string values. The token regex
- * preserves the irdb prefix + kind so logs still show which token kind
- * was involved without leaking the secret half.
- *
- * @var list<array{0: string, 1: string}>
- */
- private const VALUE_PATTERNS = [
- // Bearer header value carrying an irdb token. Replaces the secret
- // half but keeps the kind prefix as a triage breadcrumb.
- ['/(Bearer\s+irdb_(?:rep|con|adm|svc)_)[A-Z2-7]{32}/', '$1***'],
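- // SEC_REVIEW F65: Bearer with any non-trivial value. The floor
- // was {20,}, which let a Bearer value shorter than 20 chars slip
- // through; it was dropped to {8,}, which still skips short literal
- // words after `Bearer` and keeps false positives on prose low.
- // The lookahead skips output of the rule above: `irdb_rep_` is
- // nine characters from this class, so without it the preserved
- // kind prefix would be collapsed into `Bearer ******`.
- ['/(Bearer\s+)(?!irdb_(?:rep|con|adm|svc)_\*{3})[A-Za-z0-9._\-]{8,}/', '$1***'],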
- // SEC_REVIEW F65: raw JWT (`header.payload.signature`)
- // anywhere in the message or value. Anchored on `eyJ`
- // because every JWT header is the base64url encoding of a
- // JSON object that starts with `{"…`, which is `eyJ…`.
- // Anchoring eliminates false positives like `192.168.1.1`
- // or `lib.so.6` — those don't start with `eyJ`. Each
- // segment requires ≥4 chars to skip pathological short
- // matches. The replacement keeps the `eyJ` prefix as a
- // triage breadcrumb.
- ['/\beyJ[A-Za-z0-9_-]{4,}\.[A-Za-z0-9_-]{4,}\.[A-Za-z0-9_-]{4,}\b/', 'eyJ***'],
- // Bare irdb_<kind>_<32 base32> tokens that aren't preceded by Bearer.
- ['/\birdb_(rep|con|adm|svc)_[A-Z2-7]{32}\b/', 'irdb_$1_***'],
- // Argon2 password hashes.
- ['/\$argon2(?:i|id|d)\$[^\s\'"]+/', '$argon2***'],
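- // bcrypt password hashes. The `$` in the replacement is escaped
- // because preg_replace() would otherwise read `$2` as a reference
- // to a non-existent capture group and emit only `***`.
- ['/\$2[aby]?\$\d{2}\$[A-Za-z0-9.\/]{53}/', '\$2***'],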
- ];
- public function __invoke(LogRecord $record): LogRecord
- {
- $context = self::scrubArray($record->context);
- $extra = self::scrubArray($record->extra);
- $message = self::scrubString($record->message);
- return $record->with(message: $message, context: $context, extra: $extra);
- }
- /**
- * @param array<array-key, mixed> $data
- * @return array<array-key, mixed>
- */
- private static function scrubArray(array $data): array
- {
- $out = [];
- foreach ($data as $key => $value) {
- $keyHit = is_string($key) && self::isSensitiveKey($key);
- if ($keyHit) {
- $out[$key] = self::REDACTED;
- continue;
- }
- if (is_array($value)) {
- $out[$key] = self::scrubArray($value);
- } elseif (is_string($value)) {
- $out[$key] = self::scrubString($value);
- } else {
- $out[$key] = $value;
- }
- }
- return $out;
- }
- private static function isSensitiveKey(string $key): bool
- {
- $lower = strtolower($key);
- foreach (self::SENSITIVE_KEY_NEEDLES as $needle) {
- if (str_contains($lower, $needle)) {
- return true;
- }
- }
- return false;
- }
- private static function scrubString(string $value): string
- {
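- foreach (self::VALUE_PATTERNS as [$pattern, $replacement]) {
- // preg_replace() returns null on a PCRE error; fail closed with the
- // redaction marker instead of silently emitting an empty string.
- $value = preg_replace($pattern, (string) $replacement, $value) ?? self::REDACTED;
- }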
- return $value;
- }
- }
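The two layers described in the class docblock can be tried without Monolog. The following is a minimal sketch, not the processor itself: `DEMO_NEEDLES`, `DEMO_PATTERNS`, `demo_scrub_string()` and `demo_scrub_array()` are hypothetical names introduced for illustration, and only a subset of the needles and patterns is reproduced. The sample token and context keys are made up.

```php
<?php
declare(strict_types=1);

// Hypothetical, reduced copies of the key needles and value patterns.
const DEMO_NEEDLES = ['password', 'secret', 'cookie'];
const DEMO_PATTERNS = [
    // Bare irdb_<kind>_<32 base32> token; the kind is kept as a breadcrumb.
    ['/\birdb_(rep|con|adm|svc)_[A-Z2-7]{32}\b/', 'irdb_$1_***'],
    // Raw JWT anchored on the `eyJ` header prefix.
    ['/\beyJ[A-Za-z0-9_-]{4,}\.[A-Za-z0-9_-]{4,}\.[A-Za-z0-9_-]{4,}\b/', 'eyJ***'],
];

// Layer 2: pattern-based redaction of a single string.
function demo_scrub_string(string $value): string
{
    foreach (DEMO_PATTERNS as [$pattern, $replacement]) {
        // preg_replace() returns null on a PCRE error; this sketch fails closed.
        $value = preg_replace($pattern, $replacement, $value) ?? '***';
    }
    return $value;
}

// Layer 1: key-based redaction, recursing into nested arrays and
// handing any remaining string values to layer 2.
function demo_scrub_array(array $data): array
{
    $out = [];
    foreach ($data as $key => $value) {
        $sensitive = false;
        if (is_string($key)) {
            $lower = strtolower($key);
            foreach (DEMO_NEEDLES as $needle) {
                if (str_contains($lower, $needle)) {
                    $sensitive = true;
                    break;
                }
            }
        }
        if ($sensitive) {
            $out[$key] = '***';
        } elseif (is_array($value)) {
            $out[$key] = demo_scrub_array($value);
        } elseif (is_string($value)) {
            $out[$key] = demo_scrub_string($value);
        } else {
            $out[$key] = $value;
        }
    }
    return $out;
}

// Made-up sample data: 32 base32 characters after the kind prefix.
$token = 'irdb_svc_' . str_repeat('A', 32);
$result = demo_scrub_array([
    'DB_MYSQL_PASSWORD' => 'hunter2',
    'note' => "reporter used $token",
    'nested' => ['Set-Cookie' => 'sid=abc', 'attempt' => 3],
]);
var_export($result);
```

In this sketch the `DB_MYSQL_PASSWORD` and `Set-Cookie` values are replaced because of their key names, the token inside `note` becomes `irdb_svc_***` through the pattern layer, and the integer `attempt` passes through untouched.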