mirror of https://github.com/interlegis/sapl.git
GeoIP (docker/Dockerfile): Remove the at-build-time MaxMind download (it required BuildKit secrets and caused cache-miss issues). Replace it with a COPY from docker/geoip/GeoLite2-ASN.mmdb (a git-ignored binary). If the file is absent, the build still succeeds, with ASN blocking disabled. Add docker/geoip/update_geoip.sh — run it before each build to refresh the database from MaxMind, using MAXMIND_LICENSE_KEY from the environment or the .env file.

Redis inspection / synthetic test data: Add docker/scripts/redis_populate_test_data.py — injects synthetic rl:* entries into Redis DB1 to validate the key schema and blocking thresholds without waiting for real traffic. Supports DRY_RUN and CLEAR modes. Add §4.5 (Redis CLI quick-reference + RedisInsight guide) to rate-limiter-v2.md.

Auth-aware @ratelimit decorators (smart_rate / smart_key): All 51 @ratelimit decorators across 9 files used rate=RATE_LIMITER_RATE (35/m) regardless of authentication, silently over-throttling logged-in users compared to what RateLimitMiddleware allows (120/m). Add smart_key() and smart_rate() to sapl/middleware/ratelimit.py:
- smart_key: user pk for authenticated requests, masked IP for anonymous requests
- smart_rate: RATE_LIMITER_RATE_AUTHENTICATED (120/m) for authenticated, RATE_LIMITER_RATE (35/m) for anonymous — mirrors the middleware thresholds
Update all 51 decorators across crud/base.py + 8 view files. Remove the now-unused RATE_LIMITER_RATE imports from those files.

Cache KEY_PREFIX (settings.py): Change KEY_PREFIX from POD_NAMESPACE ("sapl") to f"cache:{POD_NAMESPACE}" so DB0 cache keys are unambiguously prefixed cache:{ns}:* — distinct from any future static or file cache key patterns. Update the key schema table and code examples in rate-limiter-v2.md to match.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Branch: rate-limiter-2026
16 changed files with 499 additions and 141 deletions
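The smart_key / smart_rate helpers described in the commit message can be sketched roughly as below. This is a hypothetical reconstruction, not the code from sapl/middleware/ratelimit.py: the rate values come from the commit message, but the exact IP-masking scheme (here, zeroing the last IPv4 octet) and key formats are assumptions.

```python
# Hedged sketch of the auth-aware key/rate callables (django-ratelimit style:
# both take (group, request)). Names and thresholds follow the commit message;
# the masking and key layout are illustrative assumptions.

RATE_LIMITER_RATE = "35/m"                 # anonymous threshold (from commit message)
RATE_LIMITER_RATE_AUTHENTICATED = "120/m"  # authenticated threshold

def _mask_ip(ip):
    # Assumed masking: zero the last IPv4 octet; non-IPv4 passes through.
    parts = ip.split(".")
    return ".".join(parts[:3] + ["0"]) if len(parts) == 4 else ip

def smart_key(group, request):
    # Authenticated requests are keyed by user pk; anonymous by masked IP.
    if request.user.is_authenticated:
        return f"user:{request.user.pk}"
    return f"ip:{_mask_ip(request.META.get('REMOTE_ADDR', ''))}"

def smart_rate(group, request):
    # Logged-in users get the higher middleware ceiling.
    if request.user.is_authenticated:
        return RATE_LIMITER_RATE_AUTHENTICATED
    return RATE_LIMITER_RATE
```

A decorator would then use both callables, e.g. `@ratelimit(key=smart_key, rate=smart_rate, ...)`, so the per-view limits track the middleware thresholds instead of a fixed anonymous rate.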
@ -0,0 +1,3 @@
# GeoLite2 binary databases are git-ignored (large binaries, updated frequently).
# Run update_geoip.sh before each docker build to refresh.
*.mmdb
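The Dockerfile change itself is not shown in this diff, so the following is only a sketch of one common pattern for a COPY that tolerates a missing file: a single-file COPY fails the build when the file is absent, but copying the directory succeeds because the tracked update_geoip.sh guarantees it is never empty.

```dockerfile
# Sketch only — the actual Dockerfile is not part of this diff.
# Copy the whole geoip directory; the .mmdb may or may not be present.
COPY docker/geoip/ /opt/geoip/
# The app then enables ASN blocking only if /opt/geoip/GeoLite2-ASN.mmdb exists.
```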
@ -0,0 +1,73 @@
#!/usr/bin/env bash
# update_geoip.sh — download / refresh GeoLite2-ASN.mmdb
#
# Run this script before building a new Docker image so the image bundles
# an up-to-date MaxMind ASN database. The .mmdb binary is git-ignored;
# only this script is tracked.
#
# Usage:
#   # Option 1 — key in environment
#   MAXMIND_LICENSE_KEY=your_key bash docker/geoip/update_geoip.sh
#
#   # Option 2 — key in project .env file
#   bash docker/geoip/update_geoip.sh
#
# The script writes GeoLite2-ASN.mmdb to the same directory as itself so
# the Dockerfile COPY step can find it at docker/geoip/GeoLite2-ASN.mmdb.
#
# Suggested automation: run via a host cron job or CI pipeline step
# before triggering a docker build, e.g.:
#
#   # /etc/cron.weekly/update-sapl-geoip
#   #!/bin/bash
#   cd /path/to/sapl && bash docker/geoip/update_geoip.sh

set -Eeuo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
OUT_FILE="$SCRIPT_DIR/GeoLite2-ASN.mmdb"

# ── Resolve the license key ────────────────────────────────────────────────
if [[ -z "${MAXMIND_LICENSE_KEY:-}" ]]; then
    # Try the project .env (two directories up from docker/geoip/)
    ENV_FILE="$(dirname "$(dirname "$SCRIPT_DIR")")/.env"
    if [[ -f "$ENV_FILE" ]]; then
        MAXMIND_LICENSE_KEY="$(grep -E '^MAXMIND_LICENSE_KEY=' "$ENV_FILE" 2>/dev/null \
            | cut -d= -f2- | tr -d '[:space:]' || true)"
    fi
fi

if [[ -z "${MAXMIND_LICENSE_KEY:-}" ]]; then
    echo "ERROR: MAXMIND_LICENSE_KEY is not set." >&2
    echo "       Set it in the environment or add MAXMIND_LICENSE_KEY=<key> to .env" >&2
    exit 1
fi

# ── Download ───────────────────────────────────────────────────────────────
URL="https://download.maxmind.com/app/geoip_download?edition_id=GeoLite2-ASN&license_key=${MAXMIND_LICENSE_KEY}&suffix=tar.gz"

echo "[geoip] Downloading GeoLite2-ASN from MaxMind..."
tmpdir="$(mktemp -d)"
trap 'rm -rf "$tmpdir"' EXIT

curl -fsSL --max-time 60 "$URL" | tar -xz --strip-components=1 -C "$tmpdir"
mv "$tmpdir"/GeoLite2-ASN.mmdb "$OUT_FILE"

echo "[geoip] Saved: $OUT_FILE"
echo "[geoip] Build date: $(python3 -c "
import struct, datetime, pathlib
data = pathlib.Path('$OUT_FILE').read_bytes()
# MaxMind DB metadata starts at the last occurrence of this marker
marker = b'\xab\xcd\xefMaxMind.com'
idx = data.rfind(marker)
if idx >= 0:
    # search for the 'build_epoch' key nearby
    chunk = data[idx:idx+512]
    pos = chunk.find(b'build_epoch')
    if pos >= 0:
        val_start = pos + len(b'build_epoch') + 1
        epoch = struct.unpack('>Q', chunk[val_start+1:val_start+9])[0]
        print(datetime.datetime.utcfromtimestamp(epoch).strftime('%Y-%m-%d'))
        exit()
print('unknown')
" 2>/dev/null || echo "unknown")"
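The script's key-resolution fallback (environment variable first, then a `MAXMIND_LICENSE_KEY=` line in the project .env) can be mirrored in Python, which is handy if other tooling needs the same key. The function name `resolve_license_key` is hypothetical; the lookup order matches the bash logic above.

```python
import os

def resolve_license_key(env_file=".env"):
    # Prefer the environment variable, exactly like the bash script.
    key = os.environ.get("MAXMIND_LICENSE_KEY", "")
    if key:
        return key
    # Fall back to a MAXMIND_LICENSE_KEY=<value> line in the .env file.
    try:
        with open(env_file) as fh:
            for line in fh:
                if line.startswith("MAXMIND_LICENSE_KEY="):
                    return line.split("=", 1)[1].strip()
    except FileNotFoundError:
        pass
    return ""
```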
@ -0,0 +1,176 @@
#!/usr/bin/env python3
"""
redis_populate_test_data.py — inject synthetic rate-limiter entries into Redis.

Purpose: validate that RateLimitMiddleware reads the expected key schema,
that Redis CLI / RedisInsight shows the right structure, and that blocking
logic fires correctly without waiting for real traffic.

Usage:
    # Against docker-compose Redis (default)
    python3 docker/scripts/redis_populate_test_data.py

    # Against a different host/port
    REDIS_URL=redis://localhost:6379 python3 docker/scripts/redis_populate_test_data.py

    # Show what would be written without actually writing
    DRY_RUN=1 python3 docker/scripts/redis_populate_test_data.py

    # Clear all synthetic keys written by a previous run
    CLEAR=1 python3 docker/scripts/redis_populate_test_data.py

Key schema (DB 1 — rate limiter):
    rl:ip:{ip}:reqs             INCR counter — anonymous request count (TTL 60s)
    rl:ip:{ip}:blocked          string "1"   — IP hard-blocked (TTL 300s)
    rl:{ns}:user:{uid}:reqs     INCR counter — auth user request count (TTL 60s)
    rl:{ns}:user:{uid}:blocked  string "1"   — user hard-blocked (TTL 300s)
    rl:{ns}:ip:{ip}:w:{bucket}  INCR counter — namespace/IP sliding window (TTL 120s)
"""

import os
import sys
import time

# ── dependency check ──────────────────────────────────────────────────────
try:
    import redis
except ImportError:
    print("ERROR: redis-py not installed. Run: pip install redis", file=sys.stderr)
    sys.exit(1)

# ── config ────────────────────────────────────────────────────────────────
REDIS_URL = os.environ.get("REDIS_URL", "redis://localhost:6379")
RATELIMIT_DB = 1  # DB1 is the rate-limiter database
DRY_RUN = os.environ.get("DRY_RUN", "0").lower() in ("1", "true", "yes")
CLEAR = os.environ.get("CLEAR", "0").lower() in ("1", "true", "yes")

# Synthetic values — tweak to exercise different code paths
NAMESPACE = "sapl"  # POD_NAMESPACE value (hostname or k8s namespace)
ANON_WINDOW = 60    # seconds — must match settings.RATE_LIMITER_RATE period
AUTH_WINDOW = 60
BLOCK_TTL = 300

TEST_IPS = [
    "203.0.113.1",  # below threshold (20 reqs)
    "203.0.113.2",  # AT threshold (35 reqs — should trigger block)
    "203.0.113.3",  # already blocked
    "203.0.113.4",  # namespace/window counter near threshold
]

TEST_USERS = [
    {"uid": "42", "reqs": 50, "blocked": False},   # normal auth user
    {"uid": "99", "reqs": 120, "blocked": False},  # AT auth threshold
    {"uid": "7", "reqs": 10, "blocked": True},     # pre-blocked user
]

# ── helpers ───────────────────────────────────────────────────────────────

def key_ip_reqs(ip):
    return f"rl:ip:{ip}:reqs"

def key_ip_blocked(ip):
    return f"rl:ip:{ip}:blocked"

def key_user_reqs(ns, uid):
    return f"rl:{ns}:user:{uid}:reqs"

def key_user_blocked(ns, uid):
    return f"rl:{ns}:user:{uid}:blocked"

def key_ns_window(ns, ip, bucket):
    return f"rl:{ns}:ip:{ip}:w:{bucket}"

def write(r, key, value, ttl, label):
    if DRY_RUN:
        print(f"  [dry-run] SET {key!r} = {value!r} EX {ttl} ({label})")
        return
    if isinstance(value, int):
        pipe = r.pipeline()
        pipe.set(key, value, ex=ttl)
        pipe.execute()
    else:
        r.set(key, value, ex=ttl)
    print(f"  SET {key!r} = {value!r} EX {ttl}s ({label})")

def delete_pattern(r, pattern):
    keys = r.keys(pattern)
    if keys:
        r.delete(*keys)
        print(f"  DEL {len(keys)} keys matching {pattern!r}")
    else:
        print(f"  (no keys matching {pattern!r})")

# ── main ──────────────────────────────────────────────────────────────────

def main():
    r = redis.from_url(REDIS_URL, db=RATELIMIT_DB, decode_responses=True)
    try:
        r.ping()
    except redis.ConnectionError as exc:
        print(f"ERROR: cannot connect to Redis at {REDIS_URL}: {exc}", file=sys.stderr)
        sys.exit(1)

    print(f"Redis: {REDIS_URL} DB={RATELIMIT_DB} dry_run={DRY_RUN} clear={CLEAR}")
    print()

    # ── clear mode ────────────────────────────────────────────────────
    if CLEAR:
        print("=== Clearing synthetic test keys ===")
        for ip in TEST_IPS:
            delete_pattern(r, f"rl:ip:{ip}:*")
            delete_pattern(r, f"rl:{NAMESPACE}:ip:{ip}:*")
        for u in TEST_USERS:
            delete_pattern(r, f"rl:{NAMESPACE}:user:{u['uid']}:*")
        print("Done.")
        return

    # ── anonymous IP counters ─────────────────────────────────────────
    print("=== Anonymous IP request counters (DB1) ===")
    write(r, key_ip_reqs(TEST_IPS[0]), 20, ANON_WINDOW, "below threshold")
    write(r, key_ip_reqs(TEST_IPS[1]), 35, ANON_WINDOW,
          "AT threshold → middleware will block on next req")
    write(r, key_ip_reqs(TEST_IPS[3]), 30, ANON_WINDOW, "below threshold")
    print()

    # ── blocked IPs ───────────────────────────────────────────────────
    print("=== Blocked IPs (DB1) ===")
    write(r, key_ip_blocked(TEST_IPS[2]), "1", BLOCK_TTL, "hard-blocked")
    print()

    # ── namespace/IP sliding window ───────────────────────────────────
    print("=== Namespace/IP sliding window (DB1) ===")
    bucket = int(time.time() // ANON_WINDOW)
    write(r, key_ns_window(NAMESPACE, TEST_IPS[3], bucket), 34, ANON_WINDOW * 2,
          "near window threshold (next req triggers ua_rotation block)")
    print()

    # ── authenticated user counters ───────────────────────────────────
    print("=== Authenticated user request counters (DB1) ===")
    for u in TEST_USERS:
        if not u["blocked"]:
            write(r, key_user_reqs(NAMESPACE, u["uid"]), u["reqs"], AUTH_WINDOW,
                  f"uid={u['uid']} reqs={u['reqs']}")
    print()

    # ── blocked users ─────────────────────────────────────────────────
    print("=== Blocked users (DB1) ===")
    for u in TEST_USERS:
        if u["blocked"]:
            write(r, key_user_blocked(NAMESPACE, u["uid"]), "1", BLOCK_TTL,
                  f"uid={u['uid']} hard-blocked")
    print()

    # ── summary ───────────────────────────────────────────────────────
    if not DRY_RUN:
        all_keys = r.keys("rl:*")
        print(f"=== DB{RATELIMIT_DB} now contains {len(all_keys)} rl:* keys ===")
        for k in sorted(all_keys):
            ttl = r.ttl(k)
            val = r.get(k)
            print(f"  {k!r:55s} val={val!r:5} ttl={ttl}s")

if __name__ == "__main__":
    main()
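The seeded entries can be sanity-checked without a live Redis by running a middleware-style decision against any mapping of key to value (a dict here; a real deployment would read Redis DB1). The function name `is_ip_limited` and the decision logic are illustrative assumptions; only the key schema and the 35-request anonymous threshold come from the script above.

```python
# Hedged sketch: decide whether an anonymous IP should be limited, given
# the rl:ip:{ip}:* keys the populate script writes. "store" is any mapping
# with .get() (dict, or a decode_responses Redis client).

ANON_LIMIT = 35   # matches RATE_LIMITER_RATE (35/m) used by the script
BLOCK_FLAG = "1"

def is_ip_limited(store, ip):
    # A hard-block flag (still within its TTL) wins outright.
    if store.get(f"rl:ip:{ip}:blocked") == BLOCK_FLAG:
        return True
    # Otherwise compare the windowed request counter to the threshold.
    reqs = int(store.get(f"rl:ip:{ip}:reqs") or 0)
    return reqs >= ANON_LIMIT
```

Against the script's TEST_IPS this flags 203.0.113.2 (at threshold) and 203.0.113.3 (pre-blocked) while letting 203.0.113.1 through, which is exactly the behavior the synthetic data is meant to exercise.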