mirror of https://github.com/interlegis/sapl.git
GeoIP (docker/Dockerfile): Remove the at-build-time MaxMind download (it required BuildKit secrets and caused cache-miss issues). Replace it with a COPY from docker/geoip/GeoLite2-ASN.mmdb (a git-ignored binary). If the file is absent, the build still succeeds, with ASN blocking disabled. Add docker/geoip/update_geoip.sh — run it before each build to refresh the database from MaxMind, using MAXMIND_LICENSE_KEY from the environment or the .env file.

Redis inspection / synthetic test data: Add docker/scripts/redis_populate_test_data.py — injects synthetic rl:* entries into Redis DB1 to validate the key schema and blocking thresholds without waiting for real traffic. Supports DRY_RUN and CLEAR modes. Add §4.5 (Redis CLI quick reference + RedisInsight guide) to rate-limiter-v2.md.

Auth-aware @ratelimit decorators (smart_rate / smart_key): All 51 @ratelimit decorators across 9 files used rate=RATE_LIMITER_RATE (35/m) regardless of authentication, silently over-throttling logged-in users compared to what RateLimitMiddleware allows (120/m). Add smart_key() and smart_rate() to sapl/middleware/ratelimit.py:
- smart_key: user pk for authenticated requests, masked IP for anonymous ones
- smart_rate: RATE_LIMITER_RATE_AUTHENTICATED (120/m) for authenticated requests, RATE_LIMITER_RATE (35/m) for anonymous — mirrors the middleware thresholds

Update all 51 decorators across crud/base.py + 8 view files. Remove the now-unused RATE_LIMITER_RATE imports from those files.

Cache KEY_PREFIX (settings.py): Change KEY_PREFIX from POD_NAMESPACE ("sapl") to f"cache:{POD_NAMESPACE}" so DB0 cache keys are unambiguously prefixed cache:{ns}:* — distinct from any future static or file cache key patterns. Update the key schema table and code examples in rate-limiter-v2.md to match.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

rate-limiter-2026
16 changed files with 499 additions and 141 deletions
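The commit message describes smart_key/smart_rate only in outline. A minimal sketch of what such django-ratelimit callables could look like — the rate strings come from the commit message, but the function bodies and the IP-masking scheme below are assumptions, not the actual code in sapl/middleware/ratelimit.py:

```python
# Hypothetical sketch of smart_key / smart_rate for django-ratelimit.
# Rates mirror the commit message (35/m anon, 120/m auth); masking is assumed.
RATE_LIMITER_RATE = "35/m"                 # anonymous threshold
RATE_LIMITER_RATE_AUTHENTICATED = "120/m"  # authenticated threshold


def smart_key(group, request) -> str:
    """Key callable: user pk when logged in, masked IP otherwise."""
    if request.user.is_authenticated:
        return f"user:{request.user.pk}"
    ip = request.META.get("REMOTE_ADDR", "0.0.0.0")
    # Mask the last octet so raw client IPs never land in Redis (assumed scheme)
    return "ip:" + ".".join(ip.split(".")[:3] + ["0"])


def smart_rate(group, request) -> str:
    """Rate callable: higher ceiling for authenticated users."""
    if request.user.is_authenticated:
        return RATE_LIMITER_RATE_AUTHENTICATED
    return RATE_LIMITER_RATE
```

django-ratelimit accepts callables for both `key=` and `rate=`, so a decorator like `@ratelimit(key=smart_key, rate=smart_rate)` picks the threshold per request instead of hard-coding 35/m.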
@@ -0,0 +1,3 @@
# GeoLite2 binary databases are git-ignored (large binaries, updated frequently).
# Run update_geoip.sh before each docker build to refresh.
*.mmdb
@@ -0,0 +1,73 @@
#!/usr/bin/env bash
# update_geoip.sh — download / refresh GeoLite2-ASN.mmdb
#
# Run this script before building a new Docker image so the image bundles
# an up-to-date MaxMind ASN database. The .mmdb binary is git-ignored;
# only this script is tracked.
#
# Usage:
#   # Option 1 — key in environment
#   MAXMIND_LICENSE_KEY=your_key bash docker/geoip/update_geoip.sh
#
#   # Option 2 — key in project .env file
#   bash docker/geoip/update_geoip.sh
#
# The script writes GeoLite2-ASN.mmdb to the same directory as itself so
# the Dockerfile COPY step can find it at docker/geoip/GeoLite2-ASN.mmdb.
#
# Suggested automation: run via a host cron job or CI pipeline step
# before triggering a docker build, e.g.:
#
#   # /etc/cron.weekly/update-sapl-geoip
#   #!/bin/bash
#   cd /path/to/sapl && bash docker/geoip/update_geoip.sh

set -Eeuo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
OUT_FILE="$SCRIPT_DIR/GeoLite2-ASN.mmdb"

# ── Resolve the license key ────────────────────────────────────────────────
if [[ -z "${MAXMIND_LICENSE_KEY:-}" ]]; then
    # Try the project .env (two directories up from docker/geoip/)
    ENV_FILE="$(dirname "$(dirname "$SCRIPT_DIR")")/.env"
    if [[ -f "$ENV_FILE" ]]; then
        MAXMIND_LICENSE_KEY="$(grep -E '^MAXMIND_LICENSE_KEY=' "$ENV_FILE" 2>/dev/null \
            | cut -d= -f2- | tr -d '[:space:]' || true)"
    fi
fi

if [[ -z "${MAXMIND_LICENSE_KEY:-}" ]]; then
    echo "ERROR: MAXMIND_LICENSE_KEY is not set." >&2
    echo "       Set it in the environment or add MAXMIND_LICENSE_KEY=<key> to .env" >&2
    exit 1
fi

# ── Download ───────────────────────────────────────────────────────────────
URL="https://download.maxmind.com/app/geoip_download?edition_id=GeoLite2-ASN&license_key=${MAXMIND_LICENSE_KEY}&suffix=tar.gz"

echo "[geoip] Downloading GeoLite2-ASN from MaxMind..."
tmpdir="$(mktemp -d)"
trap 'rm -rf "$tmpdir"' EXIT

curl -fsSL --max-time 60 "$URL" | tar -xz --strip-components=1 -C "$tmpdir"
mv "$tmpdir"/GeoLite2-ASN.mmdb "$OUT_FILE"

echo "[geoip] Saved: $OUT_FILE"
echo "[geoip] Build date: $(python3 -c "
import datetime, pathlib
data = pathlib.Path('$OUT_FILE').read_bytes()
# The metadata section starts after this marker (MaxMind DB file format)
marker = b'\xab\xcd\xefMaxMind.com'
idx = data.rfind(marker)
if idx >= 0:
    # search for the 'build_epoch' key in the metadata map
    chunk = data[idx:idx + 512]
    pos = chunk.find(b'build_epoch')
    if pos >= 0:
        # After the 11-byte key: a control byte whose low 5 bits give the
        # payload size, an extended-type byte, then a big-endian integer.
        size = chunk[pos + 11] & 0x1f
        epoch = int.from_bytes(chunk[pos + 13:pos + 13 + size], 'big')
        print(datetime.datetime.fromtimestamp(epoch, datetime.timezone.utc).strftime('%Y-%m-%d'))
        raise SystemExit
print('unknown')
" 2>/dev/null || echo "unknown")"
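The inline build-date probe embedded in the script above can be written as a standalone function, which makes the byte layout easier to test. The offsets assume the MaxMind DB file format: a metadata marker, then a map whose `build_epoch` value is a variable-size big-endian integer preceded by a control byte and an extended-type byte. This is a best-effort sketch, not MaxMind's reference parser:

```python
import datetime

# Marker that precedes the metadata section in every .mmdb file
MARKER = b"\xab\xcd\xefMaxMind.com"


def mmdb_build_date(data: bytes) -> str:
    """Best-effort build date ('YYYY-MM-DD') of an .mmdb blob, or 'unknown'."""
    idx = data.rfind(MARKER)
    if idx < 0:
        return "unknown"
    chunk = data[idx:idx + 512]
    pos = chunk.find(b"build_epoch")
    if pos < 0:
        return "unknown"
    # control byte after the 11-byte key: low 5 bits = payload size;
    # the next byte is the extended-type tag; then the big-endian payload
    size = chunk[pos + 11] & 0x1F
    epoch = int.from_bytes(chunk[pos + 13:pos + 13 + size], "big")
    return datetime.datetime.fromtimestamp(
        epoch, datetime.timezone.utc).strftime("%Y-%m-%d")
```

Handling the size byte matters here: MaxMind encodes `build_epoch` in as few bytes as the value needs (typically 4), so reading a fixed 8-byte field would pull in unrelated trailing bytes.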
@@ -0,0 +1,176 @@
#!/usr/bin/env python3
"""
redis_populate_test_data.py — inject synthetic rate-limiter entries into Redis.

Purpose: validate that RateLimitMiddleware reads the expected key schema,
that Redis CLI / RedisInsight shows the right structure, and that blocking
logic fires correctly without waiting for real traffic.

Usage:
    # Against docker-compose Redis (default)
    python3 docker/scripts/redis_populate_test_data.py

    # Against a different host/port
    REDIS_URL=redis://localhost:6379 python3 docker/scripts/redis_populate_test_data.py

    # Show what would be written without actually writing
    DRY_RUN=1 python3 docker/scripts/redis_populate_test_data.py

    # Clear all synthetic keys written by a previous run
    CLEAR=1 python3 docker/scripts/redis_populate_test_data.py

Key schema (DB 1 — rate limiter):
    rl:ip:{ip}:reqs             INCR counter — anonymous request count (TTL 60s)
    rl:ip:{ip}:blocked          string "1"   — IP hard-blocked (TTL 300s)
    rl:{ns}:user:{uid}:reqs    INCR counter — auth user request count (TTL 60s)
    rl:{ns}:user:{uid}:blocked string "1"   — user hard-blocked (TTL 300s)
    rl:{ns}:ip:{ip}:w:{bucket} INCR         — namespace/IP sliding window (TTL 120s)
"""
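To make the schema above concrete, here is a rough sketch of the decision a middleware could take when reading these keys. The helper names are hypothetical and the thresholds (35/m anonymous, 120/m authenticated) come from the commit message, not from the actual RateLimitMiddleware source:

```python
# Hypothetical decision logic over the rl:* key schema.
# Thresholds are assumptions taken from the commit message.
ANON_LIMIT = 35    # requests/minute for anonymous IPs
AUTH_LIMIT = 120   # requests/minute for authenticated users


def anon_decision(store: dict, ip: str) -> str:
    """Return 'blocked', 'limit', or 'ok' for an anonymous IP."""
    if store.get(f"rl:ip:{ip}:blocked") == "1":
        return "blocked"  # hard block already in place
    count = int(store.get(f"rl:ip:{ip}:reqs", 0))
    return "limit" if count >= ANON_LIMIT else "ok"


def auth_decision(store: dict, ns: str, uid: str) -> str:
    """Same decision for an authenticated user's keys."""
    if store.get(f"rl:{ns}:user:{uid}:blocked") == "1":
        return "blocked"
    count = int(store.get(f"rl:{ns}:user:{uid}:reqs", 0))
    return "limit" if count >= AUTH_LIMIT else "ok"
```

With the synthetic data this script writes, `203.0.113.2` (35 reqs) sits exactly at the anonymous limit while user 42 (50 reqs) stays comfortably under the authenticated one — which is the over-throttling gap the smart_rate change closes.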
|
|||
import os |
|||
import sys |
|||
import time |
|||
|
|||
# ── dependency check ────────────────────────────────────────────────────── |
|||
try: |
|||
import redis |
|||
except ImportError: |
|||
print("ERROR: redis-py not installed. Run: pip install redis", file=sys.stderr) |
|||
sys.exit(1) |
|||
|
|||
# ── config ──────────────────────────────────────────────────────────────── |
|||
REDIS_URL = os.environ.get("REDIS_URL", "redis://localhost:6379") |
|||
RATELIMIT_DB = 1 # DB1 is the rate-limiter database |
|||
DRY_RUN = os.environ.get("DRY_RUN", "0").lower() in ("1", "true", "yes") |
|||
CLEAR = os.environ.get("CLEAR", "0").lower() in ("1", "true", "yes") |
|||
|
|||
# Synthetic values — tweak to exercise different code paths |
|||
NAMESPACE = "sapl" # POD_NAMESPACE value (hostname or k8s namespace) |
|||
ANON_WINDOW = 60 # seconds — must match settings.RATE_LIMITER_RATE period |
|||
AUTH_WINDOW = 60 |
|||
BLOCK_TTL = 300 |
|||
|
|||
TEST_IPS = [ |
|||
"203.0.113.1", # below threshold (20 reqs) |
|||
"203.0.113.2", # AT threshold (35 reqs — should trigger block) |
|||
"203.0.113.3", # already blocked |
|||
"203.0.113.4", # namespace/window counter near threshold |
|||
] |
|||
|
|||
TEST_USERS = [ |
|||
{"uid": "42", "reqs": 50, "blocked": False}, # normal auth user |
|||
{"uid": "99", "reqs": 120, "blocked": False}, # AT auth threshold |
|||
{"uid": "7", "reqs": 10, "blocked": True}, # pre-blocked user |
|||
] |
|||
|
|||
# ── helpers ───────────────────────────────────────────────────────────────

def key_ip_reqs(ip):
    return f"rl:ip:{ip}:reqs"


def key_ip_blocked(ip):
    return f"rl:ip:{ip}:blocked"


def key_user_reqs(ns, uid):
    return f"rl:{ns}:user:{uid}:reqs"


def key_user_blocked(ns, uid):
    return f"rl:{ns}:user:{uid}:blocked"


def key_ns_window(ns, ip, bucket):
    return f"rl:{ns}:ip:{ip}:w:{bucket}"


def write(r, key, value, ttl, label):
    if DRY_RUN:
        print(f" [dry-run] SET {key!r} = {value!r} EX {ttl} ({label})")
        return
    # Redis stores ints and strings alike; a single SET with expiry suffices
    # (the original one-command pipeline added nothing).
    r.set(key, value, ex=ttl)
    print(f" SET {key!r} = {value!r} EX {ttl}s ({label})")


def delete_pattern(r, pattern):
    keys = r.keys(pattern)
    if keys:
        r.delete(*keys)
        print(f" DEL {len(keys)} keys matching {pattern!r}")
    else:
        print(f" (no keys matching {pattern!r})")

# ── main ──────────────────────────────────────────────────────────────────

def main():
    r = redis.from_url(REDIS_URL, db=RATELIMIT_DB, decode_responses=True)
    try:
        r.ping()
    except redis.ConnectionError as exc:
        print(f"ERROR: cannot connect to Redis at {REDIS_URL}: {exc}", file=sys.stderr)
        sys.exit(1)

    print(f"Redis: {REDIS_URL} DB={RATELIMIT_DB} dry_run={DRY_RUN} clear={CLEAR}")
    print()

    # ── clear mode ────────────────────────────────────────────────────────
    if CLEAR:
        print("=== Clearing synthetic test keys ===")
        for ip in TEST_IPS:
            delete_pattern(r, f"rl:ip:{ip}:*")
            delete_pattern(r, f"rl:{NAMESPACE}:ip:{ip}:*")
        for u in TEST_USERS:
            delete_pattern(r, f"rl:{NAMESPACE}:user:{u['uid']}:*")
        print("Done.")
        return

    # ── anonymous IP counters ─────────────────────────────────────────────
    print("=== Anonymous IP request counters (DB1) ===")
    write(r, key_ip_reqs(TEST_IPS[0]), 20, ANON_WINDOW, "below threshold")
    write(r, key_ip_reqs(TEST_IPS[1]), 35, ANON_WINDOW, "AT threshold → middleware will block on next req")
    write(r, key_ip_reqs(TEST_IPS[3]), 30, ANON_WINDOW, "below threshold")
    print()

    # ── blocked IPs ───────────────────────────────────────────────────────
    print("=== Blocked IPs (DB1) ===")
    write(r, key_ip_blocked(TEST_IPS[2]), "1", BLOCK_TTL, "hard-blocked")
    print()

    # ── namespace/IP sliding window ───────────────────────────────────────
    print("=== Namespace/IP sliding window (DB1) ===")
    bucket = int(time.time() // ANON_WINDOW)
    write(r, key_ns_window(NAMESPACE, TEST_IPS[3], bucket), 34, ANON_WINDOW * 2,
          "near window threshold (next req triggers ua_rotation block)")
    print()

    # ── authenticated user counters ───────────────────────────────────────
    print("=== Authenticated user request counters (DB1) ===")
    for u in TEST_USERS:
        if not u["blocked"]:
            write(r, key_user_reqs(NAMESPACE, u["uid"]), u["reqs"], AUTH_WINDOW,
                  f"uid={u['uid']} reqs={u['reqs']}")
    print()

    # ── blocked users ─────────────────────────────────────────────────────
    print("=== Blocked users (DB1) ===")
    for u in TEST_USERS:
        if u["blocked"]:
            write(r, key_user_blocked(NAMESPACE, u["uid"]), "1", BLOCK_TTL,
                  f"uid={u['uid']} hard-blocked")
    print()

    # ── summary ───────────────────────────────────────────────────────────
    if not DRY_RUN:
        all_keys = r.keys("rl:*")
        print(f"=== DB{RATELIMIT_DB} now contains {len(all_keys)} rl:* keys ===")
        for k in sorted(all_keys):
            ttl = r.ttl(k)
            val = r.get(k)
            print(f" {k!r:55s} val={val!r:5} ttl={ttl}s")


if __name__ == "__main__":
    main()
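The sliding-window bucket that main() computes as `int(time.time() // ANON_WINDOW)` is worth seeing in isolation: all timestamps inside one 60-second window map to the same integer, so the `rl:{ns}:ip:{ip}:w:{bucket}` counter keys naturally roll over each window. A small illustration:

```python
ANON_WINDOW = 60  # seconds, matching the script's config


def window_bucket(ts: float, window: int = ANON_WINDOW) -> int:
    """Integer bucket id for a timestamp: increments once per window."""
    return int(ts // window)


# Requests at t=120s and t=179s share bucket 2; t=180s opens bucket 3,
# so a fresh counter key starts while the old one simply expires via TTL.
```

Because each bucket gets its own key with a TTL of two windows, the middleware can read the current and previous bucket to approximate a sliding window without ever deleting keys explicitly.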