
Phases 1-4: Redis infra, rate-limiter middleware, cache layer, nginx hardening

docker/Dockerfile:
- GeoIP offline build with MaxMind secret; optional build args for
  graphviz, poppler, psql client; envsubst for nginx burst vars

docker/docker-compose.yaml:
- saplredis service (redis:7-alpine, allkeys-lru, 512 MB)
- REDIS_URL + CACHE_BACKEND wired into sapl service

docker/startup_scripts/start.sh:
- configure_redis_cache(): builds CACHES dict, sets REDIS_CACHE waffle
  switch, falls back to file cache gracefully
- POD_NAMESPACE resolution (k8s Downward API → hostname fallback)
- DATABASE_URL exported before migrate
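The shape of the CACHES dict that configure_redis_cache() produces can be sketched as follows. This is an illustration only, assuming the DB 0 (cache) / DB 1 (rate limiter) split documented elsewhere in this commit; `build_caches`, the `ratelimit` alias, and the fallback path are hypothetical names, not the actual start.sh output:

```python
def build_caches(redis_url, backend):
    """Return a Django CACHES setting: Redis when configured, file cache otherwise."""
    if backend == "redis" and redis_url:
        return {
            "default": {
                "BACKEND": "django.core.cache.backends.redis.RedisCache",
                "LOCATION": f"{redis_url}/0",  # DB 0: page/view/static cache
            },
            "ratelimit": {
                "BACKEND": "django.core.cache.backends.redis.RedisCache",
                "LOCATION": f"{redis_url}/1",  # DB 1: rate-limiter counters
            },
        }
    # Graceful fallback when REDIS_URL is absent or the waffle switch is off.
    return {
        "default": {
            "BACKEND": "django.core.cache.backends.filebased.FileBasedCache",
            "LOCATION": "/tmp/sapl_cache",  # illustrative path
        }
    }
```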

docker/k8s/redis/ (moved from docker/k8s/):
- redis-configmap.yaml, redis-deployment.yaml, redis-service.yaml
- ClusterIP service on port 6379, sapl-redis namespace

docker/k8s/sapl-k8s.yaml:
- REDIS_URL env var injected; app.kubernetes.io/name=sapl label for
  fleet-wide discovery

sapl/middleware/test_ratelimiter.py:
- Unit tests for RateLimitMiddleware with mocked Redis

scripts/test_ratelimiter.py:
- CLI smoke-test: fires N requests and reports first 429
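The core loop of such a smoke test can be sketched with the HTTP call injected, so the reporting logic is testable offline; `first_429` and `fire` are illustrative names, and the real script's flags and output may differ:

```python
def first_429(statuses):
    """Return the 1-based request number that first received HTTP 429, or None."""
    for i, status in enumerate(statuses, start=1):
        if status == 429:
            return i
    return None

def fire(url, n, get):
    """Fire n GET requests against url; get(url) must return a status code."""
    return first_429(get(url) for _ in range(n))
```

With the `requests` library, `get` could be `lambda u: requests.get(u).status_code`.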

Removed: rate-limiter-v2.md (content migrated to plan/RATE_LIMITER_PLAN.md),
         scripts/test_ratelimiter.sh (replaced by .py),
         docker/k8s/README.md (merged into plan/RATE_LIMITER_PLAN.md),
         docker/scripts/redis_populate_test_data.py (renamed to redis_inject_test_data.py)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
commit 027ce68253 (branch rate-limiter-2026)
Author: Edward Ribeiro, 3 weeks ago
 .gitignore                                 |    2
 docker/Dockerfile                          |    4
 docker/docker-compose.yaml                 |    1
 docker/k8s/README.md                       |  228
 docker/k8s/redis/redis-configmap.yaml      |    2
 docker/k8s/redis/redis-deployment.yaml     |    2
 docker/k8s/redis/redis-service.yaml        |    4
 docker/k8s/sapl-k8s.yaml                   |   22
 docker/scripts/redis_populate_test_data.py |  176
 docker/startup_scripts/start.sh            |   83
 rate-limiter-v2.md                         | 1240
 sapl/middleware/test_ratelimiter.py        |  385
 scripts/test_ratelimiter.py                |   83
 scripts/test_ratelimiter.sh                |   14
14 files changed

.gitignore (2 changed lines)

@@ -110,3 +110,5 @@ media/*
!media/.gitkeep
restauracoes/*
+.claude

docker/Dockerfile (4 changed lines)

@@ -51,7 +51,7 @@ ENV LANG=C.UTF-8 LC_ALL=C.UTF-8 PYTHONDONTWRITEBYTECODE=1 PYTHONUNBUFFERED=1 \
RUN set -eux; \
apt-get update; \
apt-get install -y --no-install-recommends \
-curl jq bash tzdata fontconfig tini libmagic1 \
+curl jq bash tzdata fontconfig tini libmagic1 gettext-base \
libcairo2 libpango-1.0-0 libpangocairo-1.0-0 libgdk-pixbuf-2.0-0 \
libharfbuzz0b libfreetype6 libjpeg62-turbo zlib1g fonts-dejavu-core; \
if [ "$WITH_GRAPHVIZ" = "1" ]; then apt-get install -y --no-install-recommends graphviz; fi; \
@@ -96,7 +96,7 @@ COPY . /var/interlegis/sapl/
# disabled and nginx will emit a startup warning.
RUN if [ "$WITH_NGINX" = "1" ]; then \
rm -f /etc/nginx/conf.d/*; \
-cp docker/config/nginx/sapl.conf /etc/nginx/conf.d/sapl.conf; \
+cp docker/config/nginx/sapl.conf /etc/nginx/conf.d/sapl.conf.template; \
cp docker/config/nginx/nginx.conf /etc/nginx/nginx.conf; \
if [ -f "docker/geoip/GeoLite2-ASN.mmdb" ]; then \
cp docker/geoip/GeoLite2-ASN.mmdb /etc/nginx/geoip/GeoLite2-ASN.mmdb; \

docker/docker-compose.yaml (1 changed line)

@@ -88,7 +88,6 @@ services:
TZ: America/Sao_Paulo
REDIS_URL: redis://saplredis:6379
CACHE_BACKEND: redis
-RATELIMIT_DRY_RUN: 'False'
volumes:
- sapl_data:/var/interlegis/sapl/data
- sapl_media:/var/interlegis/sapl/media

docker/k8s/README.md (228 changed lines)

@@ -1,228 +0,0 @@
# SAPL — Kubernetes Redis
Manifests for the shared Redis instance used by all SAPL pods for
cross-pod rate limiting (DB 1) and view/static-file caching (DB 0).
---
## Directory layout
```
docker/k8s/
├── redis-configmap.yaml # redis.conf — no persistence, allkeys-lru, 5 GB ceiling
├── redis-deployment.yaml # Deployment (1 replica, redis:7-alpine)
├── redis-service.yaml # ClusterIP service on port 6379
└── README.md # this file
```
---
## Prerequisites
- `kubectl` configured to talk to the target cluster.
- A `redis` namespace (created below if it doesn't exist).
---
## Deploy
```bash
# 1. Create the namespace (idempotent)
kubectl create namespace redis --dry-run=client -o yaml | kubectl apply -f -
# 2. Apply all three manifests
kubectl apply -f docker/k8s/redis-configmap.yaml
kubectl apply -f docker/k8s/redis-deployment.yaml
kubectl apply -f docker/k8s/redis-service.yaml
# 3. Verify the pod is Running
kubectl -n redis get pods -l app=sapl-redis
```
Expected output:
```
NAME READY STATUS RESTARTS AGE
sapl-redis-6d9f8b7c4d-xk2lm 1/1 Running 0 30s
```
---
## Wire a SAPL namespace to Redis
```bash
# Create the per-namespace Secret (one-off per tenant)
kubectl create secret generic sapl-redis \
--namespace=<NAMESPACE> \
--from-literal=REDIS_URL="redis://sapl-redis.redis.svc.cluster.local:6379" \
--dry-run=client -o yaml | kubectl apply -f -
# Ensure the waffle switch row exists (starts OFF)
kubectl exec -n <NAMESPACE> deploy/sapl -- \
python manage.py waffle_switch REDIS_CACHE off --create
# Enable Redis for this namespace
kubectl exec -n <NAMESPACE> deploy/sapl -- \
python manage.py waffle_switch REDIS_CACHE on
# Rolling restart so start.sh picks up the new switch value
kubectl rollout restart deployment/sapl -n <NAMESPACE>
kubectl rollout status deployment/sapl -n <NAMESPACE>
```
### Fleet-wide rollout
```bash
kubectl get namespaces -l app=sapl -o name | sed 's|namespace/||' | \
xargs -P 10 -I{} kubectl exec -n {} deploy/sapl -- \
python manage.py waffle_switch REDIS_CACHE on --create
kubectl get namespaces -l app=sapl -o name | sed 's|namespace/||' | \
xargs -P 5 -I{} kubectl rollout restart deployment/sapl -n {}
```
### Roll back (without removing the Secret)
```bash
kubectl exec -n <NAMESPACE> deploy/sapl -- \
python manage.py waffle_switch REDIS_CACHE off
kubectl rollout restart deployment/sapl -n <NAMESPACE>
```
---
## Monitor
### Pod and events
```bash
# Pod status
kubectl -n redis get pods -l app=sapl-redis -o wide
# Deployment events (useful right after apply)
kubectl -n redis describe deployment sapl-redis
# Pod events (OOMKill, restarts, etc.)
kubectl -n redis describe pod -l app=sapl-redis
```
### Logs
```bash
# Tail live logs
kubectl -n redis logs -f deploy/sapl-redis
# Last 100 lines
kubectl -n redis logs deploy/sapl-redis --tail=100
```
### Redis INFO
```bash
# Memory usage
kubectl exec -n redis deploy/sapl-redis -- \
redis-cli info memory \
| grep -E 'used_memory_human|maxmemory_human|mem_fragmentation_ratio'
# Connection pressure
kubectl exec -n redis deploy/sapl-redis -- \
redis-cli info stats \
| grep -E 'rejected_connections|instantaneous_ops_per_sec'
# Key distribution per DB
kubectl exec -n redis deploy/sapl-redis -- redis-cli info keyspace
# Recent slow queries
kubectl exec -n redis deploy/sapl-redis -- redis-cli slowlog get 10
# Live command sampling (1-second window)
kubectl exec -n redis deploy/sapl-redis -- redis-cli --latency-history -i 1
```
### Rate-limiter keys (DB 1)
```bash
kubectl exec -n redis deploy/sapl-redis -- \
redis-cli -n 1 dbsize
kubectl exec -n redis deploy/sapl-redis -- \
redis-cli -n 1 --scan --pattern 'rl:ip:*' | head -20
```
---
## Seed the UA deny list (once after first deploy)
```bash
kubectl exec -n redis deploy/sapl-redis -- redis-cli -n 1 \
SADD rl:bot:ua:blocked \
"$(echo -n 'GPTBot' | sha256sum | cut -d' ' -f1)" \
"$(echo -n 'ClaudeBot' | sha256sum | cut -d' ' -f1)" \
"$(echo -n 'PerplexityBot' | sha256sum | cut -d' ' -f1)" \
"$(echo -n 'Bytespider' | sha256sum | cut -d' ' -f1)" \
"$(echo -n 'AhrefsBot' | sha256sum | cut -d' ' -f1)" \
"$(echo -n 'meta-externalagent' | sha256sum | cut -d' ' -f1)"
# Add a new offender at runtime (no restart required)
kubectl exec -n redis deploy/sapl-redis -- redis-cli -n 1 \
SADD rl:bot:ua:blocked "$(echo -n 'NewBot/1.0' | sha256sum | cut -d' ' -f1)"
```
---
## Local standalone Redis (development / testing)
No Kubernetes? Run Redis directly with Docker:
```bash
sudo docker run --rm -p 6379:6379 redis:7-alpine \
redis-server --save "" --appendonly no
```
Then point Django at it by exporting the env var before starting the dev server:
```bash
export REDIS_URL="redis://localhost:6379"
export CACHE_BACKEND="redis"
python manage.py runserver
```
Or add them to your local `.env` file:
```
REDIS_URL=redis://localhost:6379
CACHE_BACKEND=redis
```
> **Note**: the waffle switch `REDIS_CACHE` must also be `on` in your local
> database for `start.sh` to activate the Redis backend. Run:
> ```bash
> python manage.py waffle_switch REDIS_CACHE on --create
> ```
---
## Update `redis.conf` without redeploying
```bash
# Edit the ConfigMap
kubectl -n redis edit configmap redis-config
# Restart the pod to pick up the new config
kubectl -n redis rollout restart deployment/sapl-redis
```
---
## Key schema reference
| DB | Use case | Key pattern | TTL |
|----|----------|-------------|-----|
| 0 | Page / view cache | `sapl:cache:*` | 60 – 3 600 s |
| 0 | Static file cache (logos) | `static:{ns}:{sha256}` | 3 – 24 h |
| 0 | PDF cache (≤ 360 KB) | `file:{ns}:{sha256}` | 1 h |
| 1 | IP rate-limit counter | `rl:ip:{ip}:reqs` | 60 s |
| 1 | IP blocked marker | `rl:ip:{ip}:blocked` | 300 s |
| 1 | User rate-limit counter | `rl:{ns}:user:{id}:reqs` | 60 s |
| 1 | Path counter | `rl:{ns}:path:{sha256}:reqs` | 60 s |
| 1 | UA deny list | `rl:bot:ua:blocked` | permanent SET |
| 2 | Django Channels (future) | `channels:*` | session TTL |

docker/k8s/redis-configmap.yaml → docker/k8s/redis/redis-configmap.yaml (2 changed lines)

@@ -2,7 +2,7 @@ apiVersion: v1
kind: ConfigMap
metadata:
name: redis-config
-namespace: redis
+namespace: sapl-redis
data:
redis.conf: |
save ""

docker/k8s/redis-deployment.yaml → docker/k8s/redis/redis-deployment.yaml (2 changed lines)

@@ -2,7 +2,7 @@ apiVersion: apps/v1
kind: Deployment
metadata:
name: sapl-redis
-namespace: redis
+namespace: sapl-redis
labels:
app: sapl-redis
spec:

docker/k8s/redis-service.yaml → docker/k8s/redis/redis-service.yaml (4 changed lines)

@@ -1,8 +1,8 @@
apiVersion: v1
kind: Service
metadata:
-name: sapl-redis
-namespace: redis
+name: redis
+namespace: sapl-redis
labels:
app: sapl-redis
spec:

docker/k8s/sapl-k8s.yaml (22 changed lines)

@@ -189,16 +189,9 @@ spec:
image: eribeiro/sapl:debug-k8s-1
ports:
- containerPort: 80
-volumeMounts:
-- name: data
-mountPath: /var/interlegis/sapl/data
-readOnly: true # secrets are always mounted read-only
-volumes:
-- name: data
-secret:
-secretName: sapl-secretkey
-defaultMode: 0440 # ensures read-only
env:
+- name: REDIS_URL
+value: "redis://redis.sapl-redis.svc.cluster.local:6379"
- name: ADMIN_PASSWORD
value: "interlegis"
- name: ADMIN_EMAIL
@@ -214,5 +207,12 @@ spec:
- name: EMAIL_HOST_USER
value: "usuariosmtp"
- name: EMAIL_SEND_USER
+volumeMounts:
+- name: data
+mountPath: /var/interlegis/sapl/data
+readOnly: true # secrets are always mounted read-only
+volumes:
+- name: data
+secret:
+secretName: sapl-secretkey
+defaultMode: 0440 # ensures read-only

docker/scripts/redis_populate_test_data.py (176 changed lines)

@@ -1,176 +0,0 @@
#!/usr/bin/env python3
"""
redis_populate_test_data.py: injects synthetic rate-limiter entries into Redis.
Purpose: validate that RateLimitMiddleware reads the expected key schema,
that Redis CLI / RedisInsight shows the right structure, and that blocking
logic fires correctly without waiting for real traffic.
Usage:
# Against docker-compose Redis (default)
python3 docker/scripts/redis_populate_test_data.py
# Against a different host/port
REDIS_URL=redis://localhost:6379 python3 docker/scripts/redis_populate_test_data.py
# Show what would be written without actually writing
DRY_RUN=1 python3 docker/scripts/redis_populate_test_data.py
# Clear all synthetic keys written by a previous run
CLEAR=1 python3 docker/scripts/redis_populate_test_data.py
Key schema (DB 1 rate limiter):
rl:ip:{ip}:reqs INCR counter anonymous request count (TTL 60s)
rl:ip:{ip}:blocked string "1" IP hard-blocked (TTL 300s)
rl:{ns}:user:{uid}:reqs INCR counter auth user request count (TTL 60s)
rl:{ns}:user:{uid}:blocked string "1" user hard-blocked (TTL 300s)
rl:{ns}:ip:{ip}:w:{bucket} INCR namespace/IP sliding window (TTL 120s)
"""
import os
import sys
import time
# ── dependency check ──────────────────────────────────────────────────────
try:
import redis
except ImportError:
print("ERROR: redis-py not installed. Run: pip install redis", file=sys.stderr)
sys.exit(1)
# ── config ────────────────────────────────────────────────────────────────
REDIS_URL = os.environ.get("REDIS_URL", "redis://localhost:6379")
RATELIMIT_DB = 1 # DB1 is the rate-limiter database
DRY_RUN = os.environ.get("DRY_RUN", "0").lower() in ("1", "true", "yes")
CLEAR = os.environ.get("CLEAR", "0").lower() in ("1", "true", "yes")
# Synthetic values — tweak to exercise different code paths
NAMESPACE = "sapl" # POD_NAMESPACE value (hostname or k8s namespace)
ANON_WINDOW = 60 # seconds — must match settings.RATE_LIMITER_RATE period
AUTH_WINDOW = 60
BLOCK_TTL = 300
TEST_IPS = [
"203.0.113.1", # below threshold (20 reqs)
"203.0.113.2", # AT threshold (35 reqs — should trigger block)
"203.0.113.3", # already blocked
"203.0.113.4", # namespace/window counter near threshold
]
TEST_USERS = [
{"uid": "42", "reqs": 50, "blocked": False}, # normal auth user
{"uid": "99", "reqs": 120, "blocked": False}, # AT auth threshold
{"uid": "7", "reqs": 10, "blocked": True}, # pre-blocked user
]
# ── helpers ───────────────────────────────────────────────────────────────
def key_ip_reqs(ip):
return f"rl:ip:{ip}:reqs"
def key_ip_blocked(ip):
return f"rl:ip:{ip}:blocked"
def key_user_reqs(ns, uid):
return f"rl:{ns}:user:{uid}:reqs"
def key_user_blocked(ns, uid):
return f"rl:{ns}:user:{uid}:blocked"
def key_ns_window(ns, ip, bucket):
return f"rl:{ns}:ip:{ip}:w:{bucket}"
def write(r, key, value, ttl, label):
if DRY_RUN:
print(f" [dry-run] SET {key!r} = {value!r} EX {ttl} ({label})")
return
if isinstance(value, int):
pipe = r.pipeline()
pipe.set(key, value, ex=ttl)
pipe.execute()
else:
r.set(key, value, ex=ttl)
print(f" SET {key!r} = {value!r} EX {ttl}s ({label})")
def delete_pattern(r, pattern):
keys = r.keys(pattern)
if keys:
r.delete(*keys)
print(f" DEL {len(keys)} keys matching {pattern!r}")
else:
print(f" (no keys matching {pattern!r})")
# ── main ──────────────────────────────────────────────────────────────────
def main():
r = redis.from_url(REDIS_URL, db=RATELIMIT_DB, decode_responses=True)
try:
r.ping()
except redis.ConnectionError as exc:
print(f"ERROR: cannot connect to Redis at {REDIS_URL}: {exc}", file=sys.stderr)
sys.exit(1)
print(f"Redis: {REDIS_URL} DB={RATELIMIT_DB} dry_run={DRY_RUN} clear={CLEAR}")
print()
# ── clear mode ────────────────────────────────────────────────────────
if CLEAR:
print("=== Clearing synthetic test keys ===")
for ip in TEST_IPS:
delete_pattern(r, f"rl:ip:{ip}:*")
delete_pattern(r, f"rl:{NAMESPACE}:ip:{ip}:*")
for u in TEST_USERS:
delete_pattern(r, f"rl:{NAMESPACE}:user:{u['uid']}:*")
print("Done.")
return
# ── anonymous IP counters ─────────────────────────────────────────────
print("=== Anonymous IP request counters (DB1) ===")
write(r, key_ip_reqs(TEST_IPS[0]), 20, ANON_WINDOW, "below threshold")
write(r, key_ip_reqs(TEST_IPS[1]), 35, ANON_WINDOW, "AT threshold → middleware will block on next req")
write(r, key_ip_reqs(TEST_IPS[3]), 30, ANON_WINDOW, "below threshold")
print()
# ── blocked IPs ───────────────────────────────────────────────────────
print("=== Blocked IPs (DB1) ===")
write(r, key_ip_blocked(TEST_IPS[2]), "1", BLOCK_TTL, "hard-blocked")
print()
# ── namespace/IP sliding window ───────────────────────────────────────
print("=== Namespace/IP sliding window (DB1) ===")
bucket = int(time.time() // ANON_WINDOW)
write(r, key_ns_window(NAMESPACE, TEST_IPS[3], bucket), 34, ANON_WINDOW * 2,
"near window threshold (next req triggers ua_rotation block)")
print()
# ── authenticated user counters ───────────────────────────────────────
print("=== Authenticated user request counters (DB1) ===")
for u in TEST_USERS:
if not u["blocked"]:
write(r, key_user_reqs(NAMESPACE, u["uid"]), u["reqs"], AUTH_WINDOW,
f"uid={u['uid']} reqs={u['reqs']}")
print()
# ── blocked users ─────────────────────────────────────────────────────
print("=== Blocked users (DB1) ===")
for u in TEST_USERS:
if u["blocked"]:
write(r, key_user_blocked(NAMESPACE, u["uid"]), "1", BLOCK_TTL,
f"uid={u['uid']} hard-blocked")
print()
# ── summary ───────────────────────────────────────────────────────────
if not DRY_RUN:
all_keys = r.keys("rl:*")
print(f"=== DB{RATELIMIT_DB} now contains {len(all_keys)} rl:* keys ===")
for k in sorted(all_keys):
ttl = r.ttl(k)
val = r.get(k)
print(f" {k!r:55s} val={val!r:5} ttl={ttl}s")
if __name__ == "__main__":
main()

docker/startup_scripts/start.sh (83 changed lines)

@@ -107,6 +107,12 @@ write_env_file() {
: "${REDIS_URL:=}"
: "${CACHE_BACKEND:=file}"
: "${POD_NAMESPACE:=sapl}"
+# nginx burst defaults: 2× the zone's sustained rate (30r/m and 10r/m).
+# Raise these if legitimate users hit 429 before the Django threshold.
+: "${NGINX_BURST_GENERAL:=60}"
+: "${NGINX_BURST_API:=60}"
+: "${NGINX_BURST_HEAVY:=20}"
+export NGINX_BURST_GENERAL NGINX_BURST_API NGINX_BURST_HEAVY
tmp="$(mktemp)"
{
@@ -132,6 +138,9 @@ write_env_file() {
printf 'REDIS_URL=%s\n' "$REDIS_URL"
printf 'CACHE_BACKEND=%s\n' "$CACHE_BACKEND"
printf 'POD_NAMESPACE=%s\n' "$POD_NAMESPACE"
+printf 'NGINX_BURST_GENERAL=%s\n' "$NGINX_BURST_GENERAL"
+printf 'NGINX_BURST_API=%s\n' "$NGINX_BURST_API"
+printf 'NGINX_BURST_HEAVY=%s\n' "$NGINX_BURST_HEAVY"
} > "$tmp"
chmod 600 "$tmp"
@@ -285,69 +294,28 @@ resolve_pod_namespace() {
}
# ---------------------------------------------------------------------------
-# Redis — resolve URL, check waffle switch, wait for connectivity
+# Redis — check URL from deployment env, waffle switch, connectivity
# ---------------------------------------------------------------------------
-# 1. Populate REDIS_URL from local Secret (envFrom) or fall back to global
-# cluster Secret read via the k8s API.
+# 1. Log whether REDIS_URL was provided via the deployment env.
resolve_redis_url() {
-# Already injected by pod's envFrom (local namespace Secret) — highest precedence.
-[[ -n "${REDIS_URL:-}" ]] && { log "REDIS_URL from local secret."; return 0; }
-# Try the global cluster Secret via the k8s in-cluster API.
-local api="https://kubernetes.default.svc"
-local token_file="/var/run/secrets/kubernetes.io/serviceaccount/token"
-local ca="/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"
-[[ -f "$token_file" ]] || { log "No k8s service-account token — skipping global Redis secret."; return 0; }
-local token url
-token="$(<"$token_file")"
-url=$(curl -sf --cacert "$ca" \
--H "Authorization: Bearer $token" \
-"${api}/api/v1/namespaces/interlegis-infra/secrets/sapl-global-redis" \
-| python3 -c "
-import sys, json, base64
-d = json.load(sys.stdin).get('data', {})
-v = d.get('REDIS_URL', '')
-print(base64.b64decode(v).decode() if v else '')
-" 2>/dev/null || echo "")
-if [[ -n "$url" ]]; then
-export REDIS_URL="$url"
-log "REDIS_URL from global cluster secret."
+if [[ -n "${REDIS_URL:-}" ]]; then
+log "REDIS_URL set: $REDIS_URL"
else
-log "No REDIS_URL found — file-based cache will be used."
+log "REDIS_URL not set — file-based cache will be used."
fi
}
-# 2. Check the REDIS_CACHE waffle switch; set CACHE_BACKEND accordingly.
-resolve_cache_backend() {
-[[ -z "${REDIS_URL:-}" ]] && return 0
-log "REDIS_URL set — checking REDIS_CACHE waffle switch..."
-local active
-active=$(psql "$DATABASE_URL" -At -v ON_ERROR_STOP=0 \
--c "SELECT active FROM waffle_switch WHERE name='REDIS_CACHE' LIMIT 1;" \
-2>/dev/null || echo "")
-if [[ "$active" == "t" ]]; then
-export CACHE_BACKEND="redis"
-log "REDIS_CACHE switch ON — activating Redis cache backend."
-else
-export CACHE_BACKEND="file"
-log "REDIS_CACHE switch OFF — using file-based cache."
-fi
-}
-# 3. Ensure the REDIS_CACHE waffle switch row exists (default: off).
-# Uses get_or_create so the value is only set on first creation —
-# subsequent restarts do NOT overwrite what the operator configured.
-# (waffle_switch … off --create always writes off, breaking manual flips.)
+# 2. Create/reset the REDIS_CACHE waffle switch; set CACHE_BACKEND accordingly.
configure_redis_cache() {
-[[ -z "${REDIS_URL:-}" ]] && return 0
-log "Ensuring REDIS_CACHE waffle switch exists (default: off)..."
-python3 manage.py shell -c \
-"from waffle.models import Switch; Switch.objects.get_or_create(name='REDIS_CACHE', defaults={'active': False})" \
-|| true
+./manage.py waffle_switch REDIS_CACHE off --create || true
+if [[ -z "${REDIS_URL:-}" ]]; then
+log "REDIS_URL not set — REDIS_CACHE switch OFF."
+return 0
+fi
+./manage.py waffle_switch REDIS_CACHE on --create || true
+export CACHE_BACKEND="redis"
+log "REDIS_URL set — REDIS_CACHE switch ON."
}
# 4. Block until Redis is reachable (or give up gracefully).
@@ -374,6 +342,10 @@ wait_for_redis() {
start_services() {
log "Starting gunicorn..."
gunicorn -c gunicorn.conf.py &
+log "Applying nginx config (burst: general=${NGINX_BURST_GENERAL} api=${NGINX_BURST_API} heavy=${NGINX_BURST_HEAVY})..."
+envsubst '${NGINX_BURST_GENERAL} ${NGINX_BURST_API} ${NGINX_BURST_HEAVY}' \
+< /etc/nginx/conf.d/sapl.conf.template \
+> /etc/nginx/conf.d/sapl.conf
log "Starting nginx..."
exec /usr/sbin/nginx -g "daemon off;"
}
@@ -386,7 +358,6 @@ main() {
configure_pg_timezone
migrate_db
configure_redis_cache
-resolve_cache_backend
wait_for_redis
write_env_file # writes resolved REDIS_URL + CACHE_BACKEND into .env
configure_solr || true

rate-limiter-v2.md (1240 changed lines)

File diff suppressed because it is too large

sapl/middleware/test_ratelimiter.py (385 changed lines)

@@ -0,0 +1,385 @@
"""
Unit tests for sapl/middleware/ratelimit.py.
No database access is needed: all tests use RequestFactory and mocks.
Redis is never contacted; _incr_with_ttl is either mocked directly on the
middleware instance or the fallback non-atomic path is exercised via the
mock cache.
"""
import pytest
from unittest.mock import MagicMock, patch
from django.test import RequestFactory
from sapl.middleware.ratelimit import (
_NAMESPACE,
_is_suspicious_headers,
_parse_rate,
get_client_ip,
make_ratelimit_cache_key,
RateLimitMiddleware,
RL_IP_BLOCKED,
RL_USER_BLOCKED,
smart_key,
smart_rate,
)
# ---------------------------------------------------------------------------
# Shared test helpers
# ---------------------------------------------------------------------------
_factory = RequestFactory()
# Headers that a normal browser would send — used as the default baseline.
_NORMAL_HEADERS = {
'HTTP_ACCEPT': 'text/html,application/xhtml+xml',
'HTTP_ACCEPT_LANGUAGE': 'pt-BR,pt;q=0.9',
}
def _req(ip='1.2.3.4', ua='Mozilla/5.0', path='/', extra_meta=None):
"""GET request with sensible defaults and browser-like headers."""
request = _factory.get(path)
request.META.update({'REMOTE_ADDR': ip, 'HTTP_USER_AGENT': ua, **_NORMAL_HEADERS})
if extra_meta:
request.META.update(extra_meta)
return request
def _anon_req(**kwargs):
r = _req(**kwargs)
r.user = MagicMock(is_authenticated=False)
return r
def _auth_req(uid=7, **kwargs):
r = _req(**kwargs)
r.user = MagicMock(is_authenticated=True, pk=uid)
return r
def _make_middleware(whitelist=None, anon_rate='35/m', auth_rate='120/m'):
"""
Return (middleware, mock_cache).
The ratelimit cache is replaced with a MagicMock whose .get() returns None
by default (nothing blocked, no counters set). Tests may replace
mock_cache.get.side_effect or mock mw._incr_with_ttl directly.
sapl.middleware.ratelimit imports settings as `from sapl import settings`
(a direct module reference), so django.test.override_settings has no effect
on it. We patch the name in the ratelimit module's namespace instead.
"""
mock_cache = MagicMock()
mock_cache.get.return_value = None
get_response = MagicMock(return_value=MagicMock(status_code=200))
mock_settings = MagicMock()
mock_settings.RATE_LIMITER_RATE = anon_rate
mock_settings.RATE_LIMITER_RATE_AUTHENTICATED = auth_rate
mock_settings.RATE_LIMITER_RATE_BOT = '5/m'
mock_settings.RATE_LIMIT_WHITELIST_IPS = whitelist or []
mock_settings.POD_NAMESPACE = _NAMESPACE # keep module-level _NAMESPACE consistent
with (
patch('sapl.middleware.ratelimit.caches') as mock_caches,
patch('sapl.middleware.ratelimit.settings', mock_settings),
):
mock_caches.__getitem__.return_value = mock_cache
mw = RateLimitMiddleware(get_response)
# __init__ already set mw._rl_cache = caches['ratelimit'] == mock_cache,
# but reassign explicitly so tests have a direct handle on the same object.
mw._rl_cache = mock_cache
return mw, mock_cache
# ---------------------------------------------------------------------------
# _parse_rate
# ---------------------------------------------------------------------------
@pytest.mark.parametrize('rate_str,expected', [
('35/m', (35, 60)),
('120/m', (120, 60)),
('10/s', (10, 1)),
('5/h', (5, 3600)),
('1/M', (1, 60)), # period is case-insensitive
])
def test_parse_rate(rate_str, expected):
assert _parse_rate(rate_str) == expected
# ---------------------------------------------------------------------------
# make_ratelimit_cache_key — pass-through, no mangling
# ---------------------------------------------------------------------------
def test_make_ratelimit_cache_key_passthrough():
assert make_ratelimit_cache_key('rl:ip:1.2.3.4:reqs', 'some_prefix', 1) == 'rl:ip:1.2.3.4:reqs'
assert make_ratelimit_cache_key('rl:abc123', '', 99) == 'rl:abc123'
# ---------------------------------------------------------------------------
# _is_suspicious_headers
# ---------------------------------------------------------------------------
def test_suspicious_both_headers_missing():
r = _factory.get('/')
r.META.pop('HTTP_ACCEPT', None)
r.META.pop('HTTP_ACCEPT_LANGUAGE', None)
assert _is_suspicious_headers(r) is True
def test_suspicious_one_header_missing_is_not_suspicious():
"""Only flagged when *both* headers are absent."""
r = _factory.get('/')
r.META['HTTP_ACCEPT'] = 'text/html'
r.META.pop('HTTP_ACCEPT_LANGUAGE', None)
assert _is_suspicious_headers(r) is False
def test_suspicious_both_headers_present():
r = _factory.get('/')
r.META['HTTP_ACCEPT'] = 'text/html'
r.META['HTTP_ACCEPT_LANGUAGE'] = 'pt-BR'
assert _is_suspicious_headers(r) is False
# ---------------------------------------------------------------------------
# get_client_ip — header priority and XFF chain
# ---------------------------------------------------------------------------
def test_get_client_ip_remote_addr():
r = _factory.get('/')
r.META['REMOTE_ADDR'] = '10.0.0.1'
assert get_client_ip(r) == '10.0.0.1'
def test_get_client_ip_xff_single():
r = _factory.get('/')
r.META['HTTP_X_FORWARDED_FOR'] = '203.0.113.5'
assert get_client_ip(r) == '203.0.113.5'
def test_get_client_ip_xff_chain_uses_leftmost():
"""The leftmost IP in XFF is the real client; the rest are proxies."""
r = _factory.get('/')
r.META['HTTP_X_FORWARDED_FOR'] = '203.0.113.5, 10.0.0.1, 10.0.0.2'
assert get_client_ip(r) == '203.0.113.5'
def test_get_client_ip_x_real_ip_used_when_no_xff():
r = _factory.get('/')
r.META['REMOTE_ADDR'] = '127.0.0.1'
r.META['HTTP_X_REAL_IP'] = '203.0.113.9'
assert get_client_ip(r) == '203.0.113.9'
def test_get_client_ip_xff_preferred_over_x_real_ip():
r = _factory.get('/')
r.META['HTTP_X_FORWARDED_FOR'] = '203.0.113.1'
r.META['HTTP_X_REAL_IP'] = '203.0.113.2'
assert get_client_ip(r) == '203.0.113.1'
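The header priority these tests assert is: leftmost X-Forwarded-For entry, then X-Real-IP, then REMOTE_ADDR. A sketch of that resolution over a bare META dict (the real function takes a Django request; this standalone form is for illustration only):

```python
def resolve_client_ip(meta):
    """Pick the client IP from WSGI META headers, proxy-aware."""
    xff = meta.get("HTTP_X_FORWARDED_FOR")
    if xff:
        # Leftmost entry is the original client; the rest are intermediate proxies.
        return xff.split(",")[0].strip()
    real_ip = meta.get("HTTP_X_REAL_IP")
    if real_ip:
        return real_ip
    return meta.get("REMOTE_ADDR", "")
```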
# ---------------------------------------------------------------------------
# smart_key / smart_rate
# ---------------------------------------------------------------------------
def test_smart_key_anon_returns_masked_ip():
r = _anon_req(ip='5.5.5.5')
assert smart_key(None, r) == '5.5.5.5'
def test_smart_key_auth_returns_pk_string():
r = _auth_req(uid=42, ip='5.5.5.5')
assert smart_key(None, r) == '42'
def test_smart_rate_anon_returns_anon_rate():
with patch('sapl.middleware.ratelimit.settings') as mock_s:
mock_s.RATE_LIMITER_RATE = '35/m'
mock_s.RATE_LIMITER_RATE_AUTHENTICATED = '120/m'
assert smart_rate(None, _anon_req()) == '35/m'
def test_smart_rate_auth_returns_auth_rate():
with patch('sapl.middleware.ratelimit.settings') as mock_s:
mock_s.RATE_LIMITER_RATE = '35/m'
mock_s.RATE_LIMITER_RATE_AUTHENTICATED = '120/m'
assert smart_rate(None, _auth_req()) == '120/m'
# ---------------------------------------------------------------------------
# RateLimitMiddleware — whitelisted IP bypasses everything (including bad UA)
# ---------------------------------------------------------------------------
def test_whitelist_bypasses_all_checks():
mw, mock_cache = _make_middleware(whitelist=['1.2.3.4'])
result = mw._evaluate(_anon_req(ip='1.2.3.4', ua='GPTBot/1.0'))
assert result == {'action': 'pass', 'ip': '1.2.3.4'}
mock_cache.get.assert_not_called()
# ---------------------------------------------------------------------------
# Check 1 — known bot User-Agent
# ---------------------------------------------------------------------------
@pytest.mark.parametrize('ua', [
'GPTBot/1.0',
'Mozilla/5.0 (compatible; ClaudeBot/1.0)',
'PerplexityBot',
'Bytespider',
'AhrefsBot/7.0',
'meta-externalagent/1.1',
'OAI-SearchBot',
'Mozilla/5.0 (compatible; bingbot/2.0)',
'SERankingBacklinksBot/1.0',
'Mozilla/5.0 AppleWebKit Chrome/98.0.4758.80',
])
def test_known_bot_ua_blocked(ua):
mw, _ = _make_middleware()
result = mw._evaluate(_anon_req(ua=ua))
assert result == {'action': 'block', 'reason': 'known_ua', 'ip': '1.2.3.4'}
def test_bot_ua_check_is_case_insensitive():
mw, _ = _make_middleware()
result = mw._evaluate(_anon_req(ua='gptbot/2.0'))
assert result['reason'] == 'known_ua'
# ---------------------------------------------------------------------------
# Check 2 — IP already blocked in cache
# ---------------------------------------------------------------------------
def test_ip_blocked_in_cache():
mw, mock_cache = _make_middleware()
ip = '1.2.3.4'
mock_cache.get.side_effect = lambda key: 1 if key == RL_IP_BLOCKED.format(ip=ip) else None
result = mw._evaluate(_anon_req(ip=ip))
assert result == {'action': 'block', 'reason': 'ip_blocked', 'ip': ip}
# ---------------------------------------------------------------------------
# Check 3a — authenticated user blocked in cache
# ---------------------------------------------------------------------------
def test_auth_user_blocked_in_cache():
mw, mock_cache = _make_middleware()
uid = '7'
mock_cache.get.side_effect = lambda key: (
1 if key == RL_USER_BLOCKED.format(ns=_NAMESPACE, uid=uid) else None
)
result = mw._evaluate(_auth_req(uid=int(uid)))
assert result == {'action': 'block', 'reason': 'user_blocked', 'ip': '1.2.3.4'}
# ---------------------------------------------------------------------------
# Check 3b — authenticated + suspicious headers
# ---------------------------------------------------------------------------
def test_auth_suspicious_headers_blocked():
mw, _ = _make_middleware()
r = _auth_req()
r.META.pop('HTTP_ACCEPT', None)
r.META.pop('HTTP_ACCEPT_LANGUAGE', None)
result = mw._evaluate(r)
assert result == {'action': 'block', 'reason': 'suspicious_headers_auth', 'ip': '1.2.3.4'}
# ---------------------------------------------------------------------------
# Check 3c — authenticated request rate
# ---------------------------------------------------------------------------
def test_auth_rate_exceeded_blocks_and_marks_user_blocked():
mw, mock_cache = _make_middleware(auth_rate='5/m')
mw._incr_with_ttl = MagicMock(return_value=5) # exactly at threshold
result = mw._evaluate(_auth_req(uid=7))
assert result == {'action': 'block', 'reason': 'auth_user_rate', 'ip': '1.2.3.4'}
mock_cache.set.assert_called_once_with(
RL_USER_BLOCKED.format(ns=_NAMESPACE, uid='7'),
1,
timeout=RateLimitMiddleware.BLOCK_TTL,
)
def test_auth_under_rate_passes():
mw, mock_cache = _make_middleware(auth_rate='5/m')
mw._incr_with_ttl = MagicMock(return_value=4) # one below threshold
result = mw._evaluate(_auth_req(uid=7))
assert result == {'action': 'pass', 'ip': '1.2.3.4'}
mock_cache.set.assert_not_called()
# ---------------------------------------------------------------------------
# Check 4a — anonymous + suspicious headers
# ---------------------------------------------------------------------------
def test_anon_suspicious_headers_blocked():
    mw, _ = _make_middleware()
    r = _anon_req()
    r.META.pop('HTTP_ACCEPT', None)
    r.META.pop('HTTP_ACCEPT_LANGUAGE', None)
    result = mw._evaluate(r)
    assert result == {'action': 'block', 'reason': 'suspicious_headers', 'ip': '1.2.3.4'}


# ---------------------------------------------------------------------------
# Check 4b — anonymous IP request rate
# ---------------------------------------------------------------------------
def test_anon_ip_rate_exceeded_blocks_and_marks_ip_blocked():
    mw, mock_cache = _make_middleware(anon_rate='5/m')
    mw._incr_with_ttl = MagicMock(return_value=5)  # first call (IP counter) hits threshold
    result = mw._evaluate(_anon_req())
    assert result == {'action': 'block', 'reason': 'ip_rate', 'ip': '1.2.3.4'}
    mock_cache.set.assert_called_once_with(
        RL_IP_BLOCKED.format(ip='1.2.3.4'),
        1,
        timeout=RateLimitMiddleware.BLOCK_TTL,
    )


# ---------------------------------------------------------------------------
# Check 4c — per-namespace/IP/window (UA rotation detection)
# ---------------------------------------------------------------------------
def test_anon_ua_rotation_detected_blocks_and_marks_ip_blocked():
    mw, mock_cache = _make_middleware(anon_rate='5/m')
    # First call (IP counter) is under threshold; second (window counter) hits it.
    mw._incr_with_ttl = MagicMock(side_effect=[4, 5])
    result = mw._evaluate(_anon_req())
    assert result == {'action': 'block', 'reason': 'ua_rotation', 'ip': '1.2.3.4'}
    mock_cache.set.assert_called_once_with(
        RL_IP_BLOCKED.format(ip='1.2.3.4'),
        1,
        timeout=RateLimitMiddleware.BLOCK_TTL,
    )


def test_anon_under_all_thresholds_passes():
    mw, mock_cache = _make_middleware(anon_rate='5/m')
    mw._incr_with_ttl = MagicMock(return_value=4)  # both counters below threshold
    result = mw._evaluate(_anon_req())
    assert result == {'action': 'pass', 'ip': '1.2.3.4'}
    mock_cache.set.assert_not_called()
# ---------------------------------------------------------------------------
# __call__ — block returns 429, pass forwards to get_response
# ---------------------------------------------------------------------------
def test_call_block_returns_429_with_retry_after_header():
    mw, _ = _make_middleware()
    mw._evaluate = MagicMock(return_value={'action': 'block', 'reason': 'known_ua', 'ip': '1.2.3.4'})
    response = mw(_factory.get('/'))
    assert response.status_code == 429
    assert response['Retry-After'] == str(RateLimitMiddleware.BLOCK_TTL)
    mw.get_response.assert_not_called()


def test_call_pass_forwards_request_to_get_response():
    mw, _ = _make_middleware()
    mw._evaluate = MagicMock(return_value={'action': 'pass', 'ip': '1.2.3.4'})
    request = _anon_req()
    mw(request)
    mw.get_response.assert_called_once_with(request)
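Every counter test above mocks `_incr_with_ttl` rather than exercising it. The real helper is not shown in this diff, but a minimal fixed-window sketch over Django's low-level cache API (`add`/`incr`/`set`) could look like the following; the `DictCache` stand-in and the function body are illustrations under those assumptions, not the middleware's actual implementation:

```python
class DictCache:
    """Tiny in-memory stand-in for Django's cache API (timeouts ignored)."""
    def __init__(self):
        self._data = {}

    def add(self, key, value, timeout=None):
        # Like Django's cache.add(): only writes (and returns True) if absent.
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def set(self, key, value, timeout=None):
        self._data[key] = value

    def incr(self, key):
        if key not in self._data:
            raise ValueError(key)
        self._data[key] += 1
        return self._data[key]


def incr_with_ttl(cache, key, ttl):
    """Increment a fixed-window counter, creating it with a TTL on first hit."""
    # add() is atomic on real backends: True means we created the key
    # and started a fresh window with the given TTL.
    if cache.add(key, 1, timeout=ttl):
        return 1
    try:
        return cache.incr(key)
    except ValueError:
        # Key expired between add() and incr(); restart the window.
        cache.set(key, 1, timeout=ttl)
        return 1


cache = DictCache()
print([incr_with_ttl(cache, 'rl:ip:1.2.3.4', 60) for _ in range(3)])  # → [1, 2, 3]
```

Note the `add`-then-`incr` pattern avoids a race where two requests both see a missing key; with a mocked return value of 5 against a `5/m` rate, the tests above expect a block.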

83
scripts/test_ratelimiter.py

@@ -0,0 +1,83 @@
#!/usr/bin/env python3
"""
CLI smoke-test: fire N GET requests at a URL and report when the first
HTTP 429 (rate limited) response appears.
"""
import argparse
import time
from collections import defaultdict
from urllib.parse import urlparse

import requests


def test_rate_limiter(url, num_requests=50, delay=0.1, timeout=10):
    """Send multiple requests and analyze rate-limiting behavior."""
    parsed = urlparse(url)
    if not parsed.scheme or not parsed.netloc:
        raise ValueError(
            "URL must include a protocol and host, e.g. http://localhost or https://example.com"
        )
    if parsed.scheme not in {"http", "https"}:
        raise ValueError("Unsupported URL scheme: %s. Use http or https." % parsed.scheme)

    status_counts = defaultdict(int)
    response_times = []
    first_rate_limited_at = None
    attempted_requests = 0

    print(f"Testing rate limiter on: {url}")
    print(f"Number of requests: {num_requests}")
    print(f"Delay between requests: {delay}s")
    print("-" * 50)

    for i in range(num_requests):
        attempted_requests += 1
        try:
            start_time = time.time()
            response = requests.get(url, timeout=timeout)
            elapsed = time.time() - start_time
            status_counts[response.status_code] += 1
            response_times.append(elapsed)
            print(f"Request {i+1:3d}: Status {response.status_code} | Time: {elapsed:.3f}s")
            if response.status_code == 429:
                if first_rate_limited_at is None:
                    first_rate_limited_at = i + 1
                print(f"  -> Rate limited on request {i+1}")
                break
        except requests.exceptions.RequestException as e:
            print(f"Request {i+1:3d}: Error - {e}")
            status_counts['ERROR'] += 1
        if i < num_requests - 1:
            time.sleep(delay)

    print("-" * 50)
    print("\nSummary:")
    print(f"  Total requests attempted: {attempted_requests}")
    print(f"  Successful (200): {status_counts.get(200, 0)}")
    print(f"  Rate limited (429): {status_counts.get(429, 0)}")
    if first_rate_limited_at is not None:
        print(f"  First 429 occurred at request: {first_rate_limited_at}")
    print(f"  Request errors: {status_counts.get('ERROR', 0)}")
    print(f"  Other statuses: {sum(v for k, v in status_counts.items() if k not in [200, 429, 'ERROR'])}")
    if response_times:
        avg_time = sum(response_times) / len(response_times)
        print(f"\nAverage response time: {avg_time:.3f}s")


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Test the rate limiter of a URL")
    parser.add_argument(
        "url",
        help="URL to test, including protocol (http:// or https://)",
    )
    parser.add_argument("-n", "--num-requests", type=int, default=50, help="Number of requests")
    parser.add_argument("-d", "--delay", type=float, default=0.1, help="Delay between requests (seconds)")
    parser.add_argument("-t", "--timeout", type=int, default=10, help="Request timeout (seconds)")
    args = parser.parse_args()
    test_rate_limiter(args.url, args.num_requests, args.delay, args.timeout)

14
scripts/test_ratelimiter.sh

@@ -1,14 +0,0 @@
#!/bin/bash
#URL=http://localhost:8000/materia/4379
#URL=http://localhost:8000/norma/pesquisar
#URL=http://localhost/norma/pesquisar
#URL=https://sapl31demo.interlegis.leg.br/docadm/45
#URL=https://sapl.joaopessoa.pb.leg.br/materia/186300
#URL=http://localhost:8000/materia/4379/materiaassunto
#URL=http://localhost:8000/sessao/4984
URL="http://localhost:8000/docadm/pesq-doc-adm?tipo=&o=&numero=&complemento=&ano=&protocolo__numero=&numero_externo=&data_0=&data_1=&interessado=&assunto=&tramitacao=&tramitacaoadministrativo__status=&tramitacaoadministrativo__unidade_tramitacao_destino=&pesquisar=Pesquisar"
for i in $(seq 1 12); do
    curl -sS -o /dev/null -w "req=$i http=%{http_code} time=%{time_total}\n" "$URL"
done