# SAPL — Rate Limiter & Redis Operations

> **Scope**: Django / Gunicorn / nginx / Kubernetes fleet of 1,200+ pods.
> Each pod has a dedicated PostgreSQL instance. A K8s Ingress sits in front of all tenants.
> **This document is canonical** — all earlier session notes are consolidated here.

---

## Context & Problem Statement

### Fleet

| Item | Detail |
|------|--------|
| System | SAPL — Django 2.2, legislative management for Brazilian municipal chambers |
| Fleet | ~1,200 Kubernetes pods, each with a dedicated PostgreSQL pod |
| Pod limits | 1 core CPU (limit) / 35m (request) · 1600Mi RAM (limit) / 800Mi (request) |
| Users | Legislative house staff, often behind NAT (many users, one public IP) |
| Workloads | PDF generation (synchronous, ReportLab), file uploads up to 150 MB, WebSocket voting panel |

### OOM Kill Pattern

Workers grow from ~35 MB at birth to 800–900 MB within 2–3 minutes, then are killed and replaced in a continuous cycle. Root causes:

- Bot scraping triggers synchronous PDF generation — the entire document is built in RAM (ReportLab)
- `worker_max_memory_per_child` only checks **between requests**; workers blocked on long requests are never recycled
- `TIMEOUT=300` lets bots hold threads for up to 5 minutes while memory accumulates
- 3 workers × 300 MB each = ~900 MB — breaching the 800Mi request threshold

### Bot Traffic Profile (Barueri pod, 16 days, 662 k requests)

| Actor | Requests | % of total |
|-------|----------|------------|
| Googlebot | ~154,000 | 23.2% |
| Chrome/98.0.4758 (spoofed scraper) | 90,774 | 13.7% |
| kube-probe (healthcheck) | 69,065 | 10.4% |
| meta-externalagent | 28,325 | 4.3% |
| GPTBot | 11,489 | 1.7% |
| bingbot | 7,639 | 1.1% |
| OAI-SearchBot + Applebot | 6,681 | 1.0% |
| **Total identified bots** | **~377,000** | **~56.9%** |

**Botnet fingerprint:**

- Rotates User-Agents (Chrome/121, Chrome/122, Firefox/123, Safari/17…) across requests
- Crawls all sub-endpoints of the same matéria within 1 second from different IPs
- Distributes crawling across tenants — each pod stays under the per-pod rate limit, never triggering it
- Primary targets: `/relatorios/{id}/etiqueta-materia-legislativa` (~40 KB PDF) and all `/materia/{id}/*` sub-endpoints

### Static File Traffic (from CSV analysis)

| Category | Requests | Transfers |
|----------|----------|-----------|
| Logos / images | 62,776 | ~24 GB |
| PDFs | 8,869 | 5.1 GB |
| Parliamentarian photos | 11,856 | ~0.5 GB |
| **Total** | **83,501** | **~30 GB** |

Top offender: `Brasão - Foz do Iguaçu.png` — 14,512 requests, 5.6 GB from a single 392 KB file.

### Hard Constraints

| Constraint | Impact |
|------------|--------|
| Per-pod PostgreSQL | Rate-limit counters not shared across pods |
| NAT environments | IP-based rate limiting causes false positives |
| `TIMEOUT=300` / uploads to 150 MB | Must not be broken — intentional for slow workflows |

---

## Architecture Overview

### Component Diagram

```mermaid
graph TD
    Client([Bot / Human Client])
    nginx[nginx]
    gunicorn[Gunicorn\n2 workers / 4 threads]
    mw[Django Middleware\nRateLimitMiddleware]
    view[View Layer\nCBV + decorators]
    db0[(Redis DB0\npage cache)]
    db1[(Redis DB1\nrate limiter)]
    pg[(PostgreSQL\nper-pod)]
    fs[Filesystem\nPDFs / media]

    Client -->|HTTP| nginx
    nginx -->|proxy_pass| gunicorn
    gunicorn --> mw
    mw -->|pass| view
    mw -->|429| nginx
    view --> pg
    view --> fs
    view -->|read/write cached pages| db0
    mw -->|counters + blocked markers| db1
```

> DB2 is reserved for Django Channels (WebSocket — future).

### Redis Memory Budget

| Key type | Key schema | TTL | DB | Est. size |
|----------|-----------|-----|----|-----------|
| Page / view cache | `cache:{ns}:*` | 60–600 s | 0 | ~0.5 GB |
| Static cache (images/logos) | `static:{ns}:{sha256}` | 3–24 h | 0 | ~2.4 GB |
| IP request counter | `rl:ip:{ip}:reqs` | 60 s | 1 | ~0.6 MB |
| IP blocked marker | `rl:ip:{ip}:blocked` | 300 s | 1 | ~0.06 MB |
| User request counter | `rl:{ns}:user:{uid}:reqs` | 60 s | 1 | negligible |
| User blocked marker | `rl:{ns}:user:{uid}:blocked` | 300 s | 1 | negligible |
| Path counter | `rl:{ns}:path:{sha256}:reqs` | 60 s | 1 | ~0.3 MB |
| UA deny list | `rl:bot:ua:blocked` | permanent SET | 1 | ~0.03 MB |
| NS/IP/window counter | `rl:{ns}:ip:{ip}:w:{bucket}` | 120 s | 1 | ~0.6 MB |
| Redis overhead (× 1.5) | | | | ~1.6 GB |
| **Total ceiling** | | | | **~5 GB** |

---

## Decision Log

| Decision | Chosen | Rationale |
|----------|--------|-----------|
| Redis topology | **Single pod** (no Sentinel, no Cluster) | 65 MB of active data fits comfortably; cluster complexity not justified |
| PDF caching in Redis | **No** — ETags + sendfile are sufficient | Once rate limiting + ETags are active, repeat requests become 304s with zero bytes transferred |
| HTTP conditional requests | **`ConditionalGetMiddleware` + `@condition` decorator** | `ConditionalGetMiddleware` handles ETag/304 for all views; `@condition(etag_func, last_modified_func)` on materia/norma detail views skips view execution entirely on cache hit |
| Upload endpoint special-casing (nginx) | **Removed** — fall through to `location /` | No justification for a separate `limit_req` zone; `location /` with `sapl_general` covers it |
| Static asset cache policy | **90 min** (`expires 90m`, `max-age=5400`) | Conservative — safe with `collectstatic` content-hashed filenames; `immutable` not used (would require verified forever-hashed URLs) |
| Rate-limit enforcement | **Django middleware** with shared Redis | No nginx image changes required; solves cross-pod consistency immediately |
| `worker_max_memory_per_child` | **400 MB** | Pod limit 1600Mi, 2 workers × 400 MB = 800 MB — leaves 800Mi headroom |
| `sendfile off` → `on` | **Bug** — flip to `on` | No valid production reason to disable it; `sendfile` skips the userspace copy entirely |
| `/media/` serving | **X-Accel-Redirect** | Routes all `/media/` through Gunicorn so Django middleware runs; nginx serves bytes via internal location |
| Cache backend switch | **At pod startup** via `start.sh` + waffle switch | Pod restart is acceptable; avoids per-request runtime overhead |

---

## Directory layout

```
docker/k8s/
└── redis/
    ├── redis-configmap.yaml    # redis.conf — no persistence, allkeys-lru, 5 GB ceiling
    ├── redis-deployment.yaml   # Deployment (1 replica, redis:7-alpine)
    └── redis-service.yaml      # ClusterIP service on port 6379
```

---

## Prerequisites

- `kubectl` configured to talk to the target cluster.
- A `sapl-redis` namespace (created below if it doesn't exist).

---

## Deploy

```bash
# 1. Create the namespace (idempotent)
rancher kubectl create namespace sapl-redis --dry-run=client -o yaml | rancher kubectl apply -f -

# 2. Apply all three manifests
rancher kubectl apply -f docker/k8s/redis/redis-configmap.yaml
rancher kubectl apply -f docker/k8s/redis/redis-deployment.yaml
rancher kubectl apply -f docker/k8s/redis/redis-service.yaml

# 3. Verify the pod is Running
rancher kubectl -n sapl-redis get pods -l app=sapl-redis
```

Expected output:

```
NAME                          READY   STATUS    RESTARTS   AGE
sapl-redis-6d9f8b7c4d-xk2lm   1/1     Running   0          30s
```

---

## Verify the rate limiter

`scripts/test_ratelimiter.py` fires repeated GET requests at a SAPL URL and reports when the first 429 is returned.

### Usage

```
python scripts/test_ratelimiter.py url [-n NUM] [-d DELAY] [-t TIMEOUT]
```

| Flag | Default | Meaning |
|------|---------|---------|
| `url` | *(required)* | Full URL including scheme, e.g. `http://localhost` |
| `-n`, `--num-requests` | `50` | Maximum requests to send |
| `-d`, `--delay` | `0.1` | Seconds between requests |
| `-t`, `--timeout` | `10` | Per-request timeout in seconds |

The script stops and prints a summary as soon as a 429 is received.

### Examples

```bash
# Hit the anonymous threshold (35 req/min) — fire 40 requests with minimal delay
python scripts/test_ratelimiter.py http://localhost -n 40 -d 0.05

# Slower pace — check that legitimate traffic is not rate-limited
python scripts/test_ratelimiter.py http://localhost -n 20 -d 2

# Test against a staging pod via port-forward
rancher kubectl port-forward -n <namespace> deploy/sapl 8080:80 &
python scripts/test_ratelimiter.py http://localhost:8080 -n 40 -d 0.05
```

### Reading the output

```
Request 1: Status 200 | Time: 0.045s
...
Request 36: Status 429 | Time: 0.038s
-> Rate limited on request 36

Summary:
  Total requests attempted: 36
  Successful (200): 35
  Rate limited (429): 1
  First 429 occurred at request: 36
```

A first-429 near the configured anonymous threshold (35 req/min) confirms the middleware is wired correctly. A first-429 much earlier points to nginx `limit_req` firing before Django sees the request.

---

## Inject REDIS_URL into SAPL instances

`REDIS_URL` points at the shared instance:

```
redis://redis.sapl-redis.svc.cluster.local:6379
        ^^^^^ ^^^^^^^^^^
        svc   namespace
```

`start.sh` picks it up on every pod startup and sets the `REDIS_CACHE` waffle switch automatically — no further intervention needed.

### Fleet-wide rollout

Uses the `app.kubernetes.io/name=sapl` pod label to discover every SAPL namespace automatically — onboarding a new municipality requires no script changes.
```bash
for ns in $(rancher kubectl get pods -A -l app.kubernetes.io/name=sapl \
    -o jsonpath='{.items[*].metadata.namespace}' | tr ' ' '\n' | sort -u); do
  rancher kubectl set env deployment/sapl \
    REDIS_URL=redis://redis.sapl-redis.svc.cluster.local:6379 \
    -n $ns
done
```

### Roll back

```bash
for ns in $(rancher kubectl get pods -A -l app.kubernetes.io/name=sapl \
    -o jsonpath='{.items[*].metadata.namespace}' | tr ' ' '\n' | sort -u); do
  rancher kubectl set env deployment/sapl REDIS_URL- -n $ns
done
```

`kubectl set env deployment/sapl REDIS_URL-` (trailing `-`) removes the variable. `start.sh` then falls back to the file-based cache automatically.

---

## Monitor

### Pod and events

```bash
# Pod status
rancher kubectl -n sapl-redis get pods -l app=sapl-redis -o wide

# Deployment events (useful right after apply)
rancher kubectl -n sapl-redis describe deployment sapl-redis

# Pod events (OOMKill, restarts, etc.)
rancher kubectl -n sapl-redis describe pod -l app=sapl-redis
```

### Logs

```bash
# Tail live logs
rancher kubectl -n sapl-redis logs -f deploy/sapl-redis

# Last 100 lines
rancher kubectl -n sapl-redis logs deploy/sapl-redis --tail=100
```

### Redis INFO

```bash
# Memory usage
rancher kubectl exec -n sapl-redis deploy/sapl-redis -- \
  redis-cli info memory \
  | grep -E 'used_memory_human|maxmemory_human|mem_fragmentation_ratio'

# Connection pressure
rancher kubectl exec -n sapl-redis deploy/sapl-redis -- \
  redis-cli info stats \
  | grep -E 'rejected_connections|instantaneous_ops_per_sec'

# Key distribution per DB
rancher kubectl exec -n sapl-redis deploy/sapl-redis -- redis-cli info keyspace

# Recent slow queries
rancher kubectl exec -n sapl-redis deploy/sapl-redis -- redis-cli slowlog get 10

# Latency sampling (1-second intervals)
rancher kubectl exec -n sapl-redis deploy/sapl-redis -- redis-cli --latency-history -i 1
```

### Rate-limiter keys (DB 1)

```bash
rancher kubectl exec -n sapl-redis deploy/sapl-redis -- \
  redis-cli -n 1 dbsize

rancher kubectl exec -n sapl-redis deploy/sapl-redis -- \
  redis-cli -n 1 --scan --pattern 'rl:ip:*' | head -20
```

---

## Seed the UA deny list (once after first deploy)

`rl:bot:ua:blocked` is a permanent Redis SET in DB 1. Each member is the SHA-256 of a **UA token** — the identifying fragment extracted after splitting on `/`, spaces, `;`, `(`, `)`, e.g.:

```
UA string:    "GPTBot/1.1 (+https://openai.com/gptbot)"
Tokens:       GPTBot  1.1  +https:  ...
Hash stored:  sha256("GPTBot")
```

The middleware (`_is_redis_blocked_ua`) tokenises the incoming UA the same way and checks each token hash against the cached set. The SET is fetched from Redis at most once per `RATE_LIMITER_UA_BLOCKLIST_REFRESH` seconds (default 60) per worker process.

The bots in `BOT_UA_FRAGMENTS` (Python list, always active) and this Redis SET are **independent** — the Python list provides the baseline and the Redis SET allows adding new offenders at runtime **without a code deploy**.

```bash
rancher kubectl exec -n sapl-redis deploy/sapl-redis -- redis-cli -n 1 \
  SADD rl:bot:ua:blocked \
  "$(echo -n 'GPTBot' | sha256sum | cut -d' ' -f1)" \
  "$(echo -n 'ClaudeBot' | sha256sum | cut -d' ' -f1)" \
  "$(echo -n 'PerplexityBot' | sha256sum | cut -d' ' -f1)" \
  "$(echo -n 'Bytespider' | sha256sum | cut -d' ' -f1)" \
  "$(echo -n 'AhrefsBot' | sha256sum | cut -d' ' -f1)" \
  "$(echo -n 'meta-externalagent' | sha256sum | cut -d' ' -f1)"

# Add a new offender at runtime (picked up within RATE_LIMITER_UA_BLOCKLIST_REFRESH seconds)
rancher kubectl exec -n sapl-redis deploy/sapl-redis -- redis-cli -n 1 \
  SADD rl:bot:ua:blocked "$(echo -n 'NewBot' | sha256sum | cut -d' ' -f1)"
```

---

## Local standalone Redis (development / testing)

No Kubernetes?
Run Redis directly with Docker:

```bash
sudo docker run --rm -p 6379:6379 redis:7-alpine \
  redis-server --save "" --appendonly no
```

Then point Django at it by exporting the env vars before starting the dev server:

```bash
export REDIS_URL="redis://localhost:6379"
export CACHE_BACKEND="redis"
python manage.py runserver
```

Or add them to your local `.env` file:

```
REDIS_URL=redis://localhost:6379
CACHE_BACKEND=redis
```

> **Note**: the waffle switch `REDIS_CACHE` must also be `on` in your local
> database for `start.sh` to activate the Redis backend. Run:
> ```bash
> python manage.py waffle_switch REDIS_CACHE on --create
> ```

---

## Update `redis.conf` without redeploying

```bash
# Edit the ConfigMap
rancher kubectl -n sapl-redis edit configmap redis-config

# Restart the pod to pick up the new config
rancher kubectl -n sapl-redis rollout restart deployment/sapl-redis
```

---

## Gunicorn tuning

`docker/startup_scripts/gunicorn.conf.py` — resolved values for the current pod budget (1600Mi RAM, 1 CPU):

```python
NUM_WORKERS = int(os.getenv("WEB_CONCURRENCY", "2"))      # was 3
THREADS = int(os.getenv("GUNICORN_THREADS", "4"))         # was 8
TIMEOUT = int(os.getenv("GUNICORN_TIMEOUT", "120"))       # was 300
max_requests = 1000
max_requests_jitter = 200
worker_max_memory_per_child = 400 * 1024 * 1024           # 400 MB — was 300 MB
```

**Per-location timeout strategy** — nginx overrides the global Gunicorn timeout per-path:

| Operation | Timeout | Rationale |
|-----------|---------|-----------|
| Normal page rendering | 60 s | No legitimate page should take > 60 s |
| API endpoints | 30 s | Stateless, fast by design |
| PDF download (cached / nginx) | 30 s | nginx serves from disk, worker not involved |
| PDF generation (uncached) | 180 s | Kept high — addressed in a future phase |
| Large file upload | 180 s | nginx buffers upload; worker processes after |

---

## nginx real-IP and core fixes

Added to `docker/config/nginx/nginx.conf` (http {} block):

```nginx
# Kernel bypass — was off (bug)
sendfile on;
tcp_nopush on;
tcp_nodelay on;

# Real client IP from X-Forwarded-For set by K8s Ingress
real_ip_header X-Forwarded-For;
real_ip_recursive on;
set_real_ip_from 10.0.0.0/8;
set_real_ip_from 172.16.0.0/12;
set_real_ip_from 192.168.0.0/16;
```

Without these real-IP directives, `$remote_addr` inside the pod is always the Ingress IP, making IP-based rate limiting and blocking meaningless; `real_ip_recursive on` walks the `X-Forwarded-For` chain past trusted proxies to the real client address.

---

## Django upload settings

Added to `sapl/settings.py` — files above 2 MB are streamed to disk rather than held in worker RAM. Critical for 150 MB upload support without OOM pressure:

```python
FILE_UPLOAD_MAX_MEMORY_SIZE = 2 * 1024 * 1024     # 2 MB
DATA_UPLOAD_MAX_MEMORY_SIZE = 10 * 1024 * 1024    # 10 MB
MAX_DOC_UPLOAD_SIZE = 150 * 1024 * 1024           # 150 MB
FILE_UPLOAD_TEMP_DIR = '/var/interlegis/sapl/tmp'
```

---

## N+1 fix — `get_etiqueta_protocolos`

`sapl/relatorios/views.py` — previously called `MateriaLegislativa.objects.filter()` inside a loop over protocols. Fixed to **three queries total** regardless of volume (one for protocols, one for materias, one for documentos):

```python
# sapl/relatorios/views.py
from django.db.models import Q


def get_etiqueta_protocolos(prots):
    prot_list = list(prots)
    if not prot_list:
        return []

    # Pre-fetch MateriaLegislativa for all protocols in one query.
    materia_query = Q()
    for p in prot_list:
        materia_query |= Q(numero_protocolo=p.numero, ano=p.ano)
    materias_map = {
        (m.numero_protocolo, m.ano): m
        for m in MateriaLegislativa.objects.filter(
            materia_query).select_related('tipo')
    }

    # Pre-fetch DocumentoAdministrativo for all protocols in one query.
    documentos_map = {
        doc.protocolo_id: doc
        for doc in DocumentoAdministrativo.objects.filter(
            protocolo__in=prot_list).select_related('tipo')
    }

    protocolos = []
    for p in prot_list:
        dic = {}
        dic['titulo'] = str(p.numero) + '/' + str(p.ano)
        # ... timestamp / assunto / interessado / autor fields ...
        materia = materias_map.get((p.numero, p.ano))
        dic['num_materia'] = (
            materia.tipo.sigla + ' ' + str(materia.numero) + '/' + str(materia.ano)
            if materia else ''
        )
        documento = documentos_map.get(p.pk)
        dic['num_documento'] = (
            documento.tipo.sigla + ' ' + str(documento.numero) + '/' + str(documento.ano)
            if documento else ''
        )
        dic['ident_processo'] = dic['num_materia'] or dic['num_documento']
        protocolos.append(dic)
    return protocolos
```

---

## Rate limiting — two layers, two jobs

SAPL enforces rate limits at two independent layers. They use different algorithms and protect different things; their thresholds must be tuned separately.

### Layer 1 — nginx `limit_req` (leaky bucket)

Defined in `docker/config/nginx/nginx.conf` (zones) and `sapl.conf` (burst).

```
sapl_general  rate=30r/m   # 1 token every 2 s
sapl_heavy    rate=10r/m   # 1 token every 6 s (PDF/report endpoints)
```

`burst=N nodelay` means nginx accepts up to N requests instantly above the current token level, then enforces the drip rate. Requests beyond the burst cap return 429 before reaching Gunicorn — **zero Python cost**.

Burst values are set at container startup via env vars:

| Env var | Default | Location |
|---------|---------|----------|
| `NGINX_BURST_GENERAL` | `60` | `location /`, `location /media/` |
| `NGINX_BURST_API` | `60` | `location /api/` |
| `NGINX_BURST_HEAVY` | `20` | `location /relatorios/` |

Defaults are 2× the zone's per-minute rate, so a user can spend a full minute's quota in a single burst before the leaky bucket takes over.

### Layer 2 — Django `RateLimitMiddleware` (sliding window)

Defined in `sapl/middleware/ratelimit.py`, backed by Redis DB 1. Requests that pass nginx reach Python.
The middleware counts them in a 60-second sliding window per IP (anonymous) or per user (authenticated):

| Env var | Default | Scope |
|---------|---------|-------|
| `RATE_LIMITER_RATE` | `35/m` | Anonymous IP |
| `RATE_LIMITER_RATE_AUTHENTICATED` | `120/m` | Authenticated user |
| `RATE_LIMITER_RATE_BOT` | `5/m` | *(reserved — bots are currently blocked outright, not counted)* |
| `RATE_LIMITER_UA_BLOCKLIST_REFRESH` | `60` s | How often each worker re-fetches `rl:bot:ua:blocked` from Redis |

When the window count hits the threshold the IP/user is written to a Redis blocked-set with a 300 s TTL and subsequent requests return 429 with `Retry-After: 300` — without touching the database.

Decision flow inside `RateLimitMiddleware._evaluate()`:

```
1.  IP in whitelist?                                  → pass (no further checks)
1a. UA matches BOT_UA_FRAGMENTS list?                 → 429 reason=known_ua
1b. UA token hash in rl:bot:ua:blocked SET?           → 429 reason=redis_ua
2.  IP in rl:ip:{ip}:blocked?                         → 429 reason=ip_blocked
2b. Path extension in RATE_LIMIT_SCANNER_EXTENSIONS?  → SET blocked, 429 reason=scanner_probe
3.  Authenticated user?
  3a. User in rl:{ns}:user:{uid}:blocked?             → 429 reason=user_blocked
  3b. Suspicious headers (no Accept/AL)?              → 429 reason=suspicious_headers_auth
  3c. User request count ≥ auth threshold?            → SET blocked, 429 reason=auth_user_rate
4.  Anonymous:
  4a. Suspicious headers?                             → 429 reason=suspicious_headers
  4b. IP request count ≥ anon threshold?              → SET blocked, 429 reason=ip_rate
  4c. NS/IP window count ≥ anon threshold?            → SET blocked, 429 reason=ua_rotation
→ pass
```

### Decision flow diagram

```mermaid
flowchart TD
    REQ([Request]) --> C1
    C1{"Known bot UA?"}
    C1 -- "yes — substring in BOT_UA_FRAGMENTS" --> R_UA([429\nknown_ua])
    C1 -- no --> C1B
    C1B{"Redis UA deny list?"}
    C1B -- "yes — token hash in rl:bot:ua:blocked" --> R_RUA([429\nredis_ua])
    C1B -- no --> C2
    C2{"IP blocked?"}
    C2 -- "yes — rl:ip:IP:blocked exists" --> R_IPB([429\nip_blocked])
    C2 -- no --> C2B
    C2B{"Scanner extension?\n.php .asp .aspx …"}
    C2B -- yes --> SIPB["SET rl:ip:IP:blocked TTL 300 s"]
    SIPB --> R_SCN([429\nscanner_probe])
    C2B -- no --> C3
    C3{"Authenticated?"}
    C3 -- yes --> C3A
    C3 -- no --> C4A
    subgraph AUTH ["Authenticated"]
        C3A{"User blocked?"}
        C3A -- "yes — rl:ns:user:UID:blocked" --> R_UB([429\nuser_blocked])
        C3A -- no --> C3B
        C3B{"Suspicious headers?\nno Accept-Language + no Accept"}
        C3B -- yes --> R_SH([429\nsuspicious_headers_auth])
        C3B -- no --> C3C
        C3C{"User rate ≥ 120/min?"}
        C3C -- yes --> SUB["SET rl:ns:user:UID:blocked TTL 300 s"]
        SUB --> R_AUR([429\nauth_user_rate])
        C3C -- no --> PASS_A([✓ pass])
    end
    subgraph ANON ["Anonymous"]
        C4A{"Suspicious headers?\nno Accept-Language + no Accept"}
        C4A -- yes --> R_ASH([429\nsuspicious_headers])
        C4A -- no --> C4B
        C4B{"IP rate ≥ 35/min?"}
        C4B -- yes --> SIPR["SET rl:ip:IP:blocked TTL 300 s"]
        SIPR --> R_IPR([429\nip_rate])
        C4B -- no --> C4C
        C4C{"NS/IP window hit\n≥ 35 in bucket?"}
        C4C -- yes --> SUAR["SET rl:ip:IP:blocked TTL 300 s"]
        SUAR --> R_UAR([429\nua_rotation])
        C4C -- no --> PASS_N([✓ pass])
    end
```

### Enforcement graduation order

Roll out to canary pods first; promote check-by-check in order of false-positive risk:

| Order | Check | Reason | Risk | Condition to promote |
|-------|-------|--------|------|----------------------|
| 1st | `known_ua` | Substring in hardcoded `BOT_UA_FRAGMENTS` list | Zero | UA strings are deterministic |
| 2nd | `redis_ua` | Token hash in `rl:bot:ua:blocked` SET | Zero | Keys only set manually by operators |
| 3rd | `ip_blocked` | Marker set by prior proven-bad requests | Zero | Fast-path only, no new blocks created |
| 4th | `scanner_probe` | Path ext in `RATE_LIMIT_SCANNER_EXTENSIONS` | Zero | Django never legitimately serves `.php`/`.asp`/etc. |
| 5th | `ip_rate` | Rolling IP counter ≥ 35/min | Low | Threshold calibrated from canary logs |
| 6th | `suspicious_headers` | No Accept-Language **and** no Accept | Medium | Confirmed no legitimate clients omit both headers |
| 7th | `ua_rotation` (ns/window) | NS/IP clock-aligned bucket ≥ 35 | Medium | NAT IP whitelist in place (see Open Questions) |

### Decorator migration

For views where `django-ratelimit` decorators already exist:

| Endpoint type | Action | Reason |
|---------------|--------|--------|
| List views (GET) | Remove after middleware stable | Middleware covers equivalent threshold |
| Detail views (GET) | Remove after middleware stable | Middleware covers equivalent threshold |
| Search / filter views | Remove last | Expensive queries — keep stricter per-view limit until traffic data confirms safety |
| PDF / file generation | **Keep permanently** | Most expensive endpoint; per-view limit tighter than global |
| Write endpoints (POST/PUT/DELETE) | **Keep permanently** | Different abuse surface |
| Auth endpoints (login, reset) | **Keep permanently** | Credential stuffing; must be independent of IP rate |

### Why they are not the same number

| | nginx burst | Django threshold |
|-|-------------|------------------|
| **Algorithm** | Leaky bucket — token refills over time | Sliding window — hard count per 60 s |
| **Protects** | Gunicorn workers from being flooded | Per-client fairness, business policy |
| **Tuned by** | Capacity of the server | Acceptable request volume per client |
| **Failure mode** | Workers overwhelmed | Legitimate user over-browsing |

A user loading a page quickly may fire 5–10 Django requests in two seconds.
With `rate=30r/m` (1 token/2 s) and `burst=60` they absorb that fine; the leaky bucket refills before they click the next link. The Django threshold (35/m sliding window) catches sustained automated traffic from a single IP that looks like scraping even if it arrives slowly enough to beat the nginx burst cap.

---

## Request routing — how nginx reaches Django

`proxy_pass http://sapl_server` forwards the HTTP request — with the original path intact — to the Gunicorn Unix socket. Django doesn't know or care that nginx is in front; it sees a standard HTTP request.

```
GET /media/foo.pdf
        │
        ▼
nginx (sapl.conf)
        location /media/ → proxy_pass to Unix socket
        │
        ▼
Gunicorn (WSGI server)
        receives raw HTTP, calls Django WSGI application
        │
        ▼
Django middleware stack (settings.MIDDLEWARE)
        RateLimitMiddleware → pass or 429
        │
        ▼
Django URL router (sapl/urls.py)
        r'^media/(?P<path>.*)$' → serve_media
        │
        ▼
serve_media(request, path='foo.pdf')
        returns HttpResponse with X-Accel-Redirect: /internal/media/foo.pdf
        │
        ▼
nginx sees X-Accel-Redirect header
        /internal/media/ internal location → reads file from disk → sends to client
```

nginx does no routing beyond picking a `location` block. The mapping from URL path to Python function lives entirely in `sapl/urls.py`. `proxy_pass` is just a pipe.

---

## Media file serving — `serve_media` and X-Accel-Redirect

All `/media/` requests (public and private) are routed through Gunicorn so that Django middleware runs on every hit. Nginx serves the file bytes via `X-Accel-Redirect` — the Gunicorn worker is freed as soon as it sends the response headers.

### nginx locations (`docker/config/nginx/sapl.conf`)

```nginx
# Static files — no rate limiting, no proxy; 90-minute browser cache.
location /static/ {
    alias /var/interlegis/sapl/collected_static/;
    expires 90m;
    add_header Cache-Control "public, max-age=5400";
}

# Proxied to Gunicorn — Django middleware + serve_media() run here.
location /media/ {
    limit_req zone=sapl_general burst=${NGINX_BURST_GENERAL} nodelay;
    proxy_pass http://sapl_server;
}

# Internal — only reachable via X-Accel-Redirect, not by external clients.
location /internal/media/ {
    internal;
    alias /var/interlegis/sapl/media/;
    sendfile on;
    etag on;
}
```

Upload endpoints (`/protocoloadm/criar-protocolo`, `/materia/.*upload`, `/norma/.*upload`) no longer have a dedicated `location` block — they fall through to `location /`, which applies the `sapl_general` zone.

### Django view (`sapl/base/media.py`)

`serve_media(request, path)` — registered at `^media/(?P<path>.*)$` in `sapl/urls.py`. Per-request steps:

1. **Path traversal guard** — `os.path.abspath` check; raises 404 on escape.
2. **Auth gate** — `documentos_privados/` paths require an authenticated session; redirects to login otherwise.
3. **Path counter** — increments `rl:{ns}:path:{sha256}:reqs` in Redis DB 1 (TTL = `MEDIA_PATH_COUNTER_TTL`).
4. **Serve** — in DEBUG: `django.views.static.serve` directly. In production: `X-Accel-Redirect: /internal/media/`. Nginx sets `Content-Type` from its own `mime.types`.
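The four steps above can be sketched as a framework-free decision function. This is illustrative only — the real view returns a Django `HttpResponse` and uses the project's Redis client; here `incr` is a hypothetical stand-in for the DB 1 counter, and the return value is a plain `(status, headers)` pair so the logic is testable in isolation:

```python
import hashlib
import os

MEDIA_ROOT = '/var/interlegis/sapl/media'   # from settings.MEDIA_ROOT
MEDIA_PATH_COUNTER_TTL = 60                 # settings.MEDIA_PATH_COUNTER_TTL


def media_response(path, authenticated, incr=lambda key, ttl: None):
    """Sketch of the serve_media decision steps (not the production code)."""
    # 1. Path traversal guard — the resolved path must stay inside MEDIA_ROOT.
    full = os.path.abspath(os.path.join(MEDIA_ROOT, path))
    if not full.startswith(MEDIA_ROOT + os.sep):
        return 404, {}

    # 2. Auth gate — documentos_privados/ requires an authenticated session.
    if path.startswith('documentos_privados/') and not authenticated:
        return 302, {'Location': '/login/?next=/media/' + path}

    # 3. Per-path observability counter in Redis DB 1 (literal {ns} kept as a
    #    placeholder for the tenant namespace).
    digest = hashlib.sha256(path.encode()).hexdigest()
    incr(f'rl:{{ns}}:path:{digest}:reqs', MEDIA_PATH_COUNTER_TTL)

    # 4. Delegate the byte transfer to nginx's internal location; the worker
    #    is released as soon as the headers go out.
    return 200, {'X-Accel-Redirect': '/internal/media/' + path}
```

A `..` path resolves outside `MEDIA_ROOT` and fails the prefix check at step 1, which is why the guard runs before anything touches Redis or the filesystem.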
### Settings

| Setting | Default | Purpose |
|---------|---------|---------|
| `MEDIA_PATH_COUNTER_TTL` | `60` s | TTL for both URL-path and storage-path counters (DB 1) |

### File serving decision matrix

| File type | Size | Strategy |
|-----------|------|----------|
| Logos / images | Any | nginx `alias` + `sendfile` + ETag + `Cache-Control` |
| Small PDFs | ≤ 360 KB | nginx direct + ETag |
| Medium PDFs | 360 KB – 2 MB | nginx direct + ETag + rate limit |
| Large PDFs | > 2 MB | nginx direct + strict rate limit; never Redis |
| LGPD-restricted | Any | Django `serve_media` → `X-Accel-Redirect` → nginx (access control enforced) |
| Public `/media/` | Any | Django `serve_media` → `X-Accel-Redirect` → nginx (middleware runs; path counter written) |

### Why Redis is not needed for PDFs

With the full mitigation stack active:

- **ASN blocking** drops datacenter bot traffic at nginx (zero Python cost)
- **UA blocking** drops known-UA bots at nginx (zero Python cost)
- **Shared Redis rate counters** enforce limits across all pods
- **ETags** convert repeat requests to 304 responses with zero bytes transferred
- **`sendfile on`** means disk reads bypass userspace entirely

Redis PDF caching would solve "high request volume reaching the file layer" — but that problem no longer exists once the above stack is active. For `Brasão - Foz do Iguaçu.png` (392 KB × 14,512 requests = 5.6 GB), a 50% conditional-request hit rate saves ~2.8 GB immediately — without any Redis.
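The savings arithmetic rests on the ETag/304 handshake. A minimal framework-free sketch — Django's `ConditionalGetMiddleware` does the equivalent for full responses (it also uses an MD5-based ETag), and nginx's `etag on` does it for files; the function names here are illustrative:

```python
import hashlib


def make_etag(content):
    # Strong ETag derived from the body bytes, quoted per RFC 9110.
    return '"%s"' % hashlib.md5(content).hexdigest()


def respond(content, if_none_match):
    """Conditional GET: return (status, body_bytes_sent)."""
    etag = make_etag(content)
    if if_none_match == etag:
        return 304, 0            # revalidated — zero body bytes on the wire
    return 200, len(content)


# The Foz do Iguaçu case: 392 KB logo, 14,512 requests.
# At a 50% revalidation rate, ~7,256 requests × 392 KB ≈ 2.8 GB never re-sent.
logo = b'x' * 392 * 1024
etag = make_etag(logo)
```

The first request pays the full 392 KB; every repeat that presents `If-None-Match` with the current ETag costs only headers.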
---

## Key schema reference

| DB | Use case | Key pattern | TTL | Threshold | Constant |
|----|----------|-------------|-----|-----------|----------|
| 0 | Page / view cache | `cache:{ns}:*` | 300 s (default) | — | `CACHES['default']` KEY_PREFIX |
| 0 | Static file cache (logos) | `static:{ns}:{sha256}` | 3 – 24 h | — | *Future* (requires OpenResty/Lua) |
| 0 | File content cache (≤ 360 KB) | `file:{ns}:{sha256}` | 1 h | — | *Future* |
| 1 | IP rate-limit counter | `rl:ip:{ip}:reqs` | 60 s | 35 (`RATE_LIMITER_RATE`) | `RL_IP_REQUESTS` |
| 1 | IP blocked marker | `rl:ip:{ip}:blocked` | 300 s | — | `RL_IP_BLOCKED` |
| 1 | User rate-limit counter | `rl:{ns}:user:{uid}:reqs` | 60 s | 120 (`RATE_LIMITER_RATE_AUTHENTICATED`) | `RL_USER_REQUESTS` |
| 1 | User blocked marker | `rl:{ns}:user:{uid}:blocked` | 300 s | — | `RL_USER_BLOCKED` |
| 1 | Namespace/IP sliding window | `rl:{ns}:ip:{ip}:w:{bucket}` | 120 s | 35 (`RATE_LIMITER_RATE`) | `RL_NS_WINDOW` |
| 1 | Path counter (`/media/`) | `rl:{ns}:path:{sha256}:reqs` | 60 s | — (observability only) | `RL_PATH_REQUESTS` |
| 1 | Path counter (`/static/`) | `rl:{ns}:path:{sha256}:reqs` | 60 s | — | *Future* (requires OpenResty/Lua) |
| 1 | UA deny list | `rl:bot:ua:blocked` | permanent SET | — (block on match) | `RL_UA_BLOCKLIST` |
| 2 | Django Channels | `channels:*` | session TTL | — | *Future* |

### What each counter catches — and misses

**`rl:ip:{ip}:reqs` — global rolling IP counter**

Catches: any sustained anonymous volume from a single IP regardless of namespace, path, or User-Agent — pure request rate.

Misses: a user legitimately accessing several municipality SAPLs simultaneously; their requests accumulate across namespaces into one global count and may trip the threshold even though no individual SAPL is being abused. Also misses a timing-aware scraper that paces exactly 34 req/min: the 60 s TTL resets from the first request, so the attacker can safely send 34, wait for reset, repeat forever.
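The rolling-counter semantics — and the 34-req/min blind spot — can be demonstrated with a small in-memory stand-in for Redis. The production middleware issues the same INCR-plus-EXPIRE-on-first-hit pair against DB 1; the class below is a sketch with an injectable clock so the expiry behaviour is testable:

```python
import time

RATE = 35      # RATE_LIMITER_RATE
WINDOW = 60    # key TTL in seconds


class RollingCounter:
    """In-memory model of rl:ip:{ip}:reqs (INCR + EXPIRE on first hit)."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.store = {}            # key -> (count, expiry)

    def hit(self, ip):
        """Count one request; False means 'write the blocked marker, 429'."""
        key = f'rl:ip:{ip}:reqs'
        now = self.clock()
        count, expiry = self.store.get(key, (0, None))
        if expiry is not None and now >= expiry:
            count, expiry = 0, None        # key expired — window resets
        if expiry is None:
            expiry = now + WINDOW          # EXPIRE is set on the first INCR
        count += 1
        self.store[key] = (count, expiry)
        return count < RATE
```

Because the 60 s window starts at the *first* request, a scraper that sends 34 requests and then idles past the TTL gets a fresh quota — exactly the evasion the namespace-scoped clock-aligned bucket (below in the key table) exists to complicate.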
---

**`rl:ip:{ip}:blocked` — IP short-circuit marker**

Written when `rl:ip:{ip}:reqs` hits the anonymous threshold (step 4b) or when the namespace/IP bucket hits the threshold (step 4c). Checked at step 2 — before any counting — so a blocked IP never increments any counter on subsequent requests.

Catches: saves Redis INCR + EXPIRE calls for every request from an already-blocked IP; the 300 s TTL is a hard cooldown regardless of how many requests arrive.

Misses: the TTL is fixed — a persistent attacker simply waits 300 s and gets another full window quota. Also, because the key is global (no namespace), an IP blocked for one municipal SAPL is blocked for all SAPLs on the same pod — a collateral effect for shared IPs.

---

**`rl:{ns}:ip:{ip}:w:{bucket}` — namespace-scoped clock-aligned bucket**

Catches: sustained scraping against a *specific* municipal SAPL that stays just under the global threshold; a scraper pacing 34 req/min globally across namespaces still accumulates in the per-namespace bucket. Clock alignment (bucket = `time() // 60`) means a burst straddling a minute boundary still counts against the *next* bucket for up to 120 s (2× TTL), making precise timing attacks harder.

Misses: an IP that paces a single namespace at exactly 34 req/min — it never reaches 35 in the bucket either. Cross-namespace legitimate traffic that happens to land within the same clock minute — the same blind spot as `rl:ip:*`, just scoped lower.

**Why this key is namespace-scoped**

Five arguments for `rl:{ns}:ip:{ip}:w:{bucket}` over a global `rl:ip:{ip}:w:{bucket}`:

1. **Matches the observed attack pattern.** The botnet in §Bot Traffic Profile targets one SAPL at a time, not the fleet evenly. A scraper hammering `fortaleza-ce` at 34 req/min has a namespace counter of 34 and a global counter of 34. Without the namespace the two keys are redundant — the window adds no new signal. With it, a scraper that legitimately distributes across 5 SAPLs (7 req/min each, 35 globally) is caught globally but *not* per-SAPL — correct behaviour, since no single SAPL is being abused.
2. **Two counters defeat two different gaming strategies.** `rl:ip:{ip}:reqs` uses a rolling TTL (started by the first INCR). A scraper that knows this can send 34 requests, wait ~61 s for the key to expire, and repeat indefinitely. The clock-aligned window resets at wall-clock minute boundaries. To game *both* simultaneously the attacker must time bursts to expire the rolling key *and* land entirely within one clock window — two independent constraints that are hard to satisfy together.
3. **Without the namespace it duplicates the global counter.** All pods share the same Redis. A global `rl:ip:{ip}:w:{bucket}` would aggregate that IP's traffic from every pod — exactly what `rl:ip:{ip}:reqs` already does, just with different reset timing. Two keys measuring the same dimension is wasted INCR overhead with no added signal.
4. **Multi-SAPL legitimate IPs are not penalised.** Municipal IT departments, ISP shared exit nodes, and Googlebot all produce high global request rates while being individually harmless to any one SAPL. A namespaced window lets them access 10 SAPLs at 3 req/min each without triggering a per-SAPL block, while the global counter still catches them if their total rate is abusive.
5. **Consistent with the established `{ns}` isolation contract.** All user-keyed (`rl:{ns}:user:{uid}:*`) and path-keyed (`rl:{ns}:path:{sha256}:reqs`) entries are namespace-scoped. A global window key would break the invariant that per-tenant data is isolated — complicating key-space inspection, `SCAN`-based dashboards, and future per-tenant rate adjustments.
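Argument 2 can be sketched as follows. The helper names are illustrative, not from the codebase; the point is that the bucket index depends only on wall-clock time, never on when the first request arrived, and that a request must clear *both* windows.

```python
import time


def window_key(ns, ip, now=None):
    """Clock-aligned minute bucket: every request in the same wall-clock
    minute maps to the same key, regardless of when the window 'started'."""
    bucket = int(time.time() if now is None else now) // 60
    return f"rl:{ns}:ip:{ip}:w:{bucket}"


def allowed(rolling_count, bucket_count, limit=35):
    """A request passes only while both windows are under the threshold:
    two independently timed resets an attacker must game at once."""
    return rolling_count < limit and bucket_count < limit
```

Requests at t = 61 s and t = 119 s land in the same bucket (`w:1`), while t = 120 s opens `w:2`; the 120 s TTL keeps the previous bucket alive long enough that a burst straddling the boundary is still visible.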
---

**`rl:{ns}:user:{uid}:reqs` — authenticated user counter**

Catches: an authenticated account being used as a scraping credential — even if the requests come from many different IPs (e.g., a distributed proxy pool), all requests share the same `uid` and accumulate in one counter.

Misses: a credential that is shared across multiple legitimate users in the same office; all their activity adds up to one counter and can trip the 120/min threshold during a busy session.

---

**`rl:{ns}:user:{uid}:blocked` — authenticated user short-circuit marker**

Written when `rl:{ns}:user:{uid}:reqs` hits the authenticated threshold (step 3c). Checked at step 3a — before counting — so a blocked user never increments their counter on subsequent requests during the 300 s cooldown.

Catches: credential-stuffing or runaway automation using a valid session — once the 120/min threshold is hit, the account is locked out immediately for 300 s. Unlike the IP marker, the block is namespace-scoped, so the same user account can be blocked on one SAPL but still active on another.

Misses: the same fixed-TTL weakness as the IP marker — a persistent attacker resumes after 300 s. An account shared by multiple legitimate users (e.g., a departmental login) can be locked out during peak collaborative use.

---

**`rl:{ns}:path:{sha256}:reqs` — per-media-file URL counter**

Currently observability-only (no threshold enforced). Intended for future hot-file detection: a single document being hammered by many IPs would show a spike in this counter even if no individual IP exceeds the IP threshold.

Misses: nothing is blocked today. Once a threshold is added, it will miss distributed access where many IPs each download the file once (legitimate CDN pre-warming or a public-interest event).

---

**`rl:bot:ua:blocked` — runtime UA deny list**

Catches: new bot UA tokens added at runtime via `redis-cli SADD` without a code deploy; picked up within `RATE_LIMITER_UA_BLOCKLIST_REFRESH` seconds (default 60) per worker.
Complements the hardcoded `BOT_UA_FRAGMENTS` Python list.

Misses: bots that rotate UA tokens on every request (no single token accumulates); bots that impersonate a valid browser UA completely (no known fragment to match).

---

## Dynamic page caching

**Goal**: Eliminate ORM queries for anonymous bot requests on list views.

**Prerequisite**: Phase 1 (shared Redis, `CACHE_BACKEND=redis`).

Many SAPL list views (`pesquisar-materia`, `norma`, etc.) are not truly dynamic for anonymous users between edits. A bot hammering `?page=1` through `?page=100` triggers 100 ORM queries per pod. With the Redis page cache, each unique URL is queried once per TTL across the entire fleet.

```python
# Apply to anonymous list views only — AnonCachePageMixin is already
# wired to the materia/sessao detail views.
from django.utils.decorators import method_decorator
from django.views.decorators.cache import cache_page
from django_filters.views import FilterView


@method_decorator(cache_page(60 * 5), name='dispatch')  # 5-minute TTL
class PesquisarMateriaView(FilterView):
    ...
```

> **Safety check**: verify before deploying that no session-aware response ever lands in the shared cache; accidentally caching an authenticated page is a data leak. Note that `cache_page` does not by itself mark responses `Cache-Control: private` — the usual protection for logged-in users is the `Vary: Cookie` header added by the session/CSRF machinery, which keys cache entries per cookie.

### Cache TTL guidelines

| View type | TTL | Reasoning |
|-----------|-----|-----------|
| Matéria list (anonymous) | 300 s | Changes infrequently between sessions |
| Norma list (anonymous) | 300 s | Same |
| Parlamentar list | 3600 s | Changes rarely |
| Search results | 60 s | Query-dependent; a shorter TTL is safer |
| Authenticated views | Never | Must not be cached — see the safety check above |
| PDF generation | Never | Too large — serve from disk via nginx |

---

## HTTP Conditional Requests

Two complementary mechanisms eliminate redundant work for unchanged content.

### `ConditionalGetMiddleware` (all views)

Added to `MIDDLEWARE` in `sapl/settings.py` (after `CommonMiddleware`). For every Django response it:

1. Generates an `ETag` from an MD5 of the response body if none is set.
2. Compares it against the client's `If-None-Match` / `If-Modified-Since`.
3. Returns `304 Not Modified` (no body) on a match.
4. Handles `HEAD` requests by stripping the body and keeping headers.

**Caveat**: the view still executes and renders before the check fires. The saving is bandwidth, not CPU/DB work.

### `@condition` decorator — materia and norma detail views

For `MateriaLegislativaCrud.DetailView` and `NormaCrud.DetailView` a cheap freshness function runs *before* the view body:

```python
# sapl/materia/views.py
from django.utils.decorators import method_decorator
from django.views.decorators.http import condition


def _materia_last_modified(request, *args, **kwargs):
    return MateriaLegislativa.objects.filter(
        pk=kwargs['pk']
    ).values_list('data_ultima_atualizacao', flat=True).first()


def _materia_etag(request, *args, **kwargs):
    ts = _materia_last_modified(request, *args, **kwargs)
    return f'{kwargs["pk"]}-{ts.timestamp()}' if ts else None


@method_decorator(condition(etag_func=_materia_etag,
                            last_modified_func=_materia_last_modified),
                  name='get')
class DetailView(AnonCachePageMixin, Crud.DetailView):
    ...
```

`NormaCrud.DetailView` follows the same pattern with `_norma_last_modified` / `_norma_etag` querying `NormaJuridica.ultima_edicao`.

**On a cache hit**: one `VALUES` query fires and Django returns `304` — the view body, template render, and remaining ORM work are all skipped.

**Signal used**: `data_ultima_atualizacao` (`auto_now=True`) — updated by Django on every `save()`, so the ETag is invalidated automatically whenever the record changes.

---

## Open Questions

| # | Question | Status | Blocks |
|---|----------|--------|--------|
| 1 | Does the Chrome/98.0.4758 impersonator appear consistently in nginx access logs? | Needs investigation | UA block safety |
| 2 | Which legislative house IPs can be pre-whitelisted in `RATE_LIMIT_WHITELIST_IPS`? | No list yet — obtain in the future. Setting is **optional / future**. | Enforcement safety for NAT users |
| 3 | `CONN_MAX_AGE` tuning | Currently **300 s** (`sapl/settings.py`). Evaluate whether to reduce given worker recycling at 400 MB. | Gunicorn tuning |
| 4 | WebSocket voting panel priority | Separate project. Resumes after Redis is on k8s, the bot siege is addressed, and OOM pressure is reduced. | Phase 5 sequencing |
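Returning to the `rl:bot:ua:blocked` deny list described in the key-schema section: the fragment matching it implies can be sketched as below. The function name `ua_is_blocked` is illustrative; the assumption is that the middleware merges the hardcoded `BOT_UA_FRAGMENTS` list with the members of the Redis SET (refreshed every `RATE_LIMITER_UA_BLOCKLIST_REFRESH` seconds) and tests the request's User-Agent against the union.

```python
def ua_is_blocked(user_agent, fragments):
    """Case-insensitive substring match against the merged deny list
    (hardcoded BOT_UA_FRAGMENTS plus the runtime rl:bot:ua:blocked SET)."""
    ua = (user_agent or "").lower()
    return any(frag.lower() in ua for frag in fragments)
```

Substring matching is what gives this key its stated blind spots: a bot that rotates its UA token on every request never matches an accumulated fragment, and a bot that impersonates a real browser UA verbatim contains no known fragment at all.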