@@ -786,6 +786,231 @@ are env-var configurable at container start via `start.sh` defaults.

---

## Rate Limiting — Architecture Diagrams

### NAT Thundering Herd — Before the Fix

During a live vote, all councilmembers reload simultaneously. Behind NAT,
nginx sees a single IP, exhausts that IP's bucket, and returns 429 before
the request ever reaches Django. As a result, Django's per-user counter
(NAT-safe) is never consulted.

```
Office / Chamber — behind one NAT IP
┌──────────────────────────────────────────────────────┐
│  Councilmember A  browser reload ──┐                 │
│  Councilmember B  browser reload ──┤                 │
│  Councilmember C  browser reload ──┤  ~24 req/s      │
│  Staff tab 1      browser reload ──┤  same public IP │
│  Staff tab 2      browser reload ──┘                 │
└────────────────────────────┬─────────────────────────┘
                             │  all requests look identical to nginx
                             ▼
          ┌─────────────────────────────────────┐
          │  nginx  sapl_general                │
          │  rate=30r/m  burst=60  nodelay      │
          │                                     │
          │  token bucket: 0 tokens remaining   │
          │  → 429 returned immediately         │
          └──────────────────┬──────────────────┘
                             │
                             ╳  Django never reached
                             ╳  rl:ip:{ip}:reqs never incremented
                             ╳  rl:user:{uid}:reqs never consulted
                             │
                             ▼
              429 for all N users in the org
              recovery: nginx bucket refill (~3–10 min)
              NOT a Django 300s block — Redis never written
```
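
The exhaustion shown above can be reproduced with a small model. The sketch below is a simplified token bucket, not nginx's actual `limit_req` implementation; the class and function names are hypothetical, and only the `rate=30r/m`, `burst=60`, and `~24 req/s` figures come from the diagram. It uses exact fractions so the counts do not depend on floating-point rounding.

```python
from fractions import Fraction


class TokenBucket:
    """Simplified token bucket: one instance models one client IP."""

    def __init__(self, rate_per_min, burst):
        self.rate = Fraction(rate_per_min, 60)  # tokens added per second
        self.capacity = Fraction(burst)
        self.tokens = Fraction(burst)           # the bucket starts full
        self.last = Fraction(0)

    def allow(self, now):
        # Refill for the elapsed time, never above capacity.
        refill = (now - self.last) * self.rate
        self.tokens = min(self.capacity, self.tokens + refill)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # this is where nginx would answer 429


def simulate(req_per_sec, seconds, rate_per_min=30, burst=60):
    """Send evenly spaced requests from one IP; return (allowed, rejected)."""
    bucket = TokenBucket(rate_per_min, burst)
    total = req_per_sec * seconds
    allowed = sum(bucket.allow(Fraction(i, req_per_sec)) for i in range(total))
    return allowed, total - allowed


allowed, rejected = simulate(req_per_sec=24, seconds=10)
print(f"allowed={allowed} rejected={rejected}")
```

In this model, ten seconds of shared-IP reloads admit 64 of the 240 requests. The rest are rejected at the proxy, regardless of which user sent them.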

---

### NAT Thundering Herd — After the Session Bypass Fix

```
Office / Chamber — behind one NAT IP
┌──────────────────────────────────────────────────────┐
│  Councilmember A  /voto-individual/ reload   ──┐     │
│  Councilmember B  /voto-individual/ reload   ──┤     │
│  Councilmember C  /sessao/2600/ordemdia ───────┤     │
│  Staff tab        /sessao/pauta-sessao/2600/ ──┘     │
└────────────────────────────┬─────────────────────────┘
                             ▼
          ┌─────────────────────────────────────┐
          │  nginx                              │
          │                                     │
          │  location ~ ^/voto-individual/  ─┐  │
          │  location ~ ^/sessao/\d+        ─┤  │  no limit_req
          │  location ~ ^/painel/\d+/dados  ─┘  │  pass through
          └──────────────────┬──────────────────┘
                             ▼
          ┌─────────────────────────────────────┐
          │  Django RateLimitMiddleware         │
          │  RATE_LIMIT_BYPASS_PATHS match?     │
          │  → yes: return get_response()       │
          └──────────────────┬──────────────────┘
                             ▼
                      ✓  View served
```
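
A minimal sketch of the Django-side bypass, assuming `RATE_LIMIT_BYPASS_PATHS` is a list of compiled regexes that mirror the three nginx locations in the diagram. The real setting's shape is not shown here, and the `evaluate` hook is a hypothetical stand-in for the counter logic.

```python
import re
from types import SimpleNamespace

# Assumption: patterns copied from the nginx locations in the diagram.
RATE_LIMIT_BYPASS_PATHS = [
    re.compile(r"^/voto-individual/"),
    re.compile(r"^/sessao/\d+"),
    re.compile(r"^/painel/\d+/dados"),
]


class RateLimitMiddleware:
    """Django-style middleware: callable built around get_response."""

    def __init__(self, get_response, evaluate=None):
        self.get_response = get_response
        # Hypothetical hook: returns a 429 response, or None to let it pass.
        self.evaluate = evaluate or (lambda request: None)

    def __call__(self, request):
        # Bypass paths skip every counter and go straight to the view.
        if any(rx.match(request.path) for rx in RATE_LIMIT_BYPASS_PATHS):
            return self.get_response(request)
        blocked = self.evaluate(request)
        if blocked is not None:
            return blocked
        return self.get_response(request)


# Usage with stand-in request objects; every non-bypass request is "limited".
mw = RateLimitMiddleware(lambda r: "view", evaluate=lambda r: "429")
for path in ("/voto-individual/", "/sessao/2600/ordemdia", "/materia/1/"):
    print(path, "->", mw(SimpleNamespace(path=path)))
```

Under these assumed patterns, `/sessao/pauta-sessao/2600/` does not match `^/sessao/\d+`, because the segment after `/sessao/` is not numeric. A path of that shape would need its own pattern at both layers.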

---

### nginx Zone Architecture — Before vs After

**Before** — all traffic sharing one bucket per IP:

```
/media/page.pdf ──┐
/materia/123/  ───┤──► sapl_general  rate=30r/m  burst=60
/api/materia/? ───┘

Problem: 20 media attachments on a page burn 20 tokens
         from the same budget as the HTML page load
```

**After** — four independent buckets:

```
location /              ──► sapl_general  rate=90r/m   burst=180
location /media/        ──► sapl_media    rate=180r/m  burst=180
location /api/          ──► sapl_api      rate=60r/m   burst=120
location /relatorios/   ──► sapl_heavy    rate=10r/m   burst=20   (nodelay)
location /sessao/\d+    ──► (no zone)     exempt
location /voto-indiv..  ──► (no zone)     exempt
location /static/       ──► (no zone)     disk-served, no Django
```
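
To check which bucket a given URL draws from, the routing above can be modelled in a few lines. This is a simplified sketch of nginx location selection (regex locations win over prefix locations, then the longest matching prefix applies). It ignores `=` and `^~` modifiers, and the dictionary and function names are hypothetical.

```python
import re

# Regex locations from the diagram that carry no limit_req zone.
EXEMPT_REGEXES = [
    re.compile(r"^/sessao/\d+"),
    re.compile(r"^/voto-individual/"),
]

# Prefix locations and the zone each one applies (None = no zone).
PREFIX_ZONES = {
    "/": "sapl_general",
    "/media/": "sapl_media",
    "/api/": "sapl_api",
    "/relatorios/": "sapl_heavy",
    "/static/": None,
}


def zone_for(path):
    """Return the zone name a path is counted against, or None if exempt."""
    if any(rx.match(path) for rx in EXEMPT_REGEXES):
        return None
    # Longest matching prefix wins; "/" matches every absolute path.
    best = max((p for p in PREFIX_ZONES if path.startswith(p)),
               key=len, default="/")
    return PREFIX_ZONES[best]


for path in ("/media/page.pdf", "/materia/123/", "/api/materia/"):
    print(path, "->", zone_for(path))
```

The three paths that shared `sapl_general` in the "before" diagram now land in three different zones, so media attachments no longer spend the page-load budget.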

---

### Anonymous /api/ NAT Problem — Before vs After

**Before** — anonymous API hits polluted the global IP counter:

```
10 staff, JS polling /api/  →  120 req/min from NAT IP
        │
        ▼
Django _evaluate_anonymous
  INCR rl:ip:{ip}:reqs  →  120 ≥ threshold
  SET  rl:ip:{ip}:blocked EX 300   ◄── global block
        │
        ▼
Next GET /materia/  →  429 ip_blocked
Next GET /sessao/   →  429 ip_blocked
Entire org locked out of ALL paths for 300s
```

**After** — anonymous API skips the IP counter entirely:

```
10 staff, JS polling /api/  →  120 req/min from NAT IP
        │
        ▼
nginx sapl_api  rate=60r/m  burst=120
  (throttles sustained traffic)
        │
        ▼
Django quota check: 500/day not exceeded → pass
Anonymous /api/: early return, no _evaluate()
  rl:ip:{ip}:reqs     NOT incremented
  rl:ip:{ip}:blocked  NOT written
        │
        ▼
Page requests from same IP: unaffected ✓
Worst case: 500 API req/day quota exhausted
  → only API access blocked, pages still work
```
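
The "quota only" path can be sketched as follows. This assumes the daily quota is tracked per IP with a single counter per day; the key name `rl:quota:...` and the `FakeStore` class are hypothetical, and an in-memory dict stands in for Redis. The point is what the function does not touch: neither `rl:ip:{ip}:reqs` nor `rl:ip:{ip}:blocked`.

```python
import datetime

API_DAILY_QUOTA = 500  # from the diagram: 500 requests per day


class FakeStore:
    """In-memory stand-in for Redis, just enough for INCR."""

    def __init__(self):
        self.data = {}

    def incr(self, key):
        self.data[key] = self.data.get(key, 0) + 1
        return self.data[key]


def allow_anonymous_api(store, ip, today):
    """Return True while the caller is within the daily quota.

    Only the quota key is written; the IP request counter and the
    IP block key are left alone.
    """
    quota_key = f"rl:quota:{ip}:{today.isoformat()}"  # hypothetical key name
    return store.incr(quota_key) <= API_DAILY_QUOTA


store = FakeStore()
day = datetime.date(2026, 5, 6)
results = [allow_anonymous_api(store, "203.0.113.7", day) for _ in range(501)]
print(results.count(True), "allowed,", results.count(False), "over quota")
```

Because no block key is ever set, exhausting the quota only affects later `/api/` calls from that IP, and page requests follow their own path through the middleware.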

---

### Authenticated Rate Breach — Before vs After

```
BEFORE                              AFTER
──────────────────────────────────  ──────────────────────────────────
User clicks fast: 241 req in 60s    User clicks fast: 241 req in 60s
        │                                   │
        ▼                                   ▼
count ≥ 240 (auth threshold)        count ≥ 240 (auth threshold)
        │                                   │
        ▼                                   ▼
SET rl:user:{uid}:blocked EX 300    return 429 for this request only
ZADD rl:index:blocked_users         (no SET, no ZADD)
        │                                   │
        ▼                                   ▼
All requests for 300s → 429         T+60s: counter key expires
User locked out for 5 minutes       User recovers automatically
No self-recovery possible           No admin intervention needed
```
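
The "after" column amounts to a fixed-window counter with no block key. Here is a minimal sketch, assuming the window starts at the first request and lasts 60 seconds. `FakeStore` and `evaluate_authenticated` are hypothetical names, and an in-memory dict with explicit timestamps stands in for Redis `INCR` plus `EXPIRE`.

```python
AUTH_THRESHOLD = 240   # from the diagram
WINDOW_SECONDS = 60


class FakeStore:
    """In-memory stand-in for a Redis counter with a TTL."""

    def __init__(self):
        self.values = {}  # key -> (count, expires_at)

    def incr_with_ttl(self, key, now, ttl):
        count, expires_at = self.values.get(key, (0, None))
        if expires_at is None or now >= expires_at:
            # Key missing or expired: a new window starts with this hit.
            count, expires_at = 0, now + ttl
        count += 1
        self.values[key] = (count, expires_at)
        return count


def evaluate_authenticated(store, uid, now):
    """Return 429 for this request only; never write a block key."""
    count = store.incr_with_ttl(f"rl:user:{uid}:reqs", now, WINDOW_SECONDS)
    return 429 if count >= AUTH_THRESHOLD else 200


store = FakeStore()
statuses = [evaluate_authenticated(store, uid=7, now=i * 0.1) for i in range(241)]
print(statuses.count(429), "rejected in the burst")
print(evaluate_authenticated(store, uid=7, now=60.0), "once the window expires")
```

Only the requests at or above the threshold are rejected. Once the counter key expires, the same user is served again without any admin action.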

---

### Enforcement Stack Per Path — Trade-off Summary

```
Path                   nginx zone         Django          Block key?  Notes
─────────────────────  ─────────────────  ──────────────  ──────────  ──────────────────────
/static/*              none               none            —           disk-served
/painel/<pk>/dados     none (bypass)      none (bypass)   —           high-freq polling
/voto-individual/*     none (bypass)      none (bypass)   —           live vote
/sessao/<pk>/*         none (bypass)      none (bypass)   —           live session
/media/*               sapl_media         anon counter    anon: yes   auth gate in serve_media
                       180r/m b=180       runs            auth: no
/api/* (anonymous)     sapl_api           quota only      no          ← no IP counter; no
                       60r/m b=120        500/day                     collateral NAT block
/api/* (auth)          sapl_api           per-user 240/m  no (soft)   per-uid, NAT-safe
                       60r/m b=120        counter runs
/relatorios/*          sapl_heavy         counter runs    anon: yes   tight — PDF generation
                       10r/m b=20
/* (everything else)   sapl_general       counter runs    anon: yes   normal browsing
                       90r/m b=180                        auth: no    auth: 429, resets in 60s
```

`anon: yes` — anonymous IP gets a 300s block key on breach (all paths locked)
`auth: no` — authenticated users get 429 for that request; counter expires in 60s

---

### The Fundamental NAT Constraint

```
IP-based rate limiting cannot distinguish these two scenarios:

Legitimate (15 users, vote opens simultaneously)
┌───────────────────────────────────────────────┐
│  User 1   ──► GET /voto-individual/           │
│  User 2   ──► GET /voto-individual/  15 req/s │
│  ...                                 1 IP     │
│  User 15  ──► GET /sessao/2600/               │
└───────────────────────────────────────────────┘

Bot (1 process, 15 threads, scraping)
┌───────────────────────────────────────────────┐
│  Thread 1   ──► GET /materia/1/               │
│  Thread 2   ──► GET /materia/2/      15 req/s │
│  ...                                 1 IP     │
│  Thread 15  ──► GET /materia/15/              │
└───────────────────────────────────────────────┘

To nginx and an IP counter: identical.

Mitigations applied
┌──────────────────────────────────────────────────────────────────┐
│  Known safe high-freq paths  →  bypass at both layers            │
│  Authenticated users         →  per-user counter (uid), NAT-safe │
│  Anonymous /api/             →  quota only, no IP counter        │
│  Everything else (anon)      →  IP counter + 300s block          │
└──────────────────────────────────────────────────────────────────┘

Long-term
┌──────────────────────────────────────────────────────────────────┐
│  APP_ACCESS_KEYs per tenant  →  quota per org, not per IP        │
│  WebSocket push for voting   →  eliminates polling bursts        │
└──────────────────────────────────────────────────────────────────┘
```
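
The four mitigations in the box above form a decision order on the Django side. The sketch below encodes that order only. The function name, the returned labels, and the bypass patterns are illustrative assumptions, not the project's actual code.

```python
import re
from typing import Optional

# Assumption: bypass patterns mirror the nginx locations shown earlier.
BYPASS_PATTERNS = [
    re.compile(r"^/voto-individual/"),
    re.compile(r"^/sessao/\d+"),
    re.compile(r"^/painel/\d+/dados"),
]


def enforcement_policy(path: str, user_id: Optional[int]) -> str:
    """Pick the enforcement applied to a request, most specific rule first."""
    if any(rx.match(path) for rx in BYPASS_PATTERNS):
        return "bypass"                      # known safe high-frequency paths
    if user_id is not None:
        return "per-user counter"            # keyed by uid, so NAT-safe
    if path.startswith("/api/"):
        return "quota only"                  # anonymous API: no IP counter
    return "ip counter + 300s block"         # everything else, anonymous


for path, uid in [("/voto-individual/", None), ("/materia/1/", 42),
                  ("/api/materia/", None), ("/materia/1/", None)]:
    print(path, uid, "->", enforcement_policy(path, uid))
```

Only the last branch still depends on the client IP, which is where the legitimate-versus-bot ambiguity from the diagram remains.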

---

## Session/voting bypass (2026-05-06)

### Problem