ngx_http_lua_module depends on the nginx development kit (NDK) for the
ndk_set_var_value symbol. Install libnginx-mod-http-ndk and load
ndk_http_module.so before ngx_http_lua_module.so.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
libnginx-mod-http-lua is a dynamic module in Debian nginx; it must be
explicitly loaded with load_module or Lua directives are unrecognized.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
lua-resty-redis is an OpenResty library not packaged in Debian repos.
Download resty_redis.lua from upstream and install it to /usr/lib/lua/resty/redis.lua
at image build time. Update lua_package_path to match.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
OpenResty has no arm64 packages for Debian Bookworm; the official repo
only publishes amd64. Switch to Debian's own packages which support both
architectures and avoid any external repo setup:
nginx libnginx-mod-http-geoip2 libnginx-mod-http-lua lua-resty-redis
The GeoIP2 C module returns (same nginx version, so it is compatible). ASN/UA blocking
goes back to nginx if() blocks. blocklist.lua handles only the Redis checks
(prefix shared dict + pipelined GET for global/API block keys).
lua_package_path set to /usr/share/lua/5.1/ where Debian installs resty.*.
All paths revert to /etc/nginx/; start.sh reverts to /usr/sbin/nginx.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
apt 2.6 (Bookworm) accepts ASCII-armored keys via signed-by directly
when the file has a .asc extension. This removes the gpg pipeline whose
silent failure (no pipefail in /bin/sh) was producing an empty keyring.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The echo "..." \ <newline> > file pattern has the backslash consumed
by the Dockerfile parser, leaving a bare redirect that /bin/sh may not
handle reliably. Switch to pipe through tee for both writes.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace nginx + libnginx-mod-http-geoip2 with OpenResty so that blocked
IPs are rejected before reaching Gunicorn, saving worker CPU on DDoS.
nginx layer (read-only, DB 1):
- blocklist.lua: UA check (nginx map var), ASN check (lua-resty-maxminddb,
replaces geoip2 C module), IP-prefix check (shared dict refreshed every
60s, 4-candidate O(1) lookup), pipelined GET for global IP block and
per-tenant API block. Parses REDIS_URL. Fail-open on Redis error.
- lua_shared_dict ip_prefix_blocked 1m: in-process prefix cache.
- init_by_lua_block: opens MaxMind ASN DB once in master process.
- init_worker_by_lua_block: refreshes prefix SET from Redis every 60s.
Django (ratelimit.py):
- _refresh_ip_prefix_blocklist: normalises entries to trailing-dot form
on load so per-request checks are O(1) set membership, not iteration.
- _is_ip_prefix_blocked: 4-candidate check (p1., p1.p2., p1.p2.p3., ip)
against the local set; same 60s refresh cadence as before.
Capacity (1,200 tenants, single Redis):
- Django pool: max_connections 6 → 3 (7,200 peak connections).
- nginx keepalive pool: 1 connection/worker (4,800 peak connections).
- Total: ~12,200 connections — 39% headroom under maxclients 20,000.
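
The 4-candidate prefix check described above can be sketched as follows. This is a minimal illustration, not the code in ratelimit.py: the function names are hypothetical, and keeping a full four-octet entry unchanged (so it matches the final "ip" candidate) is an assumption.

```python
def normalise_prefix(entry):
    """Return an operator entry in trailing-dot form, e.g. '103.124.225' -> '103.124.225.'."""
    entry = entry.strip()
    # Assumption: a full four-octet address is stored unchanged so that it
    # matches the final "ip" candidate exactly.
    if entry.count(".") >= 3 and not entry.endswith("."):
        return entry
    return entry if entry.endswith(".") else entry + "."


def prefix_candidates(ip):
    """Build the four lookup candidates: p1., p1.p2., p1.p2.p3., ip."""
    parts = ip.split(".")
    return [".".join(parts[:n]) + "." for n in (1, 2, 3)] + [ip]


def is_ip_prefix_blocked(ip, blocked):
    """Four set-membership tests, independent of how many prefixes are blocked."""
    return any(candidate in blocked for candidate in prefix_candidates(ip))
```

The trailing dot is what keeps "103.124.22" from matching an entry for "103.124.225": the candidates are compared as whole strings, never as substrings.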
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously: one STRING key per (ns, date, reason) — 1,200 tenants × 10
reasons × 8-day TTL = 96k live keys, each increment via Lua eval.
Now: one HASH key per (ns, date), field = reason. Reduces live key count
10× to 9,600. Uses _hincrby_with_ttl (already exists for API quota) so
no new Lua script is needed. HGETALL returns all reasons for a tenant in
one round-trip instead of requiring SCAN + GET.
No cross-tenant contention existed before (keys are per-ns); this change
reduces per-key overhead and makes monitoring simpler.
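
The key-count arithmetic and the hash layout can be modelled in a few lines. This sketch uses a plain dict as a stand-in for Redis; the hash key name below is hypothetical (the commit does not spell it out), and the helper names are illustrative only.

```python
from collections import defaultdict


def live_keys_string_layout(tenants, reasons, retention_days):
    """One STRING key per (ns, date, reason)."""
    return tenants * reasons * retention_days


def live_keys_hash_layout(tenants, retention_days):
    """One HASH key per (ns, date); reasons become fields."""
    return tenants * retention_days


# In-memory stand-in for Redis hashes: key -> {field: count}.
store = defaultdict(lambda: defaultdict(int))


def record_block(ns, date, reason):
    key = "rl:metrics:{}:{}:blocked".format(ns, date)  # hypothetical key name
    store[key][reason] += 1  # models HINCRBY key reason 1
    return key


def reasons_for(ns, date):
    # Models HGETALL: every reason for one tenant and day in a single read.
    return dict(store["rl:metrics:{}:{}:blocked".format(ns, date)])
```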
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Caches anonymous responses for 120 s (PAGE_CACHE_TTL_LIST), bypasses
cache for authenticated users so CSRF tokens and edit controls stay fresh.
Migrates PesquisarSessaoPlenariaView from cache_page (all-users) to
AnonCachePageMixin for consistency.
Views covered: PesquisarParlamentarView, PesquisarColigacaoView,
PesquisarPartidoView, PesquisarStatusTramitacaoView,
MateriaLegislativaPesquisaView, PesquisarAssuntoNormaView,
NormaPesquisaView, PesquisarSessaoPlenariaView.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Coordinated DDoS attack used repeated page= params
(?page=3&page=2&page=3&...) as a scraping fingerprint, likely
harvested from SAPL's own paginated search links which had a
long-standing bug producing those polluted URLs.
Reject duplicate page= at the middleware layer:
RateLimitMiddleware.__call__ returns 400 (param_pollution) if
request.GET.getlist('page') has more than one value — before any
Redis or DB work runs, covering all paths universally.
PesquisarSessaoPlenariaView.get has the same check as a backstop.
Fix the root cause — page= leaking into filter_url on 9 search views:
All affected views built filter_url from the raw QUERY_STRING and
guarded with startswith("&page"), which only strips page= when it
is the first param. With ?filter=X&page=2 the page= leaked through
and paginacao.html produced ?page=N&filter=X&page=2 on every link.
Replaced with qr = request.GET.copy(); qr.pop('page', None).
Views fixed: PesquisarStatusTramitacaoView, PesquisarAssuntoNormaView,
PesquisarAuditLogView, PesquisarParlamentarView, PesquisarColigacaoView,
PesquisarPartidoView, ProtocoloPesquisaView,
PesquisarDocumentoAdministrativoView, PesquisarSessaoPlenariaView.
Cache anonymous GET on PesquisarSessaoPlenariaView (2 min TTL) to
reduce ORM load from repeated identical queries.
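
Both halves of the fix can be illustrated without Django, using urllib.parse in place of QueryDict. This is a sketch under assumptions: the function names are hypothetical, and the leading "&" on filter_url is inferred from the ?page=N&filter=X links mentioned above.

```python
from urllib.parse import parse_qs, parse_qsl, urlencode


def has_duplicate_page(query_string):
    """True when page= appears more than once (the param_pollution case)."""
    return len(parse_qs(query_string).get("page", [])) > 1


def build_filter_url(query_string):
    """Drop every page= pair regardless of position, then re-encode the rest."""
    pairs = [
        (key, value)
        for key, value in parse_qsl(query_string, keep_blank_values=True)
        if key != "page"
    ]
    # Assumption: the template appends filter_url after "?page=N".
    return "&" + urlencode(pairs) if pairs else ""
```

Filtering on the parsed key, rather than testing how the raw string starts, is what makes the result independent of parameter order.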
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
A permanent operator-curated deny decision (rl:ip_prefix:blocked) is not a
transient rate limit, so it should surface as 403 Forbidden with no
Retry-After rather than 429. Also brings RATE-LIMITER-PLAN.md up to date
with the IP-prefix blocklist feature and the same-origin bypass fix,
including redis-cli commands to populate/remove/list the deny-list set.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
_handle_api checked _is_same_origin() before any of the IP-prefix, global
(rl:ip:<ip>:blocked), and API-specific (rl:api:ip:<ip>:blocked) block keys —
short-circuiting straight to get_response() on a match. Since Origin/Referer
are entirely client-controlled and trivially spoofable, any caller could
defeat every /api/ block and counter (including an operator-set global block)
by simply sending Origin: https://<sapl-host>.
Reorder the checks so IP-prefix/global/API block lookups always run first;
the same-origin bypass now only exempts legitimate same-origin polling from
quota and per-minute rate-limit accounting, never from an active block.
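
The reordered decision chain can be summarised as a pure function. This is only a sketch of the ordering: the boolean inputs and the returned labels are hypothetical stand-ins for the real lookups and responses in _handle_api.

```python
def handle_api_decision(ip_prefix_blocked, global_blocked, api_blocked,
                        same_origin, over_limit):
    """Return an outcome label; block lookups always run before the bypass."""
    if ip_prefix_blocked:
        return "deny:ip_prefix"
    if global_blocked:
        return "deny:global_block"
    if api_blocked:
        return "deny:api_block"
    # Origin/Referer are client-controlled, so a match only skips quota and
    # per-minute accounting; it can no longer skip an active block.
    if same_origin:
        return "pass:same_origin"
    if over_limit:
        return "deny:rate_or_quota"
    return "pass"
```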
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Operators can now SADD/SREM dotted-decimal prefixes (e.g. "103.124.225")
into rl:ip_prefix:blocked to block entire ranges of abusive traffic
network-wide, mirroring the existing runtime UA deny-list pattern
(periodic SMEMBERS refresh into an in-process cache). Checked first in
both _handle_api and _evaluate, ahead of all auth-aware exemptions, so
it applies universally to anonymous and authenticated traffic alike.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- RelatorioMateriasTramitacao: strip UI-only params before deciding whether
to run a query, short-circuit RelatorioMixin.get() on permission checks,
remove redundant per-view @ratelimit decorators (RateLimitMiddleware
already covers them with stricter checks), and cache get_report_urls_map()
with lru_cache
- RelatorioMateriasTramitacaoView: rewrite the materia_materiaemtramitacao
view (migration 0088) to use DISTINCT ON instead of a correlated subquery,
add a composite index on materia_tramitacao(materia_id, id DESC), and drop
the now-redundant .distinct() from the filterset queryset
- customize_link_materia (MateriaOrdemDiaCrud / ExpedienteMateriaCrud):
eliminate per-row N+1 queries via select_related/prefetch_related with
to_attr caches, resolve sessao_plenaria once per page, and use a flat
paginate_by=100 instead of the count()-dependent 50/None toggle
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add RATE_LIMITER_INDEX_SHARDS setting (default 3): each blocked-IP write
routes to rl:index:blocked_ips:{shard} via md5(ip) % N, distributing
write contention across N keys.
- _BLOCK_LUA now runs ZREMRANGEBYSCORE before ZADD, pruning expired entries
from the target shard inline. Each shard stays bounded to active-only
members; no separate maintenance job needed.
- _index_shard(ip, index_base) computes the sharded key; all four _set_block
call sites updated.
- Fix 5 pre-existing test failures: suspicious-headers tests needed
HTTP_USER_AGENT removed; auth_user_rate block assertion corrected (no
persistent block key by design); ip_rate / ua_rotation tests now mock
_set_block directly instead of checking mock_cache.set.
- Update RATE-LIMITER-PLAN.md: key schema table, Redis CLI examples, and
ZSET index description reflect sharded keys and inline pruning.
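
A rough sketch of the shard routing and the prune-then-add step follows. It is not the project's code: how the md5 digest is reduced to an integer, the function signatures, and the "expiry <= now" pruning bound are all assumptions, and a dict stands in for one Redis ZSET shard.

```python
import hashlib


def index_shard(ip, index_base="rl:index:blocked_ips", shards=3):
    """Route an IP to one of N index keys via md5(ip) % N."""
    digest = hashlib.md5(ip.encode("utf-8")).hexdigest()
    return "{}:{}".format(index_base, int(digest, 16) % shards)


def set_block(shard, block_key, now, ttl):
    """In-memory model of one shard: member -> expiry timestamp (the ZSET score)."""
    # Models ZREMRANGEBYSCORE: drop members whose expiry has already passed.
    for member in [m for m, expiry in shard.items() if expiry <= now]:
        del shard[member]
    # Models ZADD: index the new block key under its expiry time.
    shard[block_key] = now + ttl
```

Because every write prunes its own shard first, each shard holds only unexpired members as of its most recent write, which is why no separate maintenance job is required.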
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
RATE_LIMIT_WHITELIST_IPS setting, self.whitelist bypass checks in
RateLimitMiddleware, and the corresponding test are removed. The
bypass_paths mechanism covers the legitimate high-frequency paths
(/painel, /sessao, /voto-individual); no IP-level exemption is needed.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
RL_API_IP_REQUESTS and RL_API_IP_BLOCKED now include {ns} so a block
in one k8s pod namespace does not leak into other tenants sharing the
same Redis instance. RL_INDEX_API_BLOCKED_IPS remains namespace-free
(operational index; members carry the namespace in their key string).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
At 35 req/min the old 500/day cap fired in ~14 min, making it
redundant with the per-minute block. The new values target slow-drip
scrapers (10–20 req/min sustained all day) while leaving legitimate
integrations (< 500/day) well within budget.
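
The arithmetic behind these numbers, as a small sketch (the helper names are illustrative):

```python
def minutes_until_cap(daily_cap, requests_per_minute):
    """How long a caller at a steady rate takes to exhaust a daily cap."""
    return daily_cap / requests_per_minute


def requests_per_day(requests_per_minute):
    """Total requests from a caller sustaining a rate for 24 hours."""
    return requests_per_minute * 60 * 24
```

At 35 req/min a 500/day cap is exhausted in about 14.3 minutes, while a slow-drip scraper at 10 to 20 req/min accumulates 14,400 to 28,800 requests per day.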
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Forces script/integration owners toward sane polling intervals.
35/min is still well above any legitimate use case (a live session
panel at 10 s intervals needs only 6/min). Threshold remains
env-configurable (API_RATE_LIMIT_THRESHOLD) for future adjustment.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Authenticating must not bypass /api/ rate controls. The per-minute
threshold (60/min) and daily/weekly quota (500/day · 3 500/week) now
apply to all callers keyed by IP — auth status is not checked.
Auth users no longer fall through to _evaluate for /api/ requests;
_evaluate (240/min per-user) still governs all non-/api/ paths.
QUOTA_USER_DAILY/WEEKLY key templates removed (no longer written).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Auth users now share the same 500/day · 3 500/week cap as anonymous
callers. Auth quota is keyed by user pk (NAT-safe); anon quota by IP.
Both limits come from a single pair of settings (API_QUOTA_DAILY /
API_QUOTA_WEEKLY), replacing the anon-only API_QUOTA_ANON_* names.
QUOTA_USER_DAILY/WEEKLY Redis key templates are restored accordingly.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Auth users hitting /api/ are already governed by _evaluate's per-user
240 req/min rate limit (NAT-safe, scoped by user pk). The per-day/week
envelope (5 000/day · 35 000/week) fired too early for legitimate
integrations and added needless complexity.
Anonymous callers retain their quota (500/day · 3 500/week) since
they have no persistent per-user rate control.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Delete ApiEmergencySameSiteOnlyMiddleware (api_emergency_block.py);
replace with permanent _handle_api logic inside RateLimitMiddleware.
- _handle_api decision chain: OPTIONS pass → same-origin pass →
global IP block check → API-specific block check → quota → API
per-minute counter → anon pass / auth _evaluate.
- New Redis keys: rl:api:ip:<ip>:reqs, rl:api:ip:<ip>:blocked,
rl:index:api_blocked_ips. /api/ abuse never writes the global
rl:ip:<ip>:blocked key, which prevents NAT lockout.
- _is_same_origin: strips port, lowercases, checks Origin first and falls
back to Referer only when Origin is absent (not an OR: a mismatched Origin
fails the check even if Referer matches).
- Five new settings: API_RATE_LIMIT_{ENABLED,THRESHOLD,WINDOW_SECONDS,
BLOCK_SECONDS,SAME_ORIGIN_BYPASS} with safe defaults.
- 16 new tests; _make_middleware extended with explicit setting values.
- RATE-LIMITER-PLAN.md updated with new key schema rows and _handle_api section.
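
The sequential Origin-then-Referer check can be sketched with the standard library. This is an illustration under assumptions, not the middleware's code: the signature is hypothetical, and the simple host split does not handle IPv6 literals.

```python
from urllib.parse import urlsplit


def _host(value):
    # urlsplit(...).hostname drops the port and lowercases the host;
    # it is None for values without a network location (e.g. "null").
    return urlsplit(value).hostname


def is_same_origin(origin, referer, request_host):
    """Origin decides alone when present; Referer is consulted only otherwise."""
    expected = request_host.split(":")[0].lower()  # assumption: no IPv6 literal
    if origin:
        return _host(origin) == expected
    if referer:
        return _host(referer) == expected
    return False
```

Both headers are supplied by the client, so a match here is a convenience signal rather than proof of where the request came from.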
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Anonymous API requests now pass through after the quota check without
incrementing rl:ip:{ip}:reqs or writing a block key. A misbehaving
script or JS snippet behind a NAT IP can no longer lock out the org's
page requests by hammering /api/.
Enforcement for anonymous /api/:
- nginx sapl_api zone (60r/m, burst=120) — burst gate
- API quota (500/day, 3500/week) — daily cap
Authenticated /api/ still falls through to _evaluate_authenticated
(per-user counter keyed by uid, NAT-safe).
Interim measure until APP_ACCESS_KEYs per tenant org are introduced.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
SAPL pages fire 12-45 parallel requests; the old 30r/m nginx zone and
35/m Django threshold blocked normal navigation. Key changes:
nginx (nginx.conf / sapl.conf / start.sh):
- Split sapl_general (30r/m) into four dedicated zones:
    sapl_general   90r/m  burst=180  (HTML pages)
    sapl_media    180r/m  burst=180  (/media/: own bucket, no longer drains general)
    sapl_api       60r/m  burst=120  (/api/: quota layer is the real constraint)
    sapl_heavy     10r/m  burst=20   (/relatorios/: unchanged, nodelay kept)
- /media/ and /api/ location blocks now reference their own zones
Django (settings.py):
- RATE_LIMITER_RATE: 35/m → 120/m
- RATE_LIMITER_RATE_AUTHENTICATED: 120/m → 240/m
- RATE_LIMIT_404_THRESHOLD: 10 → 20
- API_QUOTA_ANON_DAILY: 50 → 500 / weekly 350 → 3500
- API_QUOTA_AUTH_DAILY: 1000 → 5000 / weekly 7000 → 35000
Middleware (ratelimit.py):
- Authenticated users no longer receive a persistent 300s block key on
rate breach — they get 429 for the over-limit request and the window
resets naturally after 60s. A 5-minute lockout is wrong for a logged-in
user who clicked too fast.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Multiple councilmembers behind a shared NAT IP were getting 429s during
live votes because nginx burst exhaustion fired before Django's per-user
rate counter could run. Exempt /voto-individual/ and /sessao/<pk>/ from
nginx limit_req; mirror in RATE_LIMIT_BYPASS_PATHS as defense-in-depth.
Add docs/rate-limiter-incidents.md with root-cause analysis, architecture
diagrams, tradeoff discussion, and pending investigations for the
PatoBranco-PR 2026-05-06 incident.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Files under sapl/private/documentoadministrativo/ are public when the
AppConfig.documentos_administrativos setting is DOC_ADM_OSTENSIVO. The
previous gate blocked all sapl/private/ paths unconditionally, forcing
anonymous users to log in even for ostensivo documents.
_is_public_docadm() checks the cached AppConfig setting to exempt
ostensivo documents while keeping proposicao and restritivo documents
behind the auth redirect. Also fixes wrong import (sapl.base.apps.AppConfig
is Django's app-config class; the SAPL model is in sapl.base.models).
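
The gate logic can be sketched as two pure functions. Everything here is a simplified assumption: the value of DOC_ADM_OSTENSIVO, the exact path form being compared, and the function signatures are hypothetical, and the real _is_public_docadm() reads the cached AppConfig row rather than taking the setting as an argument.

```python
DOC_ADM_OSTENSIVO = "O"  # hypothetical value of the setting constant
PRIVATE_PREFIX = "sapl/private/"
DOCADM_PREFIX = "sapl/private/documentoadministrativo/"


def is_public_docadm(path, documentos_administrativos):
    """Administrative documents are public only in ostensivo mode."""
    return (path.startswith(DOCADM_PREFIX)
            and documentos_administrativos == DOC_ADM_OSTENSIVO)


def requires_login(path, documentos_administrativos):
    """Private paths stay behind auth unless the docadm exemption applies."""
    if not path.startswith(PRIVATE_PREFIX):
        return False
    return not is_public_docadm(path, documentos_administrativos)
```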
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Each block-key write now also ZADDs the full key name into a permanent ZSET
(score = expiry unix timestamp) via a single Lua round-trip (_BLOCK_LUA).
Replaces four _rl_cache.set() calls with _set_block() which degrades to a
plain cache.set when Redis is unavailable. Indexes enable O(log N + M) enumeration
of active blocks (ZRANGEBYSCORE, M = members returned) without a SCAN; prunable
with ZREMRANGEBYSCORE.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- @never_cache on get_dados_painel: ConditionalGetMiddleware was computing
an ETag from the unchanged JsonResponse body and returning 304 with no
body; the display always needs a live response.
- Guard logo src update: jQuery .attr("src", ...) fired a browser HTTP GET
on every 500 ms poll even when the URL hadn't changed — 120 media
requests/min per user, hitting the auth_threshold and triggering
user_blocked for painel operators.
- Fix poll scheduling: setTimeout was evaluated at $.ajax options
construction time, scheduling the next poll 500 ms after the request
started rather than after it finished. Slow responses (>500 ms) stacked
concurrent in-flight requests, creating a self-amplifying load loop and
a DOM race condition where older responses could overwrite newer ones.
Moved to a proper complete callback so at most one request is in-flight
at any time.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- nginx: exempt /painel/<pk>/dados from rate limiting (polling endpoint,
will become WebSocket); dedicated location block with no limit_req
- ratelimit.py: bypass RATE_LIMIT_BYPASS_PATHS paths before _evaluate;
add layer=django to block log; increment daily Redis metrics counter
rl:metrics:{ns}:{date}:blocked:{reason} (TTL 8 days) on every block
- ratelimit.py: add quiltbot and AwarioBot to BOT_UA_FRAGMENTS
- ratelimit.py: fix _is_suspicious_headers to require missing UA before blocking
- settings: add RATE_LIMIT_BYPASS_PATHS with /painel/<pk>/dados pattern
- plan: extend UA blocklist SADD seed command with missing bot tokens
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- redis-configmap: move inline comment to its own line (Redis fatal parse error)
- settings: add CACHE_MIDDLEWARE_KEY_PREFIX='p' to remove double-dot in cache_page keys
- settings: monkey-patch _i18n_cache_key_suffix to strip pt-br/timezone suffix from keys
- ratelimit.py, settings: update example namespace from patobranco-pr to sapl31demo-df
- robots.txt: add AwarioSmartBot block
- plan: add rl:ip:*:blocked scan commands with TTL/value output
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Renames /private/media/ to /internal/media/ in nginx and serve_media().
Adds Content-Type and Content-Disposition to the X-Accel-Redirect response.
Replaces manual file reads in proposicao_texto and doc_texto_integral with
redirects to the media URL, removing the unused get_mime_type_from_file_extension helper.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>