Sistema de Apoio ao Processo Legislativo
SAPL — OOM Investigation & Remediation Plan (v2)

Scope: Django 2.2 / Gunicorn / nginx / Kubernetes fleet of 1,200+ pods.
Each pod has a dedicated PostgreSQL instance. A K8s Ingress sits in front of all tenants.
This document is canonical — all earlier session notes are consolidated here.


Table of Contents

  1. Architecture Overview
  2. Context & Problem Statement
  3. Decision Log
  4. Phase 0 — Immediate Hardening (No New Infra)
  5. Phase 1 — Shared Redis (Single Pod)
  6. Phase 2 — Rate Limiting & Bot Mitigation
  7. Phase 3 — File Serving Corrections
  8. Phase 4 — Dynamic Page Caching
  9. Open Questions

0. Architecture Overview

0.1 Component Diagram

graph TD
    Client([Bot / Human Client])
    nginx[nginx\nDebian pkg]
    gunicorn[Gunicorn\n2 workers / 4 threads]
    mw[Django Middleware\nRateLimitMiddleware]
    view[View Layer\nCBV + decorators]
    redis[(Redis\nDB0: cache\nDB1: rate limiter)]
    pg[(PostgreSQL\nper-pod)]
    fs[Filesystem\nPDFs / media]

    Client -->|HTTP| nginx
    nginx -->|proxy_pass| gunicorn
    gunicorn --> mw
    mw -->|pass| view
    mw -->|429| nginx
    view --> pg
    view --> fs
    view --> redis
    mw --> redis
    nginx -->|SISMEMBER / GET| redis

DB2 is reserved for Django Channels (WebSocket — future Phase 5).

0.2 Redis Memory Budget and Key Layout

| Key type | Key schema | TTL | DB | Est. size |
|---|---|---|---|---|
| Static cache (images/logos) | static:{ns}:{sha256} | 3–24 h | 0 | ~2.4 GB |
| PDF cache (≤ 360 KB) | file:{ns}:{sha256} | 1 h | 0 | ~0.9 GB |
| IP request counter | rl:ip:{ip}:reqs | 60 s | 1 | ~0.6 MB |
| IP blocked marker | rl:ip:{ip}:blocked | 300 s | 1 | ~0.06 MB |
| User request counter | rl:{ns}:user:{id}:reqs | 60 s | 1 | negligible |
| User blocked marker | rl:{ns}:user:{id}:blocked | 300 s | 1 | negligible |
| Path counter | rl:{ns}:path:{sha256}:reqs | 60 s | 1 | ~0.3 MB |
| UA deny list | rl:bot:ua:blocked | permanent SET | 1 | ~0.03 MB |
| NS/IP/window counter | rl:ns:{ns}:ip:{ip}:w:{bucket} | 60 s × 2 | 1 | ~0.6 MB |
| Redis overhead (× 1.5) | | | | ~1.6 GB |
| Total ceiling | | | | ~5 GB |
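The budget rows can be sanity-checked with quick arithmetic (all figures taken from the table; the overhead row is the extra 0.5× implied by the 1.5× factor — it comes out at ~1.65 GB, which the table rounds down):

```python
# Sanity-check the Redis memory budget table (values from the rows above).
data_gb = 2.4 + 0.9                                   # static + PDF caches dominate
data_gb += (0.6 + 0.06 + 0.3 + 0.03 + 0.6) / 1024    # MB-scale rate-limit keys
overhead_gb = data_gb * 0.5                           # the "x 1.5" factor adds 0.5x
total_gb = data_gb + overhead_gb
print(f"overhead ~{overhead_gb:.2f} GB, total ~{total_gb:.2f} GB")
```

The total lands just under the 5 GB ceiling, confirming the `maxmemory 5gb` setting in the Phase 1 ConfigMap.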

Key conventions:

  • {ns} = Kubernetes namespace (tenant identifier). All path and user keys include it.
  • {user} / {id} = normalized user PK: str(user.pk).lower().strip().
  • Django CACHES uses KEY_PREFIX: "sapl" to namespace DB0 cache keys.
    DB1 (rate limiter) uses raw keys — no prefix — for compatibility with the Lua / middleware INCR scripts.
  • DB2 is reserved for Django Channels; allocate separately when WebSocket work resumes.
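The conventions above can be sketched as key-builder helpers (names here are hypothetical, not taken from the SAPL codebase):

```python
import hashlib

# Illustrative helpers for the key schema above — hypothetical names.
def rl_user_key(ns: str, user_pk) -> str:
    uid = str(user_pk).lower().strip()        # normalization rule from above
    return f"rl:{ns}:user:{uid}:reqs"

def rl_path_key(ns: str, path: str) -> str:
    digest = hashlib.sha256(path.encode()).hexdigest()
    return f"rl:{ns}:path:{digest}:reqs"

print(rl_user_key("fortaleza-ce", 42))        # rl:fortaleza-ce:user:42:reqs
```

Hashing the path keeps key length bounded regardless of URL length and avoids `:` collisions inside the key schema.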

1. Context & Problem Statement

Fleet

| Item | Detail |
|---|---|
| System | SAPL — Django 2.2, legislative management for Brazilian municipal chambers |
| Fleet | ~1,200 Kubernetes pods, each with a dedicated PostgreSQL pod |
| Pod limits | 1 core CPU (limit) / 35m (request) · 1600Mi RAM (limit) / 800Mi (request) |
| Users | Legislative house staff, often behind NAT (many users, one public IP) |
| Workloads | PDF generation (synchronous, ReportLab), file uploads up to 150 MB, WebSocket voting panel |

OOM Kill Pattern

Workers grow from ~35 MB at birth to 800–900 MB within 2–3 minutes, then are killed and replaced in a continuous cycle.

Root causes:

  • Bot scraping triggers synchronous PDF generation — entire document built in RAM (ReportLab)
  • worker_max_memory_per_child only checks between requests; workers blocked on long requests are never recycled
  • TIMEOUT=300 lets bots hold threads for up to 5 minutes while memory accumulates
  • 3 workers × 300 MB each = ~900 MB — breaching the 800Mi request threshold

Bot Traffic Profile (Barueri pod, 16 days, 662k requests)

| Actor | Requests | % of total |
|---|---|---|
| Googlebot | ~154,000 | 23.2% |
| Chrome/98.0.4758 (spoofed scraper) | 90,774 | 13.7% |
| kube-probe (healthcheck) | 69,065 | 10.4% |
| meta-externalagent | 28,325 | 4.3% |
| GPTBot | 11,489 | 1.7% |
| bingbot | 7,639 | 1.1% |
| OAI-SearchBot + Applebot | 6,681 | 1.0% |
| Total identified bots | ~377,000 | ~56.9% |

Botnet fingerprint:

  • Rotates User-Agents (Chrome/121, Chrome/122, Firefox/123, Safari/17…) across requests
  • Crawls all sub-endpoints of the same matéria within 1 second from different IPs
  • Distributes crawling across tenants — each pod stays under the per-pod rate limit, never triggering it
  • Primary targets: /relatorios/{id}/etiqueta-materia-legislativa (~40 KB PDF) and all /materia/{id}/* sub-endpoints

Static File Traffic (from CSV analysis)

| Category | Requests | Transfers |
|---|---|---|
| Logos / images | 62,776 | ~24 GB |
| PDFs | 8,869 | 5.1 GB |
| Parliamentarian photos | 11,856 | ~0.5 GB |
| Total | 83,501 | ~30 GB |

Top offender: Brasão - Foz do Iguaçu.png — 14,512 requests, 5.6 GB from a single 392 KB file.

Confirmed Bugs

# nginx.conf — WRONG (disables zero-copy sendfile)
sendfile off;

# sapl.conf — missing on /media/ location
location /media/ {
    alias /var/interlegis/sapl/media/;
    # no ETag, no Cache-Control, no X-Robots-Tag
}

# settings.py — per-pod cache, not shared
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.filebased.FileBasedCache',
        'LOCATION': '/var/tmp/django_cache',
        'OPTIONS': {"MAX_ENTRIES": 10000},
    }
}

Django rate limiter (django-ratelimit at 35/m) uses FileBasedCache — counters are isolated per pod, making rate limiting completely ineffective at fleet scale.
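The scale of the failure is easy to quantify: with counters isolated per pod, a botnet that spreads requests across tenants faces an effective fleet-wide budget of the per-pod limit multiplied by the fleet size.

```python
# Effective fleet-wide request budget when each pod enforces 35/min alone.
per_pod_limit = 35            # requests/min (django-ratelimit setting)
pods = 1200
print(per_pod_limit * pods)   # 42000 requests/min before any single pod throttles
```

This is exactly the "distributes crawling across tenants" fingerprint observed in §1 — each pod stays under 35/min while the fleet absorbs tens of thousands of requests per minute.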

Hard Constraints

| Constraint | Impact |
|---|---|
| Per-pod PostgreSQL | Rate-limit counters not shared across pods |
| No Redis initially | No shared state for rate limiting or caching |
| NAT environments | IP-based rate limiting causes false positives |
| TIMEOUT=300 / uploads to 150 MB | Must not be broken — intentional for slow workflows |

2. Decision Log

| Decision | Chosen | Rationale | Session |
|---|---|---|---|
| Redis topology | Single pod (no Sentinel, no Cluster) | 65 MB of active data fits comfortably on one node; cluster complexity not justified at this data volume | v2 |
| PDF caching in Redis | No — ETags + sendfile are sufficient | Once rate limiting + ETags are active, repeat requests become 304s with zero bytes transferred | Session 4 |
| nginx rate-limit end state | Django middleware with shared Redis | No nginx image changes required; solves cross-pod consistency immediately | Session 5 |
| worker_max_memory_per_child | 400 MB | Pod limit 1600Mi, 2 workers × 400 MB = 800 MB — leaves 800 Mi headroom; previous 300 MB was OOMKilled before recycling could act | v2 |
| sendfile off | Bug — flip to on | No valid production reason found in the uploaded config; sendfile on removes the userspace copy and is strictly better for static files | Session 5 |
| nginx serves /media/ directly | Confirmed via alias in sapl.conf | X-Accel-Redirect only needed for LGPD-restricted documents | Session 5 |
| Cache backend switch timing | At pod startup via start.sh + waffle switch | Pod restart is acceptable; avoids per-request runtime overhead | Session 5 |
| Secret injection | Per-namespace Secret with optional: true | Enables gradual rollout; pod starts on file cache if Secret is absent | Session 5 |
| Redis k8s files location | $PROJECT_ROOT/docker/k8s/ | Consistent with existing Docker artifacts in the repo | v2 |

3. Phase 0 — Immediate Hardening (No New Infra)

Goal: Stop the OOM kill cycle and reduce bot load with zero infrastructure additions.
Risk: Low — all changes are config-only.

3.1 Gunicorn Tuning

The core tension: fewer workers protect memory but cost concurrency. The fix reduces the worker count from 3 to 2 and raises the per-worker memory ceiling from 300 MB to 400 MB, giving the recycling mechanism time to act before the kernel OOM killer does.
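The tradeoff in numbers (a quick check using the values adopted in the config below):

```python
# Concurrency and worst-case memory under the Phase 0 settings.
workers, threads = 2, 4
per_worker_ceiling_mb = 400
pod_limit_mb = 1600

concurrent_requests = workers * threads               # 8 requests in flight per pod
worst_case_rss_mb = workers * per_worker_ceiling_mb   # 800 MB at both ceilings
headroom_mb = pod_limit_mb - worst_case_rss_mb        # 800 MB left for nginx etc.
print(concurrent_requests, worst_case_rss_mb, headroom_mb)  # 8 800 800
```

The previous configuration (3 workers × 300 MB = 900 MB) already exceeded the 800Mi request; the new worst case sits exactly at it, with the pod limit providing a 2× margin.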

# docker/startup_scripts/gunicorn.conf.py
import os
import pathlib

NAME     = "SAPL"
DJANGODIR = "/var/interlegis/sapl"
SOCKFILE  = f"unix:{DJANGODIR}/run/gunicorn.sock"
USER  = "sapl"
GROUP = "nginx"

NUM_WORKERS = int(os.getenv("WEB_CONCURRENCY", "2"))      # was 3
THREADS     = int(os.getenv("GUNICORN_THREADS", "4"))      # was 8
TIMEOUT     = int(os.getenv("GUNICORN_TIMEOUT", "120"))    # was 300
WORKER_CLASS     = "gthread"
DJANGO_SETTINGS  = "sapl.settings"
WSGI_APP         = "sapl.wsgi:application"

proc_name = NAME
bind      = SOCKFILE
umask     = 0o007
user      = USER
group     = GROUP
chdir     = DJANGODIR
wsgi_app  = WSGI_APP

loglevel    = "info"          # was debug — reduces log I/O
accesslog   = "/var/log/sapl/access.log"
errorlog    = "/var/log/sapl/error.log"
capture_output = True

workers          = NUM_WORKERS
worker_class     = WORKER_CLASS
threads          = THREADS
timeout          = TIMEOUT
graceful_timeout = 30
keepalive        = 10
backlog          = 2048

max_requests        = 1000
max_requests_jitter = 200
worker_max_memory_per_child = 400 * 1024 * 1024  # 400 MB — was 300 MB

raw_env = [f"DJANGO_SETTINGS_MODULE={DJANGO_SETTINGS}"]
preload_app = False

def on_starting(server):
    # SOCKFILE carries the "unix:" scheme prefix for Gunicorn's bind;
    # strip it before treating the remainder as a filesystem path.
    sock_path = pathlib.Path(SOCKFILE.split("unix:", 1)[-1])
    sock_path.parent.mkdir(parents=True, exist_ok=True)

def post_fork(server, worker):
    try:
        from django import db
        db.connections.close_all()
    except Exception:
        pass

Per-location timeout strategy — replace the one-size-fits-all 300s:

| Operation | Previous | Recommended | Rationale |
|---|---|---|---|
| Normal page rendering | 300 s | 60 s | No legitimate page should take > 60 s |
| API endpoints | 300 s | 30 s | Stateless, fast by design |
| PDF download (cached / nginx) | 300 s | 30 s | nginx serves from disk, worker not involved |
| PDF generation (uncached) | 300 s | 180 s | Kept high — addressed in Phase 5 |
| Large file upload | 300 s | 180 s | nginx buffers upload, worker processes after |

3.2 nginx Fixes

Three confirmed bugs in the uploaded config — all fixed here.

# /etc/nginx/nginx.conf — http {} block

# FIX 1: zero-copy file serving via sendfile (was off — CRITICAL)
sendfile    on;
tcp_nopush  on;
tcp_nodelay on;

# FIX 2: reduced timeouts (was 300s everywhere)
keepalive_timeout     75;
proxy_read_timeout    120s;    # overridden per-location for slow ops
proxy_connect_timeout 10s;
proxy_send_timeout    120s;

# Real client IP from X-Forwarded-For set by K8s Ingress
real_ip_header     X-Forwarded-For;
real_ip_recursive  on;
set_real_ip_from   10.0.0.0/8;
set_real_ip_from   172.16.0.0/12;
set_real_ip_from   192.168.0.0/16;

# sapl.conf — FIX 3: add caching headers to /media/
location /media/ {
    alias  /var/interlegis/sapl/media/;
    sendfile on;
    etag on;
    add_header Cache-Control "public, max-age=86400, stale-while-revalidate=3600";
    add_header X-Robots-Tag  "noindex" always;
}
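What `etag on` buys can be modeled as a minimal conditional-GET exchange (a sketch of the protocol logic — nginx implements this natively):

```python
# Minimal model of the conditional-GET exchange enabled by "etag on".
def serve(etag_on_disk, if_none_match=None):
    if if_none_match == etag_on_disk:
        return 304, b""                    # empty body: the bandwidth win
    return 200, b"x" * 392 * 1024         # first fetch pays the full 392 KB

status, body = serve('"abc"')                          # cold client
assert (status, len(body)) == (200, 401408)
status, body = serve('"abc"', if_none_match='"abc"')   # revalidation
assert (status, len(body)) == (304, 0)
```

Every repeat request from a well-behaved client (or bot honoring `If-None-Match`) costs headers only; §3.7 quantifies the effect on the observed traffic.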

Upload endpoints — keep proxy_request_buffering on so nginx absorbs slow uploads before handing off to Gunicorn:

location ~* ^/(protocoloadm/criar-protocolo|materia/.*upload|norma/.*upload) {
    proxy_request_buffering on;
    proxy_read_timeout  180s;
    proxy_send_timeout  180s;
    proxy_set_header X-Forwarded-For   $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_set_header Host              $http_host;
    proxy_redirect off;
    proxy_pass http://sapl_server;
}

3.3 Bot UA Blocklist in nginx

Blocks known bots at nginx — before any Gunicorn worker is allocated.

# nginx.conf — http {} block
map $http_user_agent $bot_ua_blocked {
    default                    0;
    "~*GPTBot"                 1;
    "~*ClaudeBot"              1;
    "~*PerplexityBot"          1;
    "~*Bytespider"             1;
    "~*AhrefsBot"              1;
    "~*SemrushBot"             1;
    "~*DotBot"                 1;
    "~*meta-externalagent"     1;
    "~*OAI-SearchBot"          1;
    "~*Chrome/98\.0\.4758"     1;  # confirmed scraper — no real user runs a 2022 browser in 2026
}

# sapl.conf — server {} block (before any location)
if ($bot_ua_blocked = 1) {
    return 429 "Too Many Requests";
}

Limitation: Bots with rotating or spoofed UAs are not caught here. They are handled by Django middleware in Phase 2 (checks 3–5). This is intentional — nginx handles the cheap deterministic case; Django handles the expensive probabilistic case.


3.4 ASN-Based Blocking (Mandatory)

Blocks bot traffic by datacenter ASN — before UA parsing, before any Python process is touched.

Step 1 — install the GeoIP2 module and database:

# Debian / Ubuntu
apt install libnginx-mod-http-geoip2 libmaxminddb0 mmdb-bin

# Download GeoLite2-ASN (free MaxMind account required)
mkdir -p /etc/nginx/geoip
curl -sL "https://download.maxmind.com/app/geoip_download?edition_id=GeoLite2-ASN&license_key=YOUR_KEY&suffix=tar.gz" \
  | tar -xz --strip-components=1 --wildcards '*.mmdb' -C /etc/nginx/geoip/

Step 2 — configure nginx:

# nginx.conf — top-level (outside http {})
load_module modules/ngx_http_geoip2_module.so;

# nginx.conf — http {} block
geoip2 /etc/nginx/geoip/GeoLite2-ASN.mmdb {
    $geoip2_asn_number autonomous_system_number;
    $geoip2_asn_org    autonomous_system_organization;
}

map $geoip2_asn_number $bot_asn {
    default  0;
    16509    1;  # Amazon AWS
    14618    1;  # Amazon AWS us-east
    8075     1;  # Microsoft Azure
    396982   1;  # Google Cloud
    20473    1;  # Vultr
    24940    1;  # Hetzner
    16276    1;  # OVH
    36352    1;  # ColoCrossing
    63949    1;  # Linode / Akamai
}

# sapl.conf — server {} block (before bot_ua_blocked check)
if ($bot_asn = 1) {
    return 429 "Too Many Requests";
}

Step 3 — keep the database fresh (host cron — no k8s CronJob):

# /etc/cron.weekly/update-geoip
#!/bin/bash
curl -sL "https://download.maxmind.com/app/geoip_download?edition_id=GeoLite2-ASN&license_key=${MAXMIND_KEY}&suffix=tar.gz" \
  | tar -xz -C /tmp --wildcards '*.mmdb'
mv /tmp/GeoLite2-ASN_*/GeoLite2-ASN.mmdb /etc/nginx/geoip/GeoLite2-ASN.mmdb
nginx -s reload

Tradeoff: Blocks datacenter ASNs where bots originate. May over-block VPN users and developers on cloud instances — mitigate with a per-namespace IP whitelist once available (see Open Question 2).


3.5 robots.txt

Passive mitigation — effective over days/weeks for compliant bots. The spoofed Chrome/98 botnet ignores it; handled by nginx UA blocking above.

# Place at /var/interlegis/sapl/collected_static/robots.txt
User-agent: GPTBot
Disallow: /
Crawl-delay: 10

User-agent: ClaudeBot
Disallow: /
Crawl-delay: 10

User-agent: meta-externalagent
Disallow: /
Crawl-delay: 10

User-agent: OAI-SearchBot
Disallow: /
Crawl-delay: 10

User-agent: *
Disallow: /relatorios/
Crawl-delay: 10

Serve directly from nginx (no Django involvement):

# sapl.conf
location = /robots.txt {
    alias /var/interlegis/sapl/collected_static/robots.txt;
}

3.6 N+1 Fix in get_etiqueta_protocolos

Confirmed in sapl/protocoloadm/utils.py: MateriaLegislativa.objects.filter() is called inside a loop over protocols — one query per protocol. The rewrite below issues two queries total regardless of volume:

# BEFORE — one query per protocol (N+1)
def get_etiqueta_protocolos(prots):
    protocolos = []
    for p in prots:
        dic = {}
        for materia in MateriaLegislativa.objects.filter(
                numero_protocolo=p.numero, ano=p.ano):
            dic['num_materia'] = (
                materia.tipo.sigla + ' ' +
                str(materia.numero) + '/' + str(materia.ano)
            )
        protocolos.append(dic)
    return protocolos


# AFTER — two queries total regardless of volume
def get_etiqueta_protocolos(prots):
    from django.db.models import Q
    import functools, operator

    prot_list = list(prots)
    if not prot_list:
        return []

    query = functools.reduce(
        operator.or_,
        [Q(numero_protocolo=p.numero, ano=p.ano) for p in prot_list]
    )
    materias_map = {
        (m.numero_protocolo, m.ano): m
        for m in MateriaLegislativa.objects.filter(query).select_related('tipo')
    }

    protocolos = []
    for p in prot_list:
        dic = {}
        materia = materias_map.get((p.numero, p.ano))
        dic['num_materia'] = (
            f"{materia.tipo.sigla} {materia.numero}/{materia.ano}"
            if materia else ''
        )
        # ... rest of existing loop body unchanged
        protocolos.append(dic)
    return protocolos

3.7 ETags / 304 Responses

Adding etag on and Cache-Control to the /media/ location (§3.2) converts repeat bot requests from full downloads to 304 responses with empty bodies.

For Brasão - Foz do Iguaçu.png (392 KB × 14,512 requests = 5.6 GB), even a 50% conditional hit rate saves ~2.8 GB immediately — without any Redis.
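The arithmetic behind that claim, using the figures from the log analysis (decimal units; the raw product is ~5.7 GB, which the log summary rounds to 5.6):

```python
# Bandwidth saved on the single worst offender at a 50% 304 rate.
size_gb = 392_000 / 1e9                    # 392 KB file, decimal units
requests = 14_512
full_transfer_gb = size_gb * requests      # every request downloads the file
saved_gb = full_transfer_gb * 0.5          # half the requests become 304s
print(f"{full_transfer_gb:.1f} GB total, ~{saved_gb:.1f} GB saved")
```

At a realistic 80–90% conditional hit rate for a file this hot, savings approach the full transfer volume.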

Why this is sufficient for PDFs: See Phase 3 §6.2.


3.8 Django Upload Settings

# sapl/settings.py
# Files above 2 MB are streamed to a temp file on disk rather than
# held in worker RAM. Critical for 150 MB upload support.
FILE_UPLOAD_MAX_MEMORY_SIZE = 2 * 1024 * 1024       # 2 MB
DATA_UPLOAD_MAX_MEMORY_SIZE = 10 * 1024 * 1024      # 10 MB
MAX_DOC_UPLOAD_SIZE         = 150 * 1024 * 1024     # 150 MB
FILE_UPLOAD_TEMP_DIR        = '/var/interlegis/sapl/tmp'

4. Phase 1 — Shared Redis (Single Pod)

Goal: Deploy Redis so all subsequent phases have shared state.
Risk: Medium — new stateful infrastructure. Non-fatal fallback to file cache if Redis is unreachable.

4.1 Redis Kubernetes Manifests

Files live under $PROJECT_ROOT/docker/k8s/.

# docker/k8s/redis-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-config
  namespace: redis
data:
  redis.conf: |
    save ""
    appendonly no

    maxmemory 5gb
    maxmemory-policy allkeys-lru
    maxmemory-samples 10

    maxclients 20000
    tcp-backlog 511
    timeout 300
    tcp-keepalive 60

    hz 20
    lazyfree-lazy-eviction yes
    lazyfree-lazy-expire yes
    lazyfree-lazy-server-del yes

    slowlog-log-slower-than 10000
    slowlog-max-len 256
    latency-monitor-threshold 10

    bind 0.0.0.0
    protected-mode no
    databases 4     # DB0: cache, DB1: rate limiter, DB2: channels (future)

    activedefrag yes
    active-defrag-ignore-bytes 100mb
    active-defrag-threshold-lower 10

# docker/k8s/redis-pod.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sapl-redis
  namespace: redis
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sapl-redis
  template:
    metadata:
      labels:
        app: sapl-redis
    spec:
      containers:
      - name: redis
        image: redis:7-alpine
        command: ["redis-server", "/etc/redis/redis.conf"]
        resources:
          requests:
            memory: "1Gi"
            cpu: "250m"
          limits:
            memory: "6Gi"
            cpu: "1000m"
        ports:
        - containerPort: 6379
        volumeMounts:
        - name: redis-config
          mountPath: /etc/redis
      volumes:
      - name: redis-config
        configMap:
          name: redis-config

# docker/k8s/redis-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: sapl-redis
  namespace: redis
spec:
  selector:
    app: sapl-redis
  ports:
  - port: 6379
    targetPort: 6379

Pod budget rationale:

| Data type | Estimated memory |
|---|---|
| Rate limit counters (all pods, all IPs) | ~50–110 MB |
| View / template cache | ~300–600 MB |
| Small file cache (logos, etiquetas) | ~500 MB–1 GB |
| Redis overhead (× 1.5) | ~1.6 GB |
| Total ceiling | ~5 GB |

4.2 Use-Case / Key-Prefix Mapping

| Use case | Key prefix | DB | TTL | Notes |
|---|---|---|---|---|
| Page / view cache | sapl:cache:* | 0 | 60–3600 s | KEY_PREFIX=sapl in Django CACHES |
| Static file cache (logos) | static:{ns}:{sha256} | 0 | 3–24 h | ns = namespace/tenant |
| PDF cache (≤ 360 KB) | file:{ns}:{sha256} | 0 | 1 h | ns required |
| Rate limiter counters | rl:* | 1 | 60–300 s | Raw keys, no prefix |
| UA deny list | rl:bot:ua:blocked | 1 | permanent SET | Seed once after deploy |
| WebSocket / Channels | channels:* | 2 | session TTL | Future — Phase 5 |

4.3 Django Settings — Startup-Time Backend Selection

# sapl/settings.py
REDIS_URL     = config('REDIS_URL',     default='')
CACHE_BACKEND = config('CACHE_BACKEND', default='file')

_redis_ready = CACHE_BACKEND == 'redis' and bool(REDIS_URL)

CACHES = {
    'default': {
        'BACKEND': (
            'django_redis.cache.RedisCache' if _redis_ready
            else 'django.core.cache.backends.filebased.FileBasedCache'
        ),
        'LOCATION': REDIS_URL + '/0' if _redis_ready else '/var/tmp/django_cache',
        'KEY_PREFIX': 'sapl',
        **(
            {
                'OPTIONS': {
                    'CLIENT_CLASS': 'django_redis.client.DefaultClient',
                    'CONNECTION_POOL_KWARGS': {
                        # 1,200 pods × 2 workers × 6 = 14,400 peak connections
                        # maxclients=20,000 gives 40% headroom
                        'max_connections': 6,
                        'socket_timeout': 0.5,
                        'socket_connect_timeout': 0.5,
                    },
                    'IGNORE_EXCEPTIONS': True,  # cache miss on Redis failure — app degrades gracefully
                },
                'TIMEOUT': 300,
            } if _redis_ready else {
                'OPTIONS': {'MAX_ENTRIES': 10000},
            }
        ),
    },
    'ratelimit': {
        'BACKEND': 'django_redis.cache.RedisCache',
        'LOCATION': REDIS_URL + '/1' if _redis_ready else '',
        'OPTIONS': {
            'CLIENT_CLASS': 'django_redis.client.DefaultClient',
            'CONNECTION_POOL_KWARGS': {
                'max_connections': 6,
                'socket_timeout': 0.5,
                'socket_connect_timeout': 0.5,
            },
            'IGNORE_EXCEPTIONS': True,
        },
    } if _redis_ready else {
        'BACKEND': 'django.core.cache.backends.filebased.FileBasedCache',
        'LOCATION': '/var/tmp/django_ratelimit_cache',
        'OPTIONS': {'MAX_ENTRIES': 5000},
    },
}

RATELIMIT_USE_CACHE = 'ratelimit'

start.sh additions — resolve URL and read waffle switch before Gunicorn starts:

resolve_redis_url() {
    # 1. Already set by local Secret via envFrom — highest precedence
    [[ -n "${REDIS_URL:-}" ]] && { log "REDIS_URL from local secret."; return 0; }

    # 2. Try global cluster Secret via k8s API
    local api="https://kubernetes.default.svc"
    local token ca
    token="$(<'/var/run/secrets/kubernetes.io/serviceaccount/token')"
    ca="/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"

    local url
    url=$(curl -sf --cacert "$ca" \
        -H "Authorization: Bearer $token" \
        "${api}/api/v1/namespaces/interlegis-infra/secrets/sapl-global-redis" \
        | python3 -c "
import sys, json, base64
d = json.load(sys.stdin).get('data', {})
v = d.get('REDIS_URL', '')
print(base64.b64decode(v).decode() if v else '')
" 2>/dev/null || echo "")

    if [[ -n "$url" ]]; then
        export REDIS_URL="$url"
        log "REDIS_URL from global cluster secret."
        return 0
    fi
    log "No REDIS_URL found — file-based cache will be used."
}

resolve_cache_backend() {
    [[ -z "${REDIS_URL:-}" ]] && return 0
    log "REDIS_URL set — checking REDIS_CACHE waffle switch..."
    local active
    active=$(psql "$DATABASE_URL" -At -v ON_ERROR_STOP=0 -c \
        "SELECT active FROM waffle_switch WHERE name='REDIS_CACHE' LIMIT 1;" \
        2>/dev/null || echo "")
    if [[ "$active" == "t" ]]; then
        log "REDIS_CACHE switch ON — activating Redis cache backend."
        export CACHE_BACKEND="redis"
    else
        log "REDIS_CACHE switch OFF — using file-based cache."
        export CACHE_BACKEND="file"
    fi
}

wait_for_redis() {
    [[ -z "${REDIS_URL:-}" ]] && return 0
    log "Checking Redis connectivity..."
    local host port
    host=$(python3 -c "from urllib.parse import urlparse; u=urlparse('${REDIS_URL}'); print(u.hostname or 'localhost')")
    port=$(python3 -c "from urllib.parse import urlparse; u=urlparse('${REDIS_URL}'); print(u.port or 6379)")
    local retries=10
    until python3 -c "import socket; s=socket.create_connection(('${host}',${port}),2); s.close()" 2>/dev/null; do
        retries=$((retries-1))
        [[ $retries -eq 0 ]] && { log "WARNING: Redis unreachable — continuing on file cache."; return 0; }
        log "Waiting for Redis... ($retries retries left)"
        sleep 2
    done
    log "Redis reachable at ${host}:${port}."
}

configure_redis_cache() {
    [[ -z "${REDIS_URL:-}" ]] && return 0
    # Create the switch only if it is absent: running
    # "waffle_switch REDIS_CACHE off" unconditionally would reset an
    # operator-enabled switch back to off on every pod restart.
    local exists
    exists=$(psql "$DATABASE_URL" -At -c \
        "SELECT 1 FROM waffle_switch WHERE name='REDIS_CACHE' LIMIT 1;" \
        2>/dev/null || echo "")
    if [[ -z "$exists" ]]; then
        log "Creating REDIS_CACHE waffle switch (default: off)"
        python3 manage.py waffle_switch REDIS_CACHE off --create
    fi
}

4.4 Rollout Sequence

# Enable Redis for one namespace
kubectl create secret generic sapl-redis \
  --namespace=fortaleza-ce \
  --from-literal=REDIS_URL="redis://sapl-redis.redis.svc.cluster.local:6379" \
  --dry-run=client -o yaml | kubectl apply -f -

kubectl exec -n fortaleza-ce deploy/sapl -- \
  python manage.py waffle_switch REDIS_CACHE on --create

kubectl rollout restart deployment/sapl -n fortaleza-ce

# Disable without removing secret
kubectl exec -n fortaleza-ce deploy/sapl -- \
  python manage.py waffle_switch REDIS_CACHE off
kubectl rollout restart deployment/sapl -n fortaleza-ce

# Fleet-wide rollout (parallel)
kubectl get namespaces -l app=sapl -o name | sed 's|namespace/||' | \
  xargs -P 10 -I{} kubectl exec -n {} deploy/sapl -- \
    python manage.py waffle_switch REDIS_CACHE on --create

kubectl get namespaces -l app=sapl -o name | sed 's|namespace/||' | \
  xargs -P 5 -I{} kubectl rollout restart deployment/sapl -n {}

Seed the UA deny list once after Redis is deployed:

kubectl exec -n redis deploy/sapl-redis -- redis-cli -n 1 \
  SADD rl:bot:ua:blocked \
    "$(echo -n 'GPTBot'          | sha256sum | cut -d' ' -f1)" \
    "$(echo -n 'ClaudeBot'       | sha256sum | cut -d' ' -f1)" \
    "$(echo -n 'PerplexityBot'   | sha256sum | cut -d' ' -f1)" \
    "$(echo -n 'Bytespider'      | sha256sum | cut -d' ' -f1)" \
    "$(echo -n 'AhrefsBot'       | sha256sum | cut -d' ' -f1)" \
    "$(echo -n 'meta-externalagent' | sha256sum | cut -d' ' -f1)"

# Add new offenders at runtime without restart
kubectl exec -n redis deploy/sapl-redis -- redis-cli -n 1 \
  SADD rl:bot:ua:blocked "$(echo -n 'NewBot/1.0' | sha256sum | cut -d' ' -f1)"

Production monitoring commands:

# Memory usage
kubectl exec -n redis deploy/sapl-redis -- redis-cli info memory \
  | grep -E 'used_memory_human|maxmemory_human|mem_fragmentation_ratio'

# Connection pressure
kubectl exec -n redis deploy/sapl-redis -- redis-cli info stats \
  | grep -E 'rejected_connections|instantaneous_ops_per_sec'

# Key distribution per DB
kubectl exec -n redis deploy/sapl-redis -- redis-cli info keyspace

# Slow log
kubectl exec -n redis deploy/sapl-redis -- redis-cli slowlog get 25

5. Phase 2 — Rate Limiting & Bot Mitigation

Goal: Effective cross-pod throttling using shared Redis.
Prerequisite: Phase 1 (Redis deployed and CACHE_BACKEND=redis).

5.1 Middleware Architecture

flowchart TD
    A([Request arrives at nginx]) --> B{SISMEMBER\nrl:bot:ua:blocked}
    B -->|hit| Z1[429 — zero Django cost]
    B -->|miss| C{GET\nrl:ip:blocked}
    C -->|exists| Z2[429 — zero Django cost]
    C -->|nil| D[proxy_pass to Gunicorn]
    D --> E{authenticated?}
    E -->|yes| F{INCR\nrl:{ns}:user:{id}:reqs\n>= 120/min?}
    E -->|no| G{suspicious\nheaders?}
    F -->|yes| Z3[SET user:blocked\n429]
    F -->|no| H[call view]
    G -->|yes| Z4[429]
    G -->|no| I{INCR\nrl:ip:reqs\n>= 30/min?}
    I -->|yes| Z5[SET ip:blocked\n429]
    I -->|no| J{INCR\nrl:ns:ip:window\n>= 30/min?}
    J -->|yes| Z6[SET ip:blocked\n429]
    J -->|no| H
    H --> K[Filesystem / ORM / Response]

5.2 RateLimitMiddleware Implementation

# sapl/middleware/ratelimit.py
import hashlib
import logging
import time

from django.conf import settings
from django.core.cache import caches
from django.http import HttpResponse

logger = logging.getLogger('sapl.ratelimit')

BOT_UA_FRAGMENTS = [
    'GPTBot', 'ClaudeBot', 'PerplexityBot',
    'Bytespider', 'AhrefsBot', 'meta-externalagent',
    'Chrome/98.0.4758',
]


def _sha256(s: str) -> str:
    return hashlib.sha256(s.encode()).hexdigest()


def _is_suspicious_headers(request) -> bool:
    # Real browsers virtually always send Accept and Accept-Language;
    # Referer is often absent on direct navigation, so only flag when
    # at least two of the three are missing.
    missing = sum([
        not request.META.get('HTTP_ACCEPT_LANGUAGE'),
        not request.META.get('HTTP_ACCEPT'),
        not request.META.get('HTTP_REFERER'),
    ])
    return missing >= 2


def _get_ip(request) -> str:
    return (
        request.META.get('HTTP_X_FORWARDED_FOR', '').split(',')[0].strip()
        or request.META.get('REMOTE_ADDR', '')
    )


class RateLimitMiddleware:
    ANON_IP_THRESHOLD   = 30    # req/min — tune from dry-run data
    AUTH_USER_THRESHOLD = 120   # req/min
    BLOCK_TTL           = 300   # seconds

    def __init__(self, get_response):
        self.get_response = get_response
        self.dry_run = getattr(settings, 'RATELIMIT_DRY_RUN', True)
        self._rl_cache = caches['ratelimit']

    def __call__(self, request):
        decision = self._evaluate(request)
        if decision['action'] == 'block':
            logger.warning('ratelimit_block', extra={
                'ip':        decision['ip'],
                'reason':    decision['reason'],
                'ua':        request.META.get('HTTP_USER_AGENT', ''),
                'path':      request.path,
                'dry_run':   self.dry_run,
                'namespace': getattr(request, 'tenant', 'unknown'),
            })
            if not self.dry_run:
                return HttpResponse(status=429)
        return self.get_response(request)

    def _evaluate(self, request):
        ip = _get_ip(request)

        # Check 1: known UA (all requests)
        ua = request.META.get('HTTP_USER_AGENT', '')
        for fragment in BOT_UA_FRAGMENTS:
            if fragment.lower() in ua.lower():
                return {'action': 'block', 'reason': 'known_ua', 'ip': ip}

        # Check 2: IP blocked marker
        if self._rl_cache.get(f'rl:ip:{ip}:blocked'):
            if not getattr(request, 'user', None) or not request.user.is_authenticated:
                return {'action': 'block', 'reason': 'ip_blocked', 'ip': ip}

        if getattr(request, 'user', None) and request.user.is_authenticated:
            return self._evaluate_authenticated(request, ip)
        return self._evaluate_anonymous(request, ip)

    def _evaluate_authenticated(self, request, ip):
        user_id = str(request.user.pk).lower().strip()
        ns = getattr(request, 'tenant', 'global')

        if self._rl_cache.get(f'rl:{ns}:user:{user_id}:blocked'):
            return {'action': 'block', 'reason': 'user_blocked', 'ip': ip}

        if _is_suspicious_headers(request):
            return {'action': 'block', 'reason': 'suspicious_headers_auth', 'ip': ip}

        count = self._incr_with_ttl(f'rl:{ns}:user:{user_id}:reqs', ttl=60)
        if count >= self.AUTH_USER_THRESHOLD:
            self._rl_cache.set(f'rl:{ns}:user:{user_id}:blocked', 1,
                               timeout=self.BLOCK_TTL)
            return {'action': 'block', 'reason': 'auth_user_rate', 'ip': ip}

        return {'action': 'pass', 'ip': ip}

    def _evaluate_anonymous(self, request, ip):
        # Check 3: suspicious headers
        if _is_suspicious_headers(request):
            return {'action': 'block', 'reason': 'suspicious_headers', 'ip': ip}

        # Check 4: IP request rate
        count = self._incr_with_ttl(f'rl:ip:{ip}:reqs', ttl=60)
        if count >= self.ANON_IP_THRESHOLD:
            self._rl_cache.set(f'rl:ip:{ip}:blocked', 1, timeout=self.BLOCK_TTL)
            return {'action': 'block', 'reason': 'ip_rate', 'ip': ip}

        # Check 5: per-ns/ip/window (catches UA rotators)
        ns     = getattr(request, 'tenant', 'global')
        bucket = int(time.time() // 60)
        count  = self._incr_with_ttl(f'rl:ns:{ns}:ip:{ip}:w:{bucket}', ttl=120)
        if count >= self.ANON_IP_THRESHOLD:
            self._rl_cache.set(f'rl:ip:{ip}:blocked', 1, timeout=self.BLOCK_TTL)
            return {'action': 'block', 'reason': 'ua_rotation', 'ip': ip}

        return {'action': 'pass', 'ip': ip}

    def _incr_with_ttl(self, key: str, ttl: int) -> int:
        """Atomic INCR + EXPIRE — TTL only set on key creation."""
        lua = """
            local n = redis.call('INCR', KEYS[1])
            if n == 1 then redis.call('EXPIRE', KEYS[1], ARGV[1]) end
            return n
        """
        # django-redis exposes the raw client via get_redis_connection();
        # reaching into private attributes (cache._cache) is unsupported API.
        from django_redis import get_redis_connection
        client = get_redis_connection('ratelimit')
        return client.eval(lua, 1, key, ttl)
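The semantics of the Lua script — the TTL is set only when the key is created, so the window is fixed rather than sliding — can be modeled in a few lines of plain Python (an in-memory stand-in useful for unit tests, not a replacement for the atomic Redis version):

```python
class FixedWindow:
    """In-memory model of INCR + EXPIRE-on-create (fixed windows)."""
    def __init__(self, ttl):
        self.ttl = ttl
        self.store = {}                       # key -> (count, window_expiry)

    def incr(self, key, now):
        count, exp = self.store.get(key, (0, now + self.ttl))
        if now >= exp:                        # window elapsed: start fresh
            count, exp = 0, now + self.ttl
        count += 1
        self.store[key] = (count, exp)
        return count

w = FixedWindow(ttl=60)
assert w.incr("rl:ip:1.2.3.4:reqs", now=0) == 1
assert w.incr("rl:ip:1.2.3.4:reqs", now=30) == 2   # same window
assert w.incr("rl:ip:1.2.3.4:reqs", now=61) == 1   # window rolled over
```

Calling EXPIRE on every INCR instead would reset the window on each request, letting a steady attacker keep the counter alive forever — which is exactly what the `if n == 1` guard prevents.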

5.3 Settings Reference

# sapl/settings.py
MIDDLEWARE = [
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    # After auth middleware so request.user exists for the authenticated path
    'sapl.middleware.ratelimit.RateLimitMiddleware',
    # ... rest unchanged
]

# Start in dry-run — flip to False check-by-check after validation
RATELIMIT_DRY_RUN = config('RATELIMIT_DRY_RUN', default=True, cast=bool)

RATE_LIMITER_RATE               = config('RATE_LIMITER_RATE',               default='35/m')
RATE_LIMITER_RATE_AUTHENTICATED = config('RATE_LIMITER_RATE_AUTHENTICATED', default='120/m')
RATE_LIMITER_RATE_BOT           = config('RATE_LIMITER_RATE_BOT',           default='5/m')

# Optional / future — see Open Question 2
RATE_LIMIT_WHITELIST_IPS = config(
    'RATE_LIMIT_WHITELIST_IPS',
    default='',
    cast=lambda v: [x.strip() for x in v.split(',') if x.strip()]
)
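The rate values above use the `count/period` string convention familiar from django-ratelimit. A sketch of turning such a string into a numeric threshold plus window length in seconds (`parse_rate` and `_PERIODS` are our illustrative names, not library API):

```python
# Period suffixes for the count/period rate convention used above ('35/m').
_PERIODS = {'s': 1, 'm': 60, 'h': 3600, 'd': 86400}

def parse_rate(rate):
    """'35/m' -> (35, 60): allow 35 requests per 60-second window."""
    count, period = rate.split('/')
    return int(count), _PERIODS[period]

print(parse_rate('35/m'))    # (35, 60)
print(parse_rate('120/m'))   # (120, 60)
print(parse_rate('5/m'))     # (5, 60)
```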

### 5.4 Enforcement Graduation Order

Graduate checks out of dry-run (`RATELIMIT_DRY_RUN=False`) one at a time, in order of false-positive risk:

| Order | Check | Risk | Condition to enable |
|-------|-------|------|---------------------|
| 1st | `known_ua` | Zero | UA strings are deterministic |
| 2nd | `ip_blocked` | Zero | Key only set by prior proven-bad requests |
| 3rd | `ip_rate` | Low | Threshold calibrated from dry-run data |
| 4th | `suspicious_headers` | Medium | Confirmed no legitimate clients omit all 3 headers |
| 5th | `ua_rotation` (ns/window) | Medium | NAT IP whitelist in place (see Open Question 2) |
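One way to implement this graduation is to keep a set of promoted check names and treat every other block verdict as log-only, whether or not the global dry-run flag is set. A minimal sketch (names such as `ENFORCED_CHECKS` and `apply_verdict` are ours, not existing SAPL code):

```python
import logging

log = logging.getLogger('sapl.ratelimit')

# Checks promoted out of dry-run, following the graduation order above.
ENFORCED_CHECKS = {'known_ua', 'ip_blocked'}  # steps 1-2 enabled so far

def apply_verdict(verdict, dry_run):
    """Return True only if the request should actually be rejected (HTTP 429)."""
    if verdict['action'] != 'block':
        return False
    if dry_run or verdict['reason'] not in ENFORCED_CHECKS:
        # Observe-only: record what WOULD have been blocked.
        log.info('dry-run block: reason=%s ip=%s', verdict['reason'], verdict['ip'])
        return False
    return True

print(apply_verdict({'action': 'block', 'reason': 'known_ua', 'ip': '198.51.100.7'},
                    dry_run=False))   # True
print(apply_verdict({'action': 'block', 'reason': 'ip_rate', 'ip': '198.51.100.7'},
                    dry_run=False))   # False (check not promoted yet)
```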

### 5.5 Decorator Migration

For views where `django-ratelimit` decorators already exist:

| Endpoint type | Action | Reason |
|---------------|--------|--------|
| List views (GET) | Remove after Phase 2 stable | Middleware covers equivalent threshold |
| Detail views (GET) | Remove after Phase 2 stable | Middleware covers equivalent threshold |
| Search / filter views | Remove last | Expensive queries — keep stricter per-view limit |
| PDF / file generation | Keep permanently | Most expensive; per-view limit tighter than global |
| Write endpoints (POST/PUT/DELETE) | Keep permanently | Different abuse surface |
| Auth endpoints (login, reset) | Keep permanently | Credential stuffing; must be independent |

## 6. Phase 3 — File Serving Corrections

**Goal:** Ensure nginx serves files correctly with kernel bypass and caching headers.  
**Risk:** Low — config changes only.

### 6.1 Confirmed Architecture

nginx already serves `/media/` directly via `alias`; Django is not involved in file serving for public media. `X-Accel-Redirect` is only needed for LGPD-restricted documents that must pass through Django for access control.

The corrected `nginx.conf` and `sapl.conf` are shown in Phase 0 §3.2. No additional changes are needed here.

### 6.2 Why Redis is NOT Needed for PDFs

With the full mitigation stack active:

- ASN blocking (Phase 0) drops datacenter bot traffic at nginx
- UA blocking (Phase 0) drops known-UA bots at nginx
- Shared Redis rate counters (Phase 2) enforce limits across all pods
- ETags (Phase 0 §3.2) convert repeat requests to 304 responses with zero body bytes transferred
- `sendfile on` (Phase 0 §3.2) means disk reads bypass userspace entirely

Redis PDF caching would solve "high request volume reaching the file layer" — but that problem no longer exists once the above stack is active. Redis memory is better reserved for rate counters, the page cache, and sessions.
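To make the ETag point concrete: a conditional request whose `If-None-Match` matches revalidates with an empty body. The sketch below uses a content hash for illustration; nginx's default ETag is actually derived from the file's mtime and size, which is cheaper:

```python
import hashlib

def make_etag(body):
    # Illustrative content-hash ETag; nginx computes its default ETag
    # from file metadata (mtime + size), not a hash of the bytes.
    return '"%s"' % hashlib.sha256(body).hexdigest()[:16]

def respond(body, if_none_match=None):
    """Return (status, payload) for a GET with optional If-None-Match."""
    etag = make_etag(body)
    if if_none_match == etag:
        return 304, b''   # revalidated: zero body bytes on the wire
    return 200, body

pdf = b'%PDF-1.7 fake document bytes'
print(respond(pdf, None)[0])          # 200
print(respond(pdf, make_etag(pdf)))   # (304, b'')
```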

### 6.3 File Serving Decision Matrix

| File type | Size | Strategy |
|-----------|------|----------|
| Logos / images | Any | nginx `alias` + `sendfile` + ETag + `Cache-Control` |
| Small PDFs | ≤ 360 KB | nginx direct + ETag |
| Medium PDFs | 360 KB – 2 MB | nginx direct + ETag + rate limit |
| Large PDFs | > 2 MB | nginx + strict rate limit; never Redis |
| LGPD-restricted | Any | Django view → `X-Accel-Redirect` → nginx (access control enforced) |
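For the LGPD-restricted row, the Django view performs the permission check and then hands the actual file transfer back to nginx via `X-Accel-Redirect`. A framework-free sketch of the headers involved (the `/protected/` internal location name is an assumption, not the deployed config):

```python
def lgpd_response_headers(user_allowed, internal_path):
    """Headers a Django view would set after its access-control check."""
    if not user_allowed:
        return {'Status': '403 Forbidden'}
    return {
        'Status': '200 OK',
        # nginx intercepts this header and serves the file from the matching
        # `internal` location using sendfile; Django itself sends no body.
        'X-Accel-Redirect': '/protected/' + internal_path.lstrip('/'),
        'Content-Type': 'application/pdf',
    }

print(lgpd_response_headers(True, 'media/sapl/documento/123.pdf'))
print(lgpd_response_headers(False, 'media/sapl/documento/123.pdf'))
```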

## 7. Phase 4 — Dynamic Page Caching

**Goal:** Eliminate ORM queries for anonymous bot requests on list views.  
**Prerequisite:** Phase 1 (shared Redis, `CACHE_BACKEND=redis`).

### 7.1 The Key Insight

Many SAPL list views (`pesquisar-materia`, `norma`, etc.) are not truly dynamic for anonymous users between edits. A bot hammering `?page=1` through `?page=100` triggers 100 ORM queries per pod. With the Redis page cache, each unique URL is queried once per TTL across the entire fleet.

```python
# views.py — apply to anonymous list views only
from django.views.decorators.cache import cache_page
from django.utils.decorators import method_decorator

@method_decorator(cache_page(60 * 5), name='dispatch')  # 5-minute TTL
class PesquisarMateriaView(FilterView):
    ...
```

Critical safety check: `cache_page` does not mark responses private by itself. The protection comes from `SessionMiddleware` adding `Vary: Cookie` whenever the session is touched, which makes the cache key out entries per cookie. Verify that `Vary: Cookie` actually appears on these responses before deploying; serving a session-aware response from a shared cache entry is a data leak.
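The mechanism can be illustrated without Django: when a response varies on `Cookie`, the cache key must mix in the request's Cookie header, so a session-bearing request can never hit an anonymous user's entry. A simplified sketch (Django's real key also includes the method, a URL hash, and a key prefix):

```python
import hashlib

def cache_key(url, vary_headers, request_headers):
    """Simplified vary-aware cache key: one entry per (url, varied header values)."""
    parts = [url] + [request_headers.get(h, '') for h in vary_headers]
    return hashlib.md5('|'.join(parts).encode()).hexdigest()

anon  = cache_key('/materia/?page=1', ['Cookie'], {})
authn = cache_key('/materia/?page=1', ['Cookie'], {'Cookie': 'sessionid=abc123'})
print(anon != authn)   # True: the session never shares an entry with anonymous traffic
```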

### 7.2 Cache TTL Guidelines

| View type | TTL | Reasoning |
|-----------|-----|-----------|
| Matéria list (anonymous) | 300 s | Changes infrequently between edits |
| Norma list (anonymous) | 300 s | Same |
| Parlamentar list | 3600 s | Changes rarely |
| Search results | 60 s | Query-dependent; shorter TTL is safer |
| Authenticated views | Never | Session-varied responses must not share cache entries |
| PDF generation | Never | Too large — serve from disk via nginx |

## 8. Open Questions

| # | Question | Status | Blocks |
|---|----------|--------|--------|
| 1 | Does the `Chrome/98.0.4758` impersonator appear consistently in nginx access logs? | Needs investigation | Phase 0 UA block safety |
| 2 | Which legislative house IPs can be pre-whitelisted in `RATE_LIMIT_WHITELIST_IPS`? | We don't have this list yet; plan to obtain it. The setting is optional / future. | Phase 2 enforcement safety |
| 3 | Dockerfile scope | Single image for all tenants (confirmed). All path-based Redis keys include `{ns}`. | (resolved) |
| 4 | WebSocket voting panel priority | Separate project; resumes after Redis is on k8s, the bot siege is addressed, and OOM pressure is reduced. | Phase 5 sequencing |
| 5 | `CONN_MAX_AGE` tuning | Currently 300 s (`sapl/settings.py:272`). Evaluate whether to reduce it given worker recycling at 400 MB. | Phase 0 tuning |
| 6 | k8s Redis manifests | Development artifacts go under `$PROJECT_ROOT/docker/k8s/` (`redis-pod.yaml`, `redis-service.yaml`, `redis-configmap.yaml`). | Phase 1 delivery |

Document consolidated from multi-session architecture review — Edward / Interlegis SAPL infrastructure.