mirror of https://github.com/interlegis/sapl.git
docker/Dockerfile:
  - GeoIP offline build with MaxMind secret; optional build args for
    graphviz, poppler, psql client; envsubst for nginx burst vars

docker/docker-compose.yaml:
  - saplredis service (redis:7-alpine, allkeys-lru, 512 MB)
  - REDIS_URL + CACHE_BACKEND wired into sapl service

docker/startup_scripts/start.sh:
  - configure_redis_cache(): builds CACHES dict, sets REDIS_CACHE waffle
    switch, falls back to file cache gracefully
  - POD_NAMESPACE resolution (k8s Downward API → hostname fallback)
  - DATABASE_URL exported before migrate

docker/k8s/redis/ (moved from docker/k8s/):
  - redis-configmap.yaml, redis-deployment.yaml, redis-service.yaml
  - ClusterIP service on port 6379, sapl-redis namespace

docker/k8s/sapl-k8s.yaml:
  - REDIS_URL env var injected; app.kubernetes.io/name=sapl label for
    fleet-wide discovery

sapl/middleware/test_ratelimiter.py:
  - Unit tests for RateLimitMiddleware with mocked Redis

scripts/test_ratelimiter.py:
  - CLI smoke-test: fires N requests and reports first 429

Removed: rate-limiter-v2.md (content migrated to plan/RATE_LIMITER_PLAN.md),
  scripts/test_ratelimiter.sh (replaced by .py),
  docker/k8s/README.md (merged into plan/RATE_LIMITER_PLAN.md),
  docker/scripts/redis_populate_test_data.py (renamed to redis_inject_test_data.py)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
rate-limiter-2026
14 changed files with 514 additions and 1732 deletions
@ -1,228 +0,0 @@ |
|||
# SAPL — Kubernetes Redis |
|||
|
|||
Manifests for the shared Redis instance used by all SAPL pods for |
|||
cross-pod rate limiting (DB 1) and view/static-file caching (DB 0). |
|||
|
|||
--- |
|||
|
|||
## Directory layout |
|||
|
|||
``` |
|||
docker/k8s/ |
|||
├── redis-configmap.yaml # redis.conf — no persistence, allkeys-lru, 5 GB ceiling |
|||
├── redis-deployment.yaml # Deployment (1 replica, redis:7-alpine) |
|||
├── redis-service.yaml # ClusterIP service on port 6379 |
|||
└── README.md # this file |
|||
``` |
|||
|
|||
--- |
|||
|
|||
## Prerequisites |
|||
|
|||
- `kubectl` configured to talk to the target cluster. |
|||
- A `redis` namespace (created below if it doesn't exist). |
|||
|
|||
--- |
|||
|
|||
## Deploy |
|||
|
|||
```bash |
|||
# 1. Create the namespace (idempotent) |
|||
kubectl create namespace redis --dry-run=client -o yaml | kubectl apply -f - |
|||
|
|||
# 2. Apply all three manifests |
|||
kubectl apply -f docker/k8s/redis-configmap.yaml |
|||
kubectl apply -f docker/k8s/redis-deployment.yaml |
|||
kubectl apply -f docker/k8s/redis-service.yaml |
|||
|
|||
# 3. Verify the pod is Running |
|||
kubectl -n redis get pods -l app=sapl-redis |
|||
``` |
|||
|
|||
Expected output: |
|||
``` |
|||
NAME READY STATUS RESTARTS AGE |
|||
sapl-redis-6d9f8b7c4d-xk2lm 1/1 Running 0 30s |
|||
``` |
|||
|
|||
--- |
|||
|
|||
## Wire a SAPL namespace to Redis

```bash
# Create the per-namespace Secret (one-off per tenant)
kubectl create secret generic sapl-redis \
  --namespace=<NAMESPACE> \
  --from-literal=REDIS_URL="redis://sapl-redis.redis.svc.cluster.local:6379" \
  --dry-run=client -o yaml | kubectl apply -f -

# Ensure the waffle switch row exists (starts OFF)
kubectl exec -n <NAMESPACE> deploy/sapl -- \
  python manage.py waffle_switch REDIS_CACHE off --create

# Enable Redis for this namespace
kubectl exec -n <NAMESPACE> deploy/sapl -- \
  python manage.py waffle_switch REDIS_CACHE on

# Rolling restart so start.sh picks up the new switch value
kubectl rollout restart deployment/sapl -n <NAMESPACE>
kubectl rollout status deployment/sapl -n <NAMESPACE>
```

### Fleet-wide rollout

```bash
kubectl get namespaces -l app=sapl -o name | sed 's|namespace/||' | \
  xargs -P 10 -I{} kubectl exec -n {} deploy/sapl -- \
    python manage.py waffle_switch REDIS_CACHE on --create

kubectl get namespaces -l app=sapl -o name | sed 's|namespace/||' | \
  xargs -P 5 -I{} kubectl rollout restart deployment/sapl -n {}
```

### Roll back (without removing the Secret)

```bash
kubectl exec -n <NAMESPACE> deploy/sapl -- \
  python manage.py waffle_switch REDIS_CACHE off
kubectl rollout restart deployment/sapl -n <NAMESPACE>
```

---

## Monitor

### Pod and events

```bash
# Pod status
kubectl -n redis get pods -l app=sapl-redis -o wide

# Deployment events (useful right after apply)
kubectl -n redis describe deployment sapl-redis

# Pod events (OOMKill, restarts, etc.)
kubectl -n redis describe pod -l app=sapl-redis
```

### Logs

```bash
# Tail live logs
kubectl -n redis logs -f deploy/sapl-redis

# Last 100 lines
kubectl -n redis logs deploy/sapl-redis --tail=100
```

### Redis INFO

```bash
# Memory usage
kubectl exec -n redis deploy/sapl-redis -- \
  redis-cli info memory \
  | grep -E 'used_memory_human|maxmemory_human|mem_fragmentation_ratio'

# Connection pressure
kubectl exec -n redis deploy/sapl-redis -- \
  redis-cli info stats \
  | grep -E 'rejected_connections|instantaneous_ops_per_sec'

# Key distribution per DB
kubectl exec -n redis deploy/sapl-redis -- redis-cli info keyspace

# Recent slow queries
kubectl exec -n redis deploy/sapl-redis -- redis-cli slowlog get 10

# Continuous latency sampling in 1-second windows (runs until interrupted)
kubectl exec -n redis deploy/sapl-redis -- redis-cli --latency-history -i 1
```

### Rate-limiter keys (DB 1)

```bash
kubectl exec -n redis deploy/sapl-redis -- \
  redis-cli -n 1 dbsize

kubectl exec -n redis deploy/sapl-redis -- \
  redis-cli -n 1 --scan --pattern 'rl:ip:*' | head -20
```

---

## Seed the UA deny list (once after first deploy)

```bash
kubectl exec -n redis deploy/sapl-redis -- redis-cli -n 1 \
  SADD rl:bot:ua:blocked \
  "$(echo -n 'GPTBot' | sha256sum | cut -d' ' -f1)" \
  "$(echo -n 'ClaudeBot' | sha256sum | cut -d' ' -f1)" \
  "$(echo -n 'PerplexityBot' | sha256sum | cut -d' ' -f1)" \
  "$(echo -n 'Bytespider' | sha256sum | cut -d' ' -f1)" \
  "$(echo -n 'AhrefsBot' | sha256sum | cut -d' ' -f1)" \
  "$(echo -n 'meta-externalagent' | sha256sum | cut -d' ' -f1)"

# Add a new offender at runtime (no restart required)
kubectl exec -n redis deploy/sapl-redis -- redis-cli -n 1 \
  SADD rl:bot:ua:blocked "$(echo -n 'NewBot/1.0' | sha256sum | cut -d' ' -f1)"
```
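
The shell pipeline above stores the SHA-256 hex digest of each User-Agent token rather than the token itself. A minimal Python sketch of the same digest computation may help when preparing entries outside a shell. The helper name `ua_digest` is hypothetical, and whether the middleware hashes a bare token or the full header value is not stated here, so the input is an assumption. The digest is case-sensitive, so `GPTBot` and `gptbot` yield different set members.

```python
import hashlib


def ua_digest(token: str) -> str:
    """Return the SHA-256 hex digest of a UA token (hypothetical helper).

    Mirrors the shell pipeline: echo -n '<token>' | sha256sum | cut -d' ' -f1
    """
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


# Set name taken from the redis-cli commands above.
BLOCKED_KEY = "rl:bot:ua:blocked"

if __name__ == "__main__":
    for token in ("GPTBot", "ClaudeBot", "PerplexityBot"):
        # Print the SADD command one could run with `redis-cli -n 1`.
        print(f"SADD {BLOCKED_KEY} {ua_digest(token)}")
```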

---

## Local standalone Redis (development / testing)

No Kubernetes? Run Redis directly with Docker:

```bash
sudo docker run --rm -p 6379:6379 redis:7-alpine \
  redis-server --save "" --appendonly no
```

Then point Django at it by exporting the env var before starting the dev server:

```bash
export REDIS_URL="redis://localhost:6379"
export CACHE_BACKEND="redis"
python manage.py runserver
```

Or add them to your local `.env` file:

```
REDIS_URL=redis://localhost:6379
CACHE_BACKEND=redis
```

> **Note**: the waffle switch `REDIS_CACHE` must also be `on` in your local
> database for `start.sh` to activate the Redis backend. Run:
> ```bash
> python manage.py waffle_switch REDIS_CACHE on --create
> ```
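
How `REDIS_URL` and `CACHE_BACKEND` could translate into a Django `CACHES` setting is sketched below. This is an illustration under stated assumptions, not the logic in `start.sh`: the function name `build_caches`, the file-cache path, and the use of Django's built-in Redis backend (available from Django 4.0) are all assumptions, and the real project may use a different backend package. The `ratelimit` alias and the DB 0 / DB 1 split follow this document; the waffle switch check is left out, and the fallback covers only the `default` alias.

```python
import os


def build_caches(env=None):
    """Sketch: pick a Django CACHES dict from REDIS_URL / CACHE_BACKEND."""
    env = os.environ if env is None else env
    redis_url = env.get("REDIS_URL", "").rstrip("/")
    use_redis = env.get("CACHE_BACKEND", "").lower() == "redis" and bool(redis_url)

    if not use_redis:
        # Fallback: file-based cache. The location is an assumed path.
        return {
            "default": {
                "BACKEND": "django.core.cache.backends.filebased.FileBasedCache",
                "LOCATION": "/var/tmp/django_cache",
            }
        }

    # Assumption: Django's built-in Redis backend (Django >= 4.0).
    redis_backend = "django.core.cache.backends.redis.RedisCache"
    return {
        # DB 0: view / static-file cache
        "default": {"BACKEND": redis_backend, "LOCATION": f"{redis_url}/0"},
        # DB 1: rate-limiter counters and block markers
        "ratelimit": {"BACKEND": redis_backend, "LOCATION": f"{redis_url}/1"},
    }


if __name__ == "__main__":
    print(build_caches({"REDIS_URL": "redis://localhost:6379", "CACHE_BACKEND": "redis"}))
```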

---

## Update `redis.conf` without redeploying

```bash
# Edit the ConfigMap
kubectl -n redis edit configmap redis-config

# Restart the pod to pick up the new config
kubectl -n redis rollout restart deployment/sapl-redis
```

---

## Key schema reference

| DB | Use case | Key pattern | TTL |
|----|----------|-------------|-----|
| 0 | Page / view cache | `sapl:cache:*` | 60 – 3 600 s |
| 0 | Static file cache (logos) | `static:{ns}:{sha256}` | 3 – 24 h |
| 0 | PDF cache (≤ 360 KB) | `file:{ns}:{sha256}` | 1 h |
| 1 | IP rate-limit counter | `rl:ip:{ip}:reqs` | 60 s |
| 1 | IP blocked marker | `rl:ip:{ip}:blocked` | 300 s |
| 1 | User rate-limit counter | `rl:{ns}:user:{id}:reqs` | 60 s |
| 1 | Path counter | `rl:{ns}:path:{sha256}:reqs` | 60 s |
| 1 | UA deny list | `rl:bot:ua:blocked` | permanent SET |
| 2 | Django Channels (future) | `channels:*` | session TTL |
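
The counter rows above pair a key with a TTL, which suggests a fixed-window scheme: increment the key, let it expire after the window, and block once the count reaches the limit. The sketch below illustrates that idea with an in-memory stand-in for Redis `INCR` plus `EXPIRE`. The names `parse_rate`, `FixedWindowCounter`, and `is_over_limit` are hypothetical, and the "block at exactly the limit" comparison is an assumption taken from the unit tests in this commit, not from the middleware source.

```python
import time


def parse_rate(rate):
    """Parse a string such as '35/m' into (limit, period_seconds)."""
    count, period = rate.split("/")
    return int(count), {"s": 1, "m": 60, "h": 3600}[period.lower()]


class FixedWindowCounter:
    """In-memory stand-in for Redis INCR + EXPIRE (illustration only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._store = {}  # key -> (count, expires_at)

    def incr_with_ttl(self, key, ttl):
        now = self._clock()
        count, expires_at = self._store.get(key, (0, now + ttl))
        if now >= expires_at:
            # The window has elapsed: start a fresh one.
            count, expires_at = 0, now + ttl
        count += 1
        self._store[key] = (count, expires_at)
        return count


def is_over_limit(counter, ip, rate):
    """Count one request for `ip` and report whether it should be blocked."""
    limit, period = parse_rate(rate)
    key = f"rl:ip:{ip}:reqs"  # key pattern from the table above
    return counter.incr_with_ttl(key, period) >= limit


if __name__ == "__main__":
    counter = FixedWindowCounter()
    for n in range(1, 6):
        print(n, is_over_limit(counter, "203.0.113.1", "3/m"))
```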
@@ -1,8 +1,8 @@
 apiVersion: v1
 kind: Service
 metadata:
-  name: sapl-redis
-  namespace: redis
+  name: redis
+  namespace: sapl-redis
   labels:
     app: sapl-redis
 spec:
@ -1,176 +0,0 @@ |
|||
#!/usr/bin/env python3 |
|||
""" |
|||
redis_populate_test_data.py — inject synthetic rate-limiter entries into Redis. |
|||
|
|||
Purpose: validate that RateLimitMiddleware reads the expected key schema, |
|||
that Redis CLI / RedisInsight shows the right structure, and that blocking |
|||
logic fires correctly without waiting for real traffic. |
|||
|
|||
Usage: |
|||
# Against docker-compose Redis (default) |
|||
python3 docker/scripts/redis_populate_test_data.py |
|||
|
|||
# Against a different host/port |
|||
REDIS_URL=redis://localhost:6379 python3 docker/scripts/redis_populate_test_data.py |
|||
|
|||
# Show what would be written without actually writing |
|||
DRY_RUN=1 python3 docker/scripts/redis_populate_test_data.py |
|||
|
|||
# Clear all synthetic keys written by a previous run |
|||
CLEAR=1 python3 docker/scripts/redis_populate_test_data.py |
|||
|
|||
Key schema (DB 1 — rate limiter): |
|||
rl:ip:{ip}:reqs INCR counter — anonymous request count (TTL 60s) |
|||
rl:ip:{ip}:blocked string "1" — IP hard-blocked (TTL 300s) |
|||
rl:{ns}:user:{uid}:reqs INCR counter — auth user request count (TTL 60s) |
|||
rl:{ns}:user:{uid}:blocked string "1" — user hard-blocked (TTL 300s) |
|||
rl:{ns}:ip:{ip}:w:{bucket} INCR — namespace/IP sliding window (TTL 120s) |
|||
""" |
|||
|
|||
import os |
|||
import sys |
|||
import time |
|||
|
|||
# ── dependency check ────────────────────────────────────────────────────── |
|||
try: |
|||
import redis |
|||
except ImportError: |
|||
print("ERROR: redis-py not installed. Run: pip install redis", file=sys.stderr) |
|||
sys.exit(1) |
|||
|
|||
# ── config ──────────────────────────────────────────────────────────────── |
|||
REDIS_URL = os.environ.get("REDIS_URL", "redis://localhost:6379") |
|||
RATELIMIT_DB = 1 # DB1 is the rate-limiter database |
|||
DRY_RUN = os.environ.get("DRY_RUN", "0").lower() in ("1", "true", "yes") |
|||
CLEAR = os.environ.get("CLEAR", "0").lower() in ("1", "true", "yes") |
|||
|
|||
# Synthetic values — tweak to exercise different code paths |
|||
NAMESPACE = "sapl" # POD_NAMESPACE value (hostname or k8s namespace) |
|||
ANON_WINDOW = 60 # seconds — must match settings.RATE_LIMITER_RATE period |
|||
AUTH_WINDOW = 60 |
|||
BLOCK_TTL = 300 |
|||
|
|||
TEST_IPS = [ |
|||
"203.0.113.1", # below threshold (20 reqs) |
|||
"203.0.113.2", # AT threshold (35 reqs — should trigger block) |
|||
"203.0.113.3", # already blocked |
|||
"203.0.113.4", # namespace/window counter near threshold |
|||
] |
|||
|
|||
TEST_USERS = [ |
|||
{"uid": "42", "reqs": 50, "blocked": False}, # normal auth user |
|||
{"uid": "99", "reqs": 120, "blocked": False}, # AT auth threshold |
|||
{"uid": "7", "reqs": 10, "blocked": True}, # pre-blocked user |
|||
] |
|||
|
|||
# ── helpers ─────────────────────────────────────────────────────────────── |
|||
|
|||
def key_ip_reqs(ip): |
|||
return f"rl:ip:{ip}:reqs" |
|||
|
|||
def key_ip_blocked(ip): |
|||
return f"rl:ip:{ip}:blocked" |
|||
|
|||
def key_user_reqs(ns, uid): |
|||
return f"rl:{ns}:user:{uid}:reqs" |
|||
|
|||
def key_user_blocked(ns, uid): |
|||
return f"rl:{ns}:user:{uid}:blocked" |
|||
|
|||
def key_ns_window(ns, ip, bucket): |
|||
return f"rl:{ns}:ip:{ip}:w:{bucket}" |
|||
|
|||
|
|||
def write(r, key, value, ttl, label): |
|||
if DRY_RUN: |
|||
print(f" [dry-run] SET {key!r} = {value!r} EX {ttl} ({label})") |
|||
return |
|||
if isinstance(value, int): |
|||
pipe = r.pipeline() |
|||
pipe.set(key, value, ex=ttl) |
|||
pipe.execute() |
|||
else: |
|||
r.set(key, value, ex=ttl) |
|||
print(f" SET {key!r} = {value!r} EX {ttl}s ({label})") |
|||
|
|||
|
|||
def delete_pattern(r, pattern): |
|||
keys = r.keys(pattern) |
|||
if keys: |
|||
r.delete(*keys) |
|||
print(f" DEL {len(keys)} keys matching {pattern!r}") |
|||
else: |
|||
print(f" (no keys matching {pattern!r})") |
|||
|
|||
|
|||
# ── main ──────────────────────────────────────────────────────────────────

def main():
    r = redis.from_url(REDIS_URL, db=RATELIMIT_DB, decode_responses=True)
    try:
        r.ping()
    except redis.ConnectionError as exc:
        print(f"ERROR: cannot connect to Redis at {REDIS_URL}: {exc}", file=sys.stderr)
        sys.exit(1)

    print(f"Redis: {REDIS_URL} DB={RATELIMIT_DB} dry_run={DRY_RUN} clear={CLEAR}")
    print()

    # ── clear mode ────────────────────────────────────────────────────────
    if CLEAR:
        print("=== Clearing synthetic test keys ===")
        for ip in TEST_IPS:
            delete_pattern(r, f"rl:ip:{ip}:*")
            delete_pattern(r, f"rl:{NAMESPACE}:ip:{ip}:*")
        for u in TEST_USERS:
            delete_pattern(r, f"rl:{NAMESPACE}:user:{u['uid']}:*")
        print("Done.")
        return

    # ── anonymous IP counters ─────────────────────────────────────────────
    print("=== Anonymous IP request counters (DB1) ===")
    write(r, key_ip_reqs(TEST_IPS[0]), 20, ANON_WINDOW, "below threshold")
    write(r, key_ip_reqs(TEST_IPS[1]), 35, ANON_WINDOW, "AT threshold → middleware will block on next req")
    write(r, key_ip_reqs(TEST_IPS[3]), 30, ANON_WINDOW, "below threshold")
    print()

    # ── blocked IPs ───────────────────────────────────────────────────────
    print("=== Blocked IPs (DB1) ===")
    write(r, key_ip_blocked(TEST_IPS[2]), "1", BLOCK_TTL, "hard-blocked")
    print()

    # ── namespace/IP sliding window ───────────────────────────────────────
    print("=== Namespace/IP sliding window (DB1) ===")
    bucket = int(time.time() // ANON_WINDOW)
    write(r, key_ns_window(NAMESPACE, TEST_IPS[3], bucket), 34, ANON_WINDOW * 2,
          "near window threshold (next req triggers ua_rotation block)")
    print()

    # ── authenticated user counters ───────────────────────────────────────
    print("=== Authenticated user request counters (DB1) ===")
    for u in TEST_USERS:
        if not u["blocked"]:
            write(r, key_user_reqs(NAMESPACE, u["uid"]), u["reqs"], AUTH_WINDOW,
                  f"uid={u['uid']} reqs={u['reqs']}")
    print()

    # ── blocked users ─────────────────────────────────────────────────────
    print("=== Blocked users (DB1) ===")
    for u in TEST_USERS:
        if u["blocked"]:
            write(r, key_user_blocked(NAMESPACE, u["uid"]), "1", BLOCK_TTL,
                  f"uid={u['uid']} hard-blocked")
    print()

    # ── summary ───────────────────────────────────────────────────────────
    if not DRY_RUN:
        all_keys = r.keys("rl:*")
        print(f"=== DB{RATELIMIT_DB} now contains {len(all_keys)} rl:* keys ===")
        for k in sorted(all_keys):
            ttl = r.ttl(k)
            val = r.get(k)
            print(f" {k!r:55s} val={val!r:5} ttl={ttl}s")


if __name__ == "__main__":
    main()
File diff suppressed because it is too large
@ -0,0 +1,385 @@ |
|||
""" |
|||
Unit tests for sapl/middleware/ratelimit.py. |
|||
|
|||
No database access is needed — all tests use RequestFactory and mocks. |
|||
Redis is never contacted; _incr_with_ttl is either mocked directly on the |
|||
middleware instance or the fallback non-atomic path is exercised via the |
|||
mock cache. |
|||
""" |
|||
|
|||
import pytest |
|||
from unittest.mock import MagicMock, patch |
|||
from django.test import RequestFactory |
|||
|
|||
from sapl.middleware.ratelimit import ( |
|||
_NAMESPACE, |
|||
_is_suspicious_headers, |
|||
_parse_rate, |
|||
get_client_ip, |
|||
make_ratelimit_cache_key, |
|||
RateLimitMiddleware, |
|||
RL_IP_BLOCKED, |
|||
RL_USER_BLOCKED, |
|||
smart_key, |
|||
smart_rate, |
|||
) |
|||
|
|||
# --------------------------------------------------------------------------- |
|||
# Shared test helpers |
|||
# --------------------------------------------------------------------------- |
|||
|
|||
_factory = RequestFactory() |
|||
|
|||
# Headers that a normal browser would send — used as the default baseline. |
|||
_NORMAL_HEADERS = { |
|||
'HTTP_ACCEPT': 'text/html,application/xhtml+xml', |
|||
'HTTP_ACCEPT_LANGUAGE': 'pt-BR,pt;q=0.9', |
|||
} |
|||
|
|||
|
|||
def _req(ip='1.2.3.4', ua='Mozilla/5.0', path='/', extra_meta=None): |
|||
"""GET request with sensible defaults and browser-like headers.""" |
|||
request = _factory.get(path) |
|||
request.META.update({'REMOTE_ADDR': ip, 'HTTP_USER_AGENT': ua, **_NORMAL_HEADERS}) |
|||
if extra_meta: |
|||
request.META.update(extra_meta) |
|||
return request |
|||
|
|||
|
|||
def _anon_req(**kwargs): |
|||
r = _req(**kwargs) |
|||
r.user = MagicMock(is_authenticated=False) |
|||
return r |
|||
|
|||
|
|||
def _auth_req(uid=7, **kwargs): |
|||
r = _req(**kwargs) |
|||
r.user = MagicMock(is_authenticated=True, pk=uid) |
|||
return r |
|||
|
|||
|
|||
def _make_middleware(whitelist=None, anon_rate='35/m', auth_rate='120/m'): |
|||
""" |
|||
Return (middleware, mock_cache). |
|||
|
|||
The ratelimit cache is replaced with a MagicMock whose .get() returns None |
|||
by default (nothing blocked, no counters set). Tests may replace |
|||
mock_cache.get.side_effect or mock mw._incr_with_ttl directly. |
|||
|
|||
sapl.middleware.ratelimit imports settings as `from sapl import settings` |
|||
(a direct module reference), so django.test.override_settings has no effect |
|||
on it. We patch the name in the ratelimit module's namespace instead. |
|||
""" |
|||
mock_cache = MagicMock() |
|||
mock_cache.get.return_value = None |
|||
get_response = MagicMock(return_value=MagicMock(status_code=200)) |
|||
|
|||
mock_settings = MagicMock() |
|||
mock_settings.RATE_LIMITER_RATE = anon_rate |
|||
mock_settings.RATE_LIMITER_RATE_AUTHENTICATED = auth_rate |
|||
mock_settings.RATE_LIMITER_RATE_BOT = '5/m' |
|||
mock_settings.RATE_LIMIT_WHITELIST_IPS = whitelist or [] |
|||
mock_settings.POD_NAMESPACE = _NAMESPACE # keep module-level _NAMESPACE consistent |
|||
|
|||
with ( |
|||
patch('sapl.middleware.ratelimit.caches') as mock_caches, |
|||
patch('sapl.middleware.ratelimit.settings', mock_settings), |
|||
): |
|||
mock_caches.__getitem__.return_value = mock_cache |
|||
mw = RateLimitMiddleware(get_response) |
|||
# __init__ already set mw._rl_cache = caches['ratelimit'] == mock_cache, |
|||
# but reassign explicitly so tests have a direct handle on the same object. |
|||
mw._rl_cache = mock_cache |
|||
return mw, mock_cache |
|||
|
|||
|
|||
# ---------------------------------------------------------------------------
# _parse_rate
# ---------------------------------------------------------------------------

@pytest.mark.parametrize('rate_str,expected', [
    ('35/m', (35, 60)),
    ('120/m', (120, 60)),
    ('10/s', (10, 1)),
    ('5/h', (5, 3600)),
    ('1/M', (1, 60)),  # period is case-insensitive
])
def test_parse_rate(rate_str, expected):
    assert _parse_rate(rate_str) == expected


# ---------------------------------------------------------------------------
# make_ratelimit_cache_key — pass-through, no mangling
# ---------------------------------------------------------------------------

def test_make_ratelimit_cache_key_passthrough():
    assert make_ratelimit_cache_key('rl:ip:1.2.3.4:reqs', 'some_prefix', 1) == 'rl:ip:1.2.3.4:reqs'
    assert make_ratelimit_cache_key('rl:abc123', '', 99) == 'rl:abc123'


# ---------------------------------------------------------------------------
# _is_suspicious_headers
# ---------------------------------------------------------------------------

def test_suspicious_both_headers_missing():
    r = _factory.get('/')
    r.META.pop('HTTP_ACCEPT', None)
    r.META.pop('HTTP_ACCEPT_LANGUAGE', None)
    assert _is_suspicious_headers(r) is True


def test_suspicious_one_header_missing_is_not_suspicious():
    """Only flagged when *both* headers are absent."""
    r = _factory.get('/')
    r.META['HTTP_ACCEPT'] = 'text/html'
    r.META.pop('HTTP_ACCEPT_LANGUAGE', None)
    assert _is_suspicious_headers(r) is False


def test_suspicious_both_headers_present():
    r = _factory.get('/')
    r.META['HTTP_ACCEPT'] = 'text/html'
    r.META['HTTP_ACCEPT_LANGUAGE'] = 'pt-BR'
    assert _is_suspicious_headers(r) is False


# ---------------------------------------------------------------------------
# get_client_ip — header priority and XFF chain
# ---------------------------------------------------------------------------

def test_get_client_ip_remote_addr():
    r = _factory.get('/')
    r.META['REMOTE_ADDR'] = '10.0.0.1'
    assert get_client_ip(r) == '10.0.0.1'


def test_get_client_ip_xff_single():
    r = _factory.get('/')
    r.META['HTTP_X_FORWARDED_FOR'] = '203.0.113.5'
    assert get_client_ip(r) == '203.0.113.5'


def test_get_client_ip_xff_chain_uses_leftmost():
    """The leftmost IP in XFF is the real client; the rest are proxies."""
    r = _factory.get('/')
    r.META['HTTP_X_FORWARDED_FOR'] = '203.0.113.5, 10.0.0.1, 10.0.0.2'
    assert get_client_ip(r) == '203.0.113.5'


def test_get_client_ip_x_real_ip_used_when_no_xff():
    r = _factory.get('/')
    r.META['REMOTE_ADDR'] = '127.0.0.1'
    r.META['HTTP_X_REAL_IP'] = '203.0.113.9'
    assert get_client_ip(r) == '203.0.113.9'


def test_get_client_ip_xff_preferred_over_x_real_ip():
    r = _factory.get('/')
    r.META['HTTP_X_FORWARDED_FOR'] = '203.0.113.1'
    r.META['HTTP_X_REAL_IP'] = '203.0.113.2'
    assert get_client_ip(r) == '203.0.113.1'


# ---------------------------------------------------------------------------
# smart_key / smart_rate
# ---------------------------------------------------------------------------

def test_smart_key_anon_returns_masked_ip():
    r = _anon_req(ip='5.5.5.5')
    assert smart_key(None, r) == '5.5.5.5'


def test_smart_key_auth_returns_pk_string():
    r = _auth_req(uid=42, ip='5.5.5.5')
    assert smart_key(None, r) == '42'


def test_smart_rate_anon_returns_anon_rate():
    with patch('sapl.middleware.ratelimit.settings') as mock_s:
        mock_s.RATE_LIMITER_RATE = '35/m'
        mock_s.RATE_LIMITER_RATE_AUTHENTICATED = '120/m'
        assert smart_rate(None, _anon_req()) == '35/m'


def test_smart_rate_auth_returns_auth_rate():
    with patch('sapl.middleware.ratelimit.settings') as mock_s:
        mock_s.RATE_LIMITER_RATE = '35/m'
        mock_s.RATE_LIMITER_RATE_AUTHENTICATED = '120/m'
        assert smart_rate(None, _auth_req()) == '120/m'


# ---------------------------------------------------------------------------
# RateLimitMiddleware — whitelisted IP bypasses everything (including bad UA)
# ---------------------------------------------------------------------------

def test_whitelist_bypasses_all_checks():
    mw, mock_cache = _make_middleware(whitelist=['1.2.3.4'])
    result = mw._evaluate(_anon_req(ip='1.2.3.4', ua='GPTBot/1.0'))
    assert result == {'action': 'pass', 'ip': '1.2.3.4'}
    mock_cache.get.assert_not_called()


# ---------------------------------------------------------------------------
# Check 1 — known bot User-Agent
# ---------------------------------------------------------------------------

@pytest.mark.parametrize('ua', [
    'GPTBot/1.0',
    'Mozilla/5.0 (compatible; ClaudeBot/1.0)',
    'PerplexityBot',
    'Bytespider',
    'AhrefsBot/7.0',
    'meta-externalagent/1.1',
    'OAI-SearchBot',
    'Mozilla/5.0 (compatible; bingbot/2.0)',
    'SERankingBacklinksBot/1.0',
    'Mozilla/5.0 AppleWebKit Chrome/98.0.4758.80',
])
def test_known_bot_ua_blocked(ua):
    mw, _ = _make_middleware()
    result = mw._evaluate(_anon_req(ua=ua))
    assert result == {'action': 'block', 'reason': 'known_ua', 'ip': '1.2.3.4'}


def test_bot_ua_check_is_case_insensitive():
    mw, _ = _make_middleware()
    result = mw._evaluate(_anon_req(ua='gptbot/2.0'))
    assert result['reason'] == 'known_ua'


# ---------------------------------------------------------------------------
# Check 2 — IP already blocked in cache
# ---------------------------------------------------------------------------

def test_ip_blocked_in_cache():
    mw, mock_cache = _make_middleware()
    ip = '1.2.3.4'
    mock_cache.get.side_effect = lambda key: 1 if key == RL_IP_BLOCKED.format(ip=ip) else None
    result = mw._evaluate(_anon_req(ip=ip))
    assert result == {'action': 'block', 'reason': 'ip_blocked', 'ip': ip}


# ---------------------------------------------------------------------------
# Check 3a — authenticated user blocked in cache
# ---------------------------------------------------------------------------

def test_auth_user_blocked_in_cache():
    mw, mock_cache = _make_middleware()
    uid = '7'
    mock_cache.get.side_effect = lambda key: (
        1 if key == RL_USER_BLOCKED.format(ns=_NAMESPACE, uid=uid) else None
    )
    result = mw._evaluate(_auth_req(uid=int(uid)))
    assert result == {'action': 'block', 'reason': 'user_blocked', 'ip': '1.2.3.4'}


# ---------------------------------------------------------------------------
# Check 3b — authenticated + suspicious headers
# ---------------------------------------------------------------------------

def test_auth_suspicious_headers_blocked():
    mw, _ = _make_middleware()
    r = _auth_req()
    r.META.pop('HTTP_ACCEPT', None)
    r.META.pop('HTTP_ACCEPT_LANGUAGE', None)
    result = mw._evaluate(r)
    assert result == {'action': 'block', 'reason': 'suspicious_headers_auth', 'ip': '1.2.3.4'}


# ---------------------------------------------------------------------------
# Check 3c — authenticated request rate
# ---------------------------------------------------------------------------

def test_auth_rate_exceeded_blocks_and_marks_user_blocked():
    mw, mock_cache = _make_middleware(auth_rate='5/m')
    mw._incr_with_ttl = MagicMock(return_value=5)  # exactly at threshold
    result = mw._evaluate(_auth_req(uid=7))
    assert result == {'action': 'block', 'reason': 'auth_user_rate', 'ip': '1.2.3.4'}
    mock_cache.set.assert_called_once_with(
        RL_USER_BLOCKED.format(ns=_NAMESPACE, uid='7'),
        1,
        timeout=RateLimitMiddleware.BLOCK_TTL,
    )


def test_auth_under_rate_passes():
    mw, mock_cache = _make_middleware(auth_rate='5/m')
    mw._incr_with_ttl = MagicMock(return_value=4)  # one below threshold
    result = mw._evaluate(_auth_req(uid=7))
    assert result == {'action': 'pass', 'ip': '1.2.3.4'}
    mock_cache.set.assert_not_called()


# ---------------------------------------------------------------------------
# Check 4a — anonymous + suspicious headers
# ---------------------------------------------------------------------------

def test_anon_suspicious_headers_blocked():
    mw, _ = _make_middleware()
    r = _anon_req()
    r.META.pop('HTTP_ACCEPT', None)
    r.META.pop('HTTP_ACCEPT_LANGUAGE', None)
    result = mw._evaluate(r)
    assert result == {'action': 'block', 'reason': 'suspicious_headers', 'ip': '1.2.3.4'}


# ---------------------------------------------------------------------------
# Check 4b — anonymous IP request rate
# ---------------------------------------------------------------------------

def test_anon_ip_rate_exceeded_blocks_and_marks_ip_blocked():
    mw, mock_cache = _make_middleware(anon_rate='5/m')
    mw._incr_with_ttl = MagicMock(return_value=5)  # first call (IP counter) hits threshold
    result = mw._evaluate(_anon_req())
    assert result == {'action': 'block', 'reason': 'ip_rate', 'ip': '1.2.3.4'}
    mock_cache.set.assert_called_once_with(
        RL_IP_BLOCKED.format(ip='1.2.3.4'),
        1,
        timeout=RateLimitMiddleware.BLOCK_TTL,
    )


# ---------------------------------------------------------------------------
# Check 4c — per-namespace/IP/window (UA rotation detection)
# ---------------------------------------------------------------------------

def test_anon_ua_rotation_detected_blocks_and_marks_ip_blocked():
    mw, mock_cache = _make_middleware(anon_rate='5/m')
    # First call (IP counter) is under threshold; second (window counter) hits it.
    mw._incr_with_ttl = MagicMock(side_effect=[4, 5])
    result = mw._evaluate(_anon_req())
    assert result == {'action': 'block', 'reason': 'ua_rotation', 'ip': '1.2.3.4'}
    mock_cache.set.assert_called_once_with(
        RL_IP_BLOCKED.format(ip='1.2.3.4'),
        1,
        timeout=RateLimitMiddleware.BLOCK_TTL,
    )


def test_anon_under_all_thresholds_passes():
    mw, mock_cache = _make_middleware(anon_rate='5/m')
    mw._incr_with_ttl = MagicMock(return_value=4)  # both counters below threshold
    result = mw._evaluate(_anon_req())
    assert result == {'action': 'pass', 'ip': '1.2.3.4'}
    mock_cache.set.assert_not_called()


# ---------------------------------------------------------------------------
# __call__ — block returns 429, pass forwards to get_response
# ---------------------------------------------------------------------------

def test_call_block_returns_429_with_retry_after_header():
    mw, _ = _make_middleware()
    mw._evaluate = MagicMock(return_value={'action': 'block', 'reason': 'known_ua', 'ip': '1.2.3.4'})
    response = mw(_factory.get('/'))
    assert response.status_code == 429
    assert response['Retry-After'] == str(RateLimitMiddleware.BLOCK_TTL)
    mw.get_response.assert_not_called()


def test_call_pass_forwards_request_to_get_response():
    mw, _ = _make_middleware()
    mw._evaluate = MagicMock(return_value={'action': 'pass', 'ip': '1.2.3.4'})
    request = _anon_req()
    mw(request)
    mw.get_response.assert_called_once_with(request)
@ -0,0 +1,83 @@ |
|||
#!/usr/bin/env python3 |
|||
""" |
|||
Script to test rate limiting of an endpoint. |
|||
""" |
|||
|
|||
import argparse |
|||
import time |
|||
import requests |
|||
from collections import defaultdict |
|||
from urllib.parse import urlparse |
|||
|
|||
|
|||
def test_rate_limiter(url, num_requests=50, delay=0.1, timeout=10): |
|||
"""Send multiple requests and analyze rate limiting behavior.""" |
|||
parsed = urlparse(url) |
|||
if not parsed.scheme or not parsed.netloc: |
|||
raise ValueError( |
|||
"URL must include a protocol and host, e.g. http://localhost or https://example.com" |
|||
) |
|||
if parsed.scheme not in {"http", "https"}: |
|||
raise ValueError("Unsupported URL scheme: %s. Use http or https." % parsed.scheme) |
|||
|
|||
status_counts = defaultdict(int) |
|||
response_times = [] |
|||
first_rate_limited_at = None |
|||
attempted_requests = 0 |
|||
|
|||
print(f"Testing rate limiter on: {url}") |
|||
print(f"Number of requests: {num_requests}") |
|||
print(f"Delay between requests: {delay}s") |
|||
print("-" * 50) |
|||
|
|||
for i in range(num_requests): |
|||
attempted_requests += 1 |
|||
try: |
|||
start_time = time.time() |
|||
response = requests.get(url, timeout=timeout) |
|||
elapsed = time.time() - start_time |
|||
|
|||
status_counts[response.status_code] += 1 |
|||
response_times.append(elapsed) |
|||
|
|||
print(f"Request {i+1:3d}: Status {response.status_code} | Time: {elapsed:.3f}s") |
|||
|
|||
if response.status_code == 429: |
|||
if first_rate_limited_at is None: |
|||
first_rate_limited_at = i + 1 |
|||
print(f" -> Rate limited on request {i+1}") |
|||
break |
|||
|
|||
except requests.exceptions.RequestException as e: |
|||
print(f"Request {i+1:3d}: Error - {e}") |
|||
status_counts['ERROR'] += 1 |
|||
|
|||
if i < num_requests - 1: |
|||
time.sleep(delay) |
|||
|
|||
print("-" * 50) |
|||
print("\nSummary:") |
|||
print(f" Total requests attempted: {attempted_requests}") |
|||
print(f" Successful (200): {status_counts.get(200, 0)}") |
|||
print(f" Rate limited (429): {status_counts.get(429, 0)}") |
|||
if first_rate_limited_at is not None: |
|||
print(f" First 429 occurred at request: {first_rate_limited_at}") |
|||
print(f" Other errors: {sum(v for k, v in status_counts.items() if k not in [200, 429, 'ERROR'])}") |
|||
|
|||
if response_times: |
|||
avg_time = sum(response_times) / len(response_times) |
|||
print(f"\nAverage response time: {avg_time:.3f}s") |
|||
|
|||
|
|||
if __name__ == "__main__": |
|||
parser = argparse.ArgumentParser(description="Test rate limiter of a URL") |
|||
parser.add_argument( |
|||
"url", |
|||
help="URL to test, including protocol (http:// or https://)", |
|||
) |
|||
parser.add_argument("-n", "--num-requests", type=int, default=50, help="Number of requests") |
|||
parser.add_argument("-d", "--delay", type=float, default=0.1, help="Delay between requests (seconds)") |
|||
parser.add_argument("-t", "--timeout", type=int, default=10, help="Request timeout (seconds)") |
|||
|
|||
args = parser.parse_args() |
|||
test_rate_limiter(args.url, args.num_requests, args.delay, args.timeout) |
|||
@ -1,14 +0,0 @@ |
|||
#!/bin/bash |
|||
|
|||
#URL=http://localhost:8000/materia/4379 |
|||
#URL=http://localhost:8000/norma/pesquisar |
|||
#URL=http://localhost/norma/pesquisar |
|||
#URL=https://sapl31demo.interlegis.leg.br/docadm/45 |
|||
#URL=https://sapl.joaopessoa.pb.leg.br/materia/186300 |
|||
#URL=http://localhost:8000/materia/4379/materiaassunto |
|||
#URL=http://localhost:8000/sessao/4984 |
|||
URL="http://localhost:8000/docadm/pesq-doc-adm?tipo=&o=&numero=&complemento=&ano=&protocolo__numero=&numero_externo=&data_0=&data_1=&interessado=&assunto=&tramitacao=&tramitacaoadministrativo__status=&tramitacaoadministrativo__unidade_tramitacao_destino=&pesquisar=Pesquisar" |
|||
|
|||
for i in $(seq 1 12); do |
|||
curl -sS -o /dev/null -w "req=$i http=%{http_code} time=%{time_total}\n" "$URL" |
|||
done |
|||