FastAPI Rate Limiting Errors Guide

You’ve wired up SlowAPI, deployed to production, and discovered your rate limiter either blocks every user on the first request or does nothing at all. FastAPI rate limiting has a handful of sharp edges that are easy to miss — and most of them only surface in production.

TLDR: The most common causes are wrong IP detection behind a proxy, in-memory counters not shared across Uvicorn workers, a missing request: Request parameter in your route, incorrect decorator order, and a 429 response that doesn’t match your API’s error schema. Each has a straightforward fix.

How SlowAPI Works with FastAPI

SlowAPI is a port of Flask-Limiter adapted for ASGI frameworks. It hooks into FastAPI’s request lifecycle through two mechanisms: a Limiter object attached to app.state, and an exception handler for RateLimitExceeded.

The minimal setup looks like this:

from fastapi import FastAPI, Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.util import get_remote_address
from slowapi.errors import RateLimitExceeded

limiter = Limiter(key_func=get_remote_address)
app = FastAPI()
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.get("/items")
@limiter.limit("10/minute")
async def list_items(request: Request):
    return {"items": []}

Three pieces are required: the app.state.limiter assignment, the exception handler registration, and a request: Request parameter in every rate-limited route. Miss one and the failure is rarely obvious. Depending on which piece is missing and which SlowAPI version you run, the symptoms range from an exception when the module is imported to limits that are never enforced.

SlowAPI stores counters using a pluggable storage backend. By default it uses in-memory storage backed by a dict inside the Limiter instance. That’s fine for single-process development but breaks the moment you run more than one worker.
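To make the storage point concrete, here is a toy fixed-window counter in plain Python. It is a sketch of the general idea only, not the code SlowAPI or the limits library actually runs; FixedWindowCounter and its hit method are hypothetical names chosen for illustration.

```python
import time
from collections import defaultdict


class FixedWindowCounter:
    """Toy limiter: at most `limit` hits per `window` seconds for each key."""

    def __init__(self, limit: int, window: float, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the behaviour can be checked without sleeping
        # Maps (key, window index) -> hits seen in that window.
        # The dict lives in a single process, which is why this approach
        # stops working once several workers each hold their own copy.
        # Old windows are never pruned here; a real backend expires them.
        self.hits = defaultdict(int)

    def hit(self, key: str) -> bool:
        """Record one request and return True if it is allowed."""
        window_index = int(self.clock() // self.window)
        bucket = (key, window_index)
        if self.hits[bucket] >= self.limit:
            return False
        self.hits[bucket] += 1
        return True
```

With limit=10 and window=60 this behaves like "10/minute": the eleventh call to hit for the same key inside one window returns False, and the count starts over when the next window begins.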

Common Pitfalls

Wrong IP detection behind a reverse proxy

This is the most frequent production failure. When FastAPI runs behind Nginx, AWS ALB, or any load balancer, request.client.host — which is what get_remote_address reads — is the proxy’s IP, not your user’s IP. Every single user appears to come from the same address, so they share one limit pool and get blocked together on the first burst.

The fix is reading the X-Forwarded-For header:

from fastapi import Request
from slowapi.util import get_remote_address

def get_client_ip(request: Request) -> str:
    forwarded = request.headers.get("X-Forwarded-For")
    if forwarded:
        # First address in the chain is the original client
        return forwarded.split(",")[0].strip()
    return get_remote_address(request)

limiter = Limiter(key_func=get_client_ip)

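One caveat: only trust X-Forwarded-For when it’s set by infrastructure you control, because a client can send that header itself with any address in it. Taking the first entry, as the code above does, is only safe when your edge proxy overwrites the header rather than appending to it. By default an AWS ALB appends the client address to whatever the client sent, and Nginx’s proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; does the same, so in both cases the leftmost entry can be forged and the entry your proxy added sits at the right. If Nginx is your edge, proxy_set_header X-Forwarded-For $remote_addr; discards client-supplied values instead.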
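When a proxy appends to X-Forwarded-For rather than overwriting it, a more defensive approach is to count entries from the right, skipping only the hops you operate. The helper below is a sketch under that assumption; client_ip_from_xff and its trusted_hops parameter are hypothetical names, not part of SlowAPI.

```python
def client_ip_from_xff(xff_header: str, peer_ip: str, trusted_hops: int = 1) -> str:
    """Return the address appended by the outermost proxy you control.

    trusted_hops is the number of your own proxies between the client and
    the app, assuming each of them appends exactly one entry to the header.
    """
    entries = [part.strip() for part in xff_header.split(",") if part.strip()]
    if trusted_hops < 1 or len(entries) < trusted_hops:
        # Header missing or shorter than expected: fall back to the socket peer
        return peer_ip
    # Anything to the left of this entry was supplied by the client
    return entries[-trusted_hops]
```

Inside a key function this could be called as client_ip_from_xff(request.headers.get("X-Forwarded-For", ""), request.client.host). With a single ALB in front of the app, trusted_hops=1 picks the address the ALB appended, even if the client sent a forged value ahead of it.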

For APIs with authentication, keying by user ID is even more reliable than IP because it survives NAT, shared offices, and mobile network switches:

def get_rate_limit_key(request: Request) -> str:
    user_id = getattr(request.state, "user_id", None)
    if user_id:
        return f"user:{user_id}"
    # Unauthenticated requests fall back to IP
    forwarded = request.headers.get("X-Forwarded-For", "")
    if forwarded:
        return forwarded.split(",")[0].strip()
    return get_remote_address(request)

Missing `request: Request` parameter

SlowAPI needs the incoming request to extract the key and check the counter. It finds it by inspecting the route function’s signature at decoration time and looking for a parameter named request (or websocket). The match is on the parameter name, not the type annotation, so req: Request does not count. Recent SlowAPI releases raise an exception at import time (No "request" or "websocket" argument on function ...) when the parameter is missing; if your version lets the route load anyway, assume the limit is not being enforced.

# ❌ No parameter named request, so SlowAPI cannot hook into this route
@app.get("/data")
@limiter.limit("10/minute")
async def get_data(item_id: int):
    return {"id": item_id}

# ✅ request must appear in the signature
@app.get("/data")
@limiter.limit("10/minute")
async def get_data(request: Request, item_id: int):
    return {"id": item_id}

Adding the parameter costs nothing: Request is a standard FastAPI parameter that gets injected automatically, and the function body does not have to use it. Whatever your SlowAPI version does, add a test that exceeds the limit and asserts a 429 so a regression here can’t go unnoticed.

Decorator order matters

The @limiter.limit() decorator must be the inner decorator, directly above the function definition. Decorators apply bottom-up, and @app.get registers whatever function it is handed. If you swap the order, FastAPI registers the bare function first; SlowAPI then wraps it, but the router never calls that wrapper, so the limit never fires.

# ❌ Wrong: the route is registered before the limiter wraps the function
@limiter.limit("5/minute")
@app.get("/search")
async def search(request: Request, q: str):
    return {"results": []}

# ✅ Correct: the route decorator is outermost and registers the limited function
@app.get("/search")
@limiter.limit("5/minute")
async def search(request: Request, q: str):
    return {"results": []}

This is plain Python decorator ordering. In the correct order, @limiter.limit wraps the handler first and @app.get registers the wrapped version. Reversed, @app.get has already stored a reference to the plain coroutine by the time SlowAPI wraps it.
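The effect of decorator order can be reproduced without FastAPI or SlowAPI. In the sketch below, route and limited are hypothetical stand-ins: one mimics a decorator that registers a function and returns it unchanged, the other mimics a decorator that returns a wrapper.

```python
import functools

routes = {}  # stand-in for a framework's route table


def route(path):
    """Mimics a registering decorator: store the function, return it unchanged."""
    def decorator(func):
        routes[path] = func
        return func
    return decorator


def limited(func):
    """Mimics a wrapping decorator: return a new function that adds a check."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.checks += 1  # the "rate limit check" that should run on every call
        return func(*args, **kwargs)
    wrapper.checks = 0
    return wrapper


@route("/good")   # applied second: registers the wrapper
@limited          # applied first: wraps the handler
def good():
    return "ok"


@limited          # applied second: wraps, but the route table never sees this wrapper
@route("/bad")    # applied first: registers the bare handler
def bad():
    return "ok"


routes["/good"]()
routes["/bad"]()
print(good.checks)  # 1: the registered callable is the wrapper
print(bad.checks)   # 0: the registered callable bypasses the wrapper
```

The framework only ever calls what is in its route table, so a wrapper applied after registration is dead code from the router's point of view.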

In-memory counters don’t survive across workers

The default in-memory storage keeps its counters inside the Limiter object. Each Uvicorn worker process has its own memory space and therefore its own copy of those counters. With --workers 4, a user can make up to 4× your intended limit in total, because each worker counts only the requests it happens to receive.
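A toy simulation shows the arithmetic. It assumes requests are spread round-robin across workers, which is a simplification (real distribution depends on the OS and any load balancer), and the handle function is a hypothetical stand-in for a per-process limiter.

```python
from collections import Counter
from itertools import cycle

LIMIT = 10      # intended requests per window
WORKERS = 4     # mirrors --workers 4

# One private counter per worker process, as with in-memory storage
per_worker = [Counter() for _ in range(WORKERS)]


def handle(worker_id: int, key: str) -> bool:
    """Return True if this worker lets the request through."""
    if per_worker[worker_id][key] >= LIMIT:
        return False
    per_worker[worker_id][key] += 1
    return True


# Round-robin assignment is an assumption made for the sake of the example
assignments = cycle(range(WORKERS))
allowed = sum(handle(next(assignments), "1.2.3.4") for _ in range(100))
print(allowed)  # 40: each worker admits LIMIT requests before it starts refusing
```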

Switch to Redis for shared state:

# pip install slowapi redis
limiter = Limiter(
    key_func=get_client_ip,
    storage_uri="redis://localhost:6379/0"
)

That’s the only change needed. SlowAPI handles the Redis connection and key management internally. All workers read and write the same counters.

Check the connection at startup so an unreachable or misconfigured Redis is caught at deploy time rather than on the first rate-limited request:

import redis as redis_lib

@app.on_event("startup")
async def verify_redis():
    try:
        r = redis_lib.from_url("redis://localhost:6379/0")
        r.ping()
    except redis_lib.ConnectionError as exc:
        # Chain the original error so the startup log shows the underlying cause
        raise RuntimeError("Redis unavailable; rate limiting will not work correctly") from exc

If you can’t run Redis in your environment, limits (SlowAPI’s underlying library) also supports Memcached with memcached://localhost:11211.

429 response format mismatch

SlowAPI’s built-in _rate_limit_exceeded_handler returns a minimal JSON body with a single error key, such as {"error": "Rate limit exceeded: 10 per 1 minute"}. If the rest of your API returns errors in a different structure, this inconsistency confuses clients and breaks error handling code that expects a specific schema.

Replace the default handler with a custom one:

from fastapi import Request
from fastapi.responses import JSONResponse
from slowapi.errors import RateLimitExceeded

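async def rate_limit_exceeded_handler(
    request: Request, exc: RateLimitExceeded
) -> JSONResponse:
    # RateLimitExceeded carries the violated limit but no retry_after attribute.
    # The full window length is used here as a conservative upper bound.
    retry_after = exc.limit.limit.get_expiry()
    return JSONResponse(
        status_code=429,
        content={
            "error": "rate_limit_exceeded",
            "message": f"Too many requests. Limit: {exc.limit.limit}.",
            "retry_after": retry_after,
        },
        headers={"Retry-After": str(retry_after)},
    )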

app.add_exception_handler(RateLimitExceeded, rate_limit_exceeded_handler)

Always include the Retry-After header. It’s part of the HTTP spec for 429 responses, and smart API clients use it to implement automatic backoff without any extra configuration on your end.
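For reference, this is roughly what a client does with that header. The sketch below handles both forms the header can take, a number of seconds or an HTTP date; retry_delay_seconds and its default parameter are hypothetical names, and the one-second fallback is an arbitrary choice.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_delay_seconds(
    retry_after: Optional[str],
    now: Optional[datetime] = None,
    default: float = 1.0,
) -> float:
    """Turn a Retry-After header value into a non-negative delay in seconds."""
    if not retry_after:
        return default
    value = retry_after.strip()
    if value.isdigit():
        # delay-seconds form, e.g. "Retry-After: 30"
        return float(value)
    try:
        # HTTP-date form, e.g. "Retry-After: Wed, 21 Oct 2015 07:28:00 GMT"
        target = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if target.tzinfo is None:
        # Treat a date without zone information as UTC
        target = target.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (target - now).total_seconds())
```

A client would sleep for the returned number of seconds before retrying, which only works if the server sends the header in the first place.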

Real-World Examples

Production API with global and per-route limits

Most APIs need two layers: a global limit that prevents abuse of any endpoint, and tighter limits on expensive operations.

from fastapi import FastAPI, Request
from slowapi import Limiter
from slowapi.middleware import SlowAPIMiddleware
from slowapi.errors import RateLimitExceeded
from fastapi.responses import JSONResponse

def get_client_ip(request: Request) -> str:
    forwarded = request.headers.get("X-Forwarded-For", "")
    return forwarded.split(",")[0].strip() if forwarded else request.client.host

limiter = Limiter(
    key_func=get_client_ip,
    default_limits=["200/minute"],        # Applied to every route
    storage_uri="redis://localhost:6379/0"
)

app = FastAPI()
app.state.limiter = limiter
app.add_middleware(SlowAPIMiddleware)     # Enforces default_limits globally

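async def custom_429(request: Request, exc: RateLimitExceeded) -> JSONResponse:
    # The exception has no retry_after attribute; use the window length as an upper bound
    retry_after = exc.limit.limit.get_expiry()
    return JSONResponse(
        status_code=429,
        content={"error": "rate_limit_exceeded", "retry_after": retry_after},
        headers={"Retry-After": str(retry_after)},
    )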

app.add_exception_handler(RateLimitExceeded, custom_429)

# This route gets the global 200/minute limit automatically
@app.get("/products")
async def list_products(request: Request):
    return {"products": []}

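# override_defaults=False keeps the global limit in force alongside the stricter
# per-route limit; without it, the route limit replaces default_limits
@app.post("/export")
@limiter.limit("5/minute", override_defaults=False)
async def export_data(request: Request):
    return {"status": "queued"}

The SlowAPIMiddleware enforces default_limits without requiring a decorator on every route. By default, an individual @limiter.limit() decorator replaces the default limits for its route. Pass override_defaults=False when you want the route limit stacked on top of the global one, so that both must pass for the request to proceed.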

Testing rate limiting logic

FastAPI’s TestClient runs the app in-process, which means rate limit state persists across requests within a test. That’s what you want for testing enforcement:

from fastapi.testclient import TestClient
import pytest

def test_rate_limit_enforced():
    client = TestClient(app)

    # /export is a POST route; the first 5 requests should succeed
    for _ in range(5):
        response = client.post("/export", headers={"X-Forwarded-For": "1.2.3.4"})
        assert response.status_code == 200

    # 6th request should be blocked
    response = client.post("/export", headers={"X-Forwarded-For": "1.2.3.4"})
    assert response.status_code == 429
    assert "retry_after" in response.json()
    assert response.headers.get("Retry-After") is not None

def test_different_ips_not_affected():
    client = TestClient(app)

    # Exhaust limit for one IP
    for _ in range(5):
        client.post("/export", headers={"X-Forwarded-For": "1.2.3.4"})

    # Different IP should still work
    response = client.post("/export", headers={"X-Forwarded-For": "5.6.7.8"})
    assert response.status_code == 200

Counters also persist between tests, so reset them before each one. For unit tests, a Limiter configured with storage_uri="memory://" avoids the Redis dependency entirely; fakeredis is an option when you need Redis semantics without a running server.

import pytest

@pytest.fixture(autouse=True)
def reset_limiter():
    # Limiter.reset() clears all counters if the storage backend supports it
    app.state.limiter.reset()
    yield

Advanced Tips

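Dynamic rate limits by user tier: Rate limits don’t have to be static strings. SlowAPI accepts a callable in place of the limit string. In current SlowAPI releases the callable is not passed the request: it is called either with no arguments or, if it declares a parameter named key, with the string returned by the key function. Encoding the plan in the key is one way to get it across:

def get_plan_key(request: Request) -> str:
    # Prefix the key with the plan so the limit callable can read it back
    plan = getattr(request.state, "plan", "free")
    return f"{plan}:{get_client_ip(request)}"

def get_limit_for_key(key: str) -> str:
    # The parameter must be named key for SlowAPI to pass the key in
    plan = key.split(":", 1)[0]
    limits = {
        "free": "30/minute",
        "pro": "200/minute",
        "enterprise": "1000/minute",
    }
    return limits.get(plan, "30/minute")

@app.get("/api/data")
@limiter.limit(get_limit_for_key, key_func=get_plan_key)
async def get_data(request: Request):
    ...

Set request.state.plan in your authentication middleware before the route handler runs.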

Exempting internal routes: Health checks and metrics endpoints shouldn’t count against rate limits. Use @limiter.exempt:

@app.get("/health")
@limiter.exempt
async def health_check():
    return {"status": "ok"}

Rate limiting with FastAPI dependencies: If you use FastAPI’s dependency injection for authentication, the request object is still available. Just inject it explicitly:

from fastapi import Depends

async def get_current_user(request: Request):
    # Your auth logic here
    return {"user_id": "123"}

@app.get("/profile")
@limiter.limit("20/minute")
async def get_profile(
    request: Request,
    current_user: dict = Depends(get_current_user),
):
    return current_user

The request parameter must remain in the function signature even when using dependencies — FastAPI won’t add it automatically just because you’re using Depends.

For deeper issues with async route behavior and concurrency, see the FastAPI async/sync blocking debugging guide; for dependency injection errors that can interact with middleware, see the FastAPI dependency injection errors guide.

Key Takeaways

Behind a proxy? Replace get_remote_address with a function that reads X-Forwarded-For
Multiple workers? Switch to storage_uri="redis://..." — in-memory counters aren’t shared
Nothing enforcing? Verify decorator order (@app.get outermost, @limiter.limit innermost) and keep request: Request in the route signature
Inconsistent 429s? Replace _rate_limit_exceeded_handler with a custom handler returning your API’s error schema plus a Retry-After header
Global limits? Use default_limits in the Limiter constructor and add SlowAPIMiddleware — no decorator needed on every route

Rate limiting in FastAPI is reliable once you’ve addressed these failure modes. The proxy IP issue trips up almost everyone in their first production deployment; the multi-worker counter issue shows up shortly after.
