WebSocket connections in FastAPI drop unexpectedly more often than you’d think — and the error message rarely tells you why.
## TL;DR: Common Reasons FastAPI WebSockets Drop

- Unhandled `WebSocketDisconnect` — the exception is raised when the client leaves, but if you don’t catch it your handler crashes silently
- Blocking synchronous code in an async WebSocket handler — freezes the event loop and starves all other connections
- No keepalive / ping-pong — load balancers and proxies kill idle connections after 30–60 seconds
- Message size exceeding the server’s limit — causes an abrupt disconnect with no helpful error
- Shared mutable state in broadcast loops — raises `RuntimeError` or silently drops messages when iterating over a set that’s being modified

Diagnostic tip: wrap your `websocket.receive_*()` call in a try/except and call `logging.exception` inside the handler. If connections die silently, that’s the first place to look.
## Diagnosing the Drop
Before diving into fixes, you need to know which cause you’re dealing with. Here’s a quick checklist:
- Is the drop immediate (on connect)? → likely a missing or malformed handshake, or an unhandled exception thrown before the first `await websocket.accept()`.
- Does it drop after ~30–60 seconds of silence? → almost certainly a keepalive/proxy timeout issue.
- Does it drop only under load? → event loop blocking or shared-state corruption in your broadcast logic.
- Does it drop after sending a large message? → message size limit.
- Does the client-side console show code `1006` (Abnormal Closure)? → the server closed the connection without a proper close frame, usually because an unhandled exception terminated the handler.
Enable Uvicorn’s access log and set your log level to DEBUG during development:
```shell
uvicorn app.main:app --reload --log-level debug
```
You’ll see lines like `WebSocket /ws 101 Switching Protocols` on connect and the disconnect reason code on close. That alone narrows things down significantly.
## Cause #1: Unhandled WebSocketDisconnect
This is the single most common mistake. When a client closes their browser tab, refreshes, or loses connectivity, Starlette raises starlette.websockets.WebSocketDisconnect inside your receive call. If you don’t catch it, the exception propagates up, your handler exits with a traceback, and the connection is dropped abruptly.
The broken pattern:

```python
# This code triggers the error:
from fastapi import FastAPI, WebSocket

app = FastAPI()

@app.websocket("/ws")
async def websocket_endpoint(ws: WebSocket):
    await ws.accept()
    while True:
        data = await ws.receive_text()  # raises WebSocketDisconnect on client close
        await ws.send_text(f"echo: {data}")
```
If the client disconnects, you’ll see this in your logs:

```text
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  ...
starlette.websockets.WebSocketDisconnect: code=1001
```
And every client that was connected at that moment might also be affected if you’re running a shared broadcast loop — because the uncaught exception unwinds the entire handler.
The fix:

```python
from fastapi import FastAPI, WebSocket
from starlette.websockets import WebSocketDisconnect
import logging

logger = logging.getLogger(__name__)
app = FastAPI()

@app.websocket("/ws")
async def websocket_endpoint(ws: WebSocket):
    await ws.accept()
    try:
        while True:
            data = await ws.receive_text()
            await ws.send_text(f"echo: {data}")
    except WebSocketDisconnect as e:
        logger.info("Client disconnected: code=%s", e.code)
        # Clean up any per-connection state here
    except Exception:
        logger.exception("Unexpected error in WebSocket handler")
        await ws.close(code=1011)  # Internal error close code
```
The key detail: `WebSocketDisconnect` is a subclass of `Exception`, so a generic `except Exception` would swallow it too. Catch it explicitly, and before any generic handler, so a routine client disconnect is logged at info level instead of being treated as a server error.
## Cause #2: Blocking Synchronous Code in the Handler
FastAPI runs on an async event loop. If your WebSocket handler calls blocking I/O — a synchronous database query, a time.sleep(), a requests.get(), or a heavy CPU computation — the entire event loop freezes for the duration of that call. Other WebSocket connections can’t send or receive data, and eventually their clients time out and disconnect.
This problem is especially insidious because it only shows up under load. During development with one test client it seems fine.
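You can see the effect with plain asyncio, no server required. In this sketch, three "handlers" that call `time.sleep` run one after another because each one blocks the loop, while `asyncio.sleep` versions overlap:

```python
import asyncio
import time

async def blocking_handler():
    time.sleep(0.2)  # blocks the whole event loop for 200 ms

async def polite_handler():
    await asyncio.sleep(0.2)  # yields control back to the loop

async def timed(handler):
    # Run three "connections" concurrently and measure wall time
    start = time.perf_counter()
    await asyncio.gather(handler(), handler(), handler())
    return time.perf_counter() - start

blocking = asyncio.run(timed(blocking_handler))  # ~0.6 s: the sleeps serialize
polite = asyncio.run(timed(polite_handler))      # ~0.2 s: the sleeps overlap
```

With one client you'd never notice the 200 ms; with dozens, the serialized delays stack up until clients time out.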
```python
# Problem: synchronous DB call blocks the event loop
import time
from fastapi import FastAPI, WebSocket

app = FastAPI()

def get_user_from_db(user_id: str):
    time.sleep(0.5)  # Simulates a slow sync query
    return {"id": user_id, "name": "Alice"}

@app.websocket("/ws/{user_id}")
async def ws(websocket: WebSocket, user_id: str):
    await websocket.accept()
    # This blocks the entire event loop for 500 ms!
    user = get_user_from_db(user_id)
    await websocket.send_json(user)
```
Fix — use `run_in_executor` for sync blocking calls:

```python
import asyncio
from fastapi import FastAPI, WebSocket

app = FastAPI()

def get_user_from_db(user_id: str):
    # Pretend this is a real sync database call
    return {"id": user_id, "name": "Alice"}

@app.websocket("/ws/{user_id}")
async def ws(websocket: WebSocket, user_id: str):
    await websocket.accept()
    loop = asyncio.get_running_loop()  # get_event_loop() is deprecated inside coroutines
    # Offloads the blocking call to a thread pool — event loop stays free
    user = await loop.run_in_executor(None, get_user_from_db, user_id)
    await websocket.send_json(user)
```
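On Python 3.9+, `asyncio.to_thread` is a shorter spelling of the same thread-pool offload — a minimal sketch outside any web framework:

```python
import asyncio

def get_user_from_db(user_id: str):
    # Stand-in for a slow synchronous query
    return {"id": user_id, "name": "Alice"}

async def fetch_user(user_id: str):
    # Runs the sync function in the default thread pool,
    # keeping the event loop free for other connections
    return await asyncio.to_thread(get_user_from_db, user_id)

user = asyncio.run(fetch_user("42"))
# user == {"id": "42", "name": "Alice"}
```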
Better fix — switch to an async database library:

If you’re using SQLAlchemy, migrate your WebSocket handlers to `AsyncSession` with `async with` syntax. If you’re using raw queries, switch to `asyncpg` or `databases`. For HTTP calls inside WebSocket handlers, use `httpx.AsyncClient` instead of `requests`.
Mixing sync and async is one of the most common sources of subtle bugs in FastAPI — if you want a deeper dive on the async/sync topic more broadly, check out our post on fixing async/sync blocking issues in FastAPI.
## Cause #3: Missing Keepalive / Ping-Pong
The WebSocket spec includes a built-in ping-pong mechanism for exactly this purpose: keeping idle connections alive through NAT gateways, load balancers (AWS ALB, NGINX), and reverse proxies that close connections after a configurable idle timeout.
Most managed infrastructure has an idle timeout somewhere between 30 and 300 seconds. If your WebSocket connection is open but no messages are flowing, the connection silently dies — and your code never finds out until it tries to send.
Starlette doesn’t send pings itself; protocol-level pings depend on your ASGI server (Uvicorn exposes `--ws-ping-interval`, but support varies by WebSocket backend). An application-level heartbeat works regardless of the stack:
```python
import asyncio
from fastapi import FastAPI, WebSocket
from starlette.websockets import WebSocketDisconnect

app = FastAPI()
PING_INTERVAL = 20  # seconds — must be less than your proxy's idle timeout

@app.websocket("/ws")
async def ws_with_keepalive(websocket: WebSocket):
    await websocket.accept()

    async def send_pings():
        try:
            while True:
                await asyncio.sleep(PING_INTERVAL)
                await websocket.send_text("__ping__")
        except Exception:
            pass  # Connection already closed

    ping_task = asyncio.create_task(send_pings())
    try:
        while True:
            msg = await websocket.receive_text()
            if msg == "__pong__":
                continue  # Ignore pong responses from the client
            # Handle real messages here
            await websocket.send_text(f"echo: {msg}")
    except WebSocketDisconnect:
        pass
    finally:
        ping_task.cancel()
```
On the client side, respond to `__ping__` with `__pong__`. If you’d rather rely on protocol-level ping frames, configure them at the ASGI server — for example Uvicorn’s `--ws-ping-interval` and `--ws-ping-timeout` options — since Starlette doesn’t expose ping frames directly.
If you’re using NGINX as a reverse proxy, also set these directives to give WebSocket connections room to breathe:
```nginx
location /ws {
    proxy_pass http://localhost:8000;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_read_timeout 3600s;  # 1 hour — adjust to your needs
    proxy_send_timeout 3600s;
}
```
## Cause #4: Message Size Exceeding the Limit
Starlette itself doesn’t enforce a maximum WebSocket message size — the limit comes from the ASGI server or from your reverse proxy / load balancer. Uvicorn’s default is 16 MB (`--ws-max-size`, in bytes). Sending a message larger than the limit causes the server to close the connection with code `1009` (Message Too Big) — and if you’re not logging close codes, it just looks like a random drop.
To raise the limit, configure the ASGI server rather than FastAPI. With Uvicorn:

```shell
# Raise the limit to 64 MB (value is in bytes — adjust as appropriate)
uvicorn app.main:app --ws-max-size 67108864
```

There is no per-route size setting in FastAPI itself; use `receive_bytes` for binary payloads and `receive_text` for text, and keep individual messages under the server’s limit.
For large file transfers over WebSocket, don’t send the whole thing at once. Instead, break the payload into chunks:
```python
from fastapi import FastAPI, WebSocket
from starlette.websockets import WebSocketDisconnect

app = FastAPI()

@app.websocket("/ws/upload")
async def ws_chunked_upload(websocket: WebSocket):
    await websocket.accept()
    chunks = []
    try:
        while True:
            chunk = await websocket.receive_bytes()
            if chunk == b"__done__":
                break
            chunks.append(chunk)
        full_payload = b"".join(chunks)
        await websocket.send_text(f"received {len(full_payload)} bytes")
    except WebSocketDisconnect:
        pass
```
This avoids hitting any per-message size limits, and it lets you process or persist chunks as they arrive instead of buffering one enormous frame.
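On the sending side, a client can split the payload before transmission. A minimal sketch — `iter_chunks` is a hypothetical helper, not a FastAPI API:

```python
def iter_chunks(payload: bytes, chunk_size: int = 64 * 1024):
    """Yield successive chunk_size-byte slices of payload (hypothetical helper)."""
    for offset in range(0, len(payload), chunk_size):
        yield payload[offset:offset + chunk_size]

data = b"x" * 150_000
chunks = list(iter_chunks(data))
# Three chunks (65536 + 65536 + 18928 bytes) that reassemble to the original
assert b"".join(chunks) == data
```

Each chunk would then be sent with `websocket.send(...)` (or your client library’s equivalent), followed by the `b"__done__"` sentinel the server-side handler expects.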
## Cause #5: Mutable State in Broadcast Loops
Real-time apps often maintain a connected_clients set or list and broadcast messages to everyone. The classic bug: you’re iterating over connected_clients to broadcast, and a different coroutine modifies the set at the same time — either by a new connection being added or a disconnecting client being removed.
This raises a RuntimeError: Set changed size during iteration or, worse, silently skips clients:
```python
# Broken broadcast implementation
from fastapi import FastAPI, WebSocket
from starlette.websockets import WebSocketDisconnect

app = FastAPI()
connected: set[WebSocket] = set()

@app.websocket("/ws")
async def ws(websocket: WebSocket):
    await websocket.accept()
    connected.add(websocket)
    try:
        while True:
            msg = await websocket.receive_text()
            # BUG: iterating while other coroutines add/remove connections
            for client in connected:
                await client.send_text(msg)
    except WebSocketDisconnect:
        connected.discard(websocket)
```
The fix — copy the set before iterating:

```python
from fastapi import FastAPI, WebSocket
from starlette.websockets import WebSocketDisconnect
import logging

logger = logging.getLogger(__name__)
app = FastAPI()
connected: set[WebSocket] = set()

@app.websocket("/ws")
async def ws(websocket: WebSocket):
    await websocket.accept()
    connected.add(websocket)
    try:
        while True:
            msg = await websocket.receive_text()
            await broadcast(msg)
    except WebSocketDisconnect:
        connected.discard(websocket)

async def broadcast(message: str):
    # Snapshot the set so modifications during iteration don't cause issues
    dead_connections = set()
    for client in set(connected):  # iterate over a copy
        try:
            await client.send_text(message)
        except Exception:
            logger.warning("Could not send to client, marking for removal")
            dead_connections.add(client)
    # Mutate in place — `connected -= ...` would rebind the name locally
    connected.difference_update(dead_connections)
```
For production apps with many concurrent connections, consider a proper pub/sub backend — Redis (via `redis.asyncio`, which absorbed the old `aioredis` package) or a message queue — instead of in-memory sets. This also lets you scale across multiple Uvicorn workers.
## Still Not Working?
If none of the above causes match your situation, here are some less common culprits:
**Missing `await websocket.accept()`:** If you forget to call `accept()` before `receive_*()` or `send_*()`, the handshake never completes and the client’s connection attempt fails immediately. Always call `await websocket.accept()` as the first statement in the handler.

**TLS/SSL certificate issues:** If your app is behind a proxy that terminates SSL, make sure the proxy is configured to upgrade WebSocket connections (see the NGINX config above). A misconfigured proxy will return a `426 Upgrade Required` HTTP error instead of upgrading to WS.

**Client reconnect logic masking the real problem:** If your client automatically reconnects on disconnect, you might think the connection is “dropping” when in fact the client is cleanly reconnecting on a schedule. Check client-side logs and the disconnect code — code `1000` means a clean close, while `1006` means the connection was lost abnormally.
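Since close codes come up in every one of these diagnoses, it helps to log them in human-readable form. An illustrative helper — the `describe_close_code` name is made up, but the codes are from RFC 6455:

```python
# Common WebSocket close codes defined by RFC 6455
CLOSE_CODES = {
    1000: "Normal closure — client or server closed cleanly",
    1001: "Going away — e.g. browser tab closed or server shutting down",
    1006: "Abnormal closure — connection lost without a close frame",
    1009: "Message too big — payload exceeded the server's size limit",
    1011: "Internal error — server hit an unexpected condition",
}

def describe_close_code(code: int) -> str:
    # Hypothetical helper for readable disconnect logs
    return CLOSE_CODES.get(code, f"Unknown close code {code}")
```

Calling this from your `except WebSocketDisconnect as e:` blocks (`describe_close_code(e.code)`) makes log scanning much faster than matching bare integers.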
**Uvicorn worker restart:** If you’re running multiple Uvicorn workers (via Gunicorn with `UvicornWorker`) and a worker restarts due to an OOM kill or unhandled exception, every WebSocket connection on that worker drops. Use `--workers 1` during debugging to rule this out, and check for memory leaks in long-lived connection handlers.

**Dependency injection with `Depends()` in WebSocket routes:** Async dependencies that do I/O (like database sessions) work fine in WebSocket routes, but if a dependency raises an exception, it happens before `accept()` is called and the connection closes with no useful message. Wrap WebSocket dependencies in try/except and log failures explicitly.
## Summary Checklist

Run through this list when your FastAPI WebSocket connections drop unexpectedly:

- [ ] Catch `WebSocketDisconnect` explicitly in every WebSocket handler
- [ ] Use `await loop.run_in_executor()` for any synchronous blocking calls inside async handlers
- [ ] Implement application-level ping-pong with an interval shorter than your proxy’s idle timeout
- [ ] Set `proxy_read_timeout` and `proxy_send_timeout` in your NGINX config
- [ ] Iterate over `set(connected)` (a copy) in broadcast loops, not the live set
- [ ] Wrap individual `send_*` calls in try/except to handle per-client failures gracefully
- [ ] Add structured logging with the close code on every disconnect
- [ ] Chunk large payloads instead of sending them as a single message
- [ ] Verify `await websocket.accept()` is called before any receive/send
If you’re debugging a tricky crash and the error traceback is hard to read, use Debugly’s trace formatter to quickly parse and analyze Python tracebacks — it highlights the relevant frames and makes the root cause much easier to spot.
WebSocket bugs in FastAPI are frustrating precisely because they manifest as silent drops rather than loud exceptions. Once you’re catching WebSocketDisconnect, keeping the event loop unblocked, and handling shared state safely, the connections become stable. Good luck with the debugging — and if you hit another common FastAPI issue along the way, check out our guide on FastAPI BackgroundTasks failures for more async-related pitfalls to watch out for.