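Build a real-time chat interface with streaming responses, model switching, and MCP server integration. This example uses FastAPI and WebSockets to create a complete web application.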

How it works

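The server combines several patterns:
  1. WebSocket connections for real-time, bidirectional communication
  2. In-memory sessions that keep a separate conversation history for each client
  3. Streaming responses, displayed as tokens arrive
  4. Dynamic configuration for selecting the model and MCP servers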

Key concepts

WebSocket chat flow
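Reading the complete example further down this page, each turn appears to work like this: the browser sends one JSON object with `message`, `model`, and `mcp_servers`, and the server answers with a `start` event, zero or more `chunk` events, and then `done` (or `error` if something fails). The sketch below models that exchange in plain Python. The helper names `parse_request` and `events_for_reply` are hypothetical and exist only for illustration; they are not part of FastAPI or the Dedalus SDK.

```python
from typing import Iterable, Iterator

DEFAULT_MODEL = "openai/gpt-5.1"  # same fallback the complete example uses


def parse_request(data: dict) -> dict:
    """Normalize one client payload, applying the same defaults as the server."""
    return {
        "message": data.get("message", ""),
        "model": data.get("model", DEFAULT_MODEL),
        "mcp_servers": data.get("mcp_servers", []),
    }


def events_for_reply(pieces: Iterable[str]) -> Iterator[dict]:
    """Yield the server-to-client events for one assistant reply, in order."""
    yield {"type": "start"}
    for piece in pieces:
        if piece:  # empty deltas are skipped, as in the streaming loop
            yield {"type": "chunk", "content": piece}
    yield {"type": "done"}


# One simulated turn: a request with only a message, and a three-piece reply.
request = parse_request({"message": "Hello"})
events = list(events_for_reply(["Hi", "", " there"]))
```

The client only needs to switch on the `type` field, which is what the `ws.onmessage` handler in the complete example does.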

Session management

Each WebSocket connection carries a session ID, which keys that client's own conversation history:
sessions: dict[str, list[dict]] = {}

# Inside the WebSocket handler
if session_id not in sessions:
    sessions[session_id] = []

# Add the user message
sessions[session_id].append({"role": "user", "content": message})

# After the response, save the assistant message
sessions[session_id].append({"role": "assistant", "content": full_response})
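As a small illustration of why the history is keyed by session ID, the sketch below wraps the same dict in a helper and shows that two sessions stay isolated. The function name `append_message` and the session IDs are hypothetical; the message shape matches the snippet above.

```python
sessions: dict[str, list[dict]] = {}


def append_message(session_id: str, role: str, content: str) -> list[dict]:
    """Append one message to a session's history, creating the session if needed."""
    history = sessions.setdefault(session_id, [])
    history.append({"role": role, "content": content})
    return history


# Two clients with different session IDs never see each other's messages.
append_message("session_a", "user", "What is MCP?")
append_message("session_a", "assistant", "A protocol for connecting tools to models.")
append_message("session_b", "user", "Hello")
```

Because the dict lives in process memory, all histories are lost on restart and are not shared between worker processes.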

Streaming to WebSocket

The Runner's streaming response is forwarded to the client chunk by chunk:
response_stream = runner.run(messages=history, model=model, stream=True)

async for chunk in response_stream:
    if hasattr(chunk, "choices") and chunk.choices:
        delta = chunk.choices[0].delta
        if hasattr(delta, "content") and delta.content:
            await websocket.send_json({
                "type": "chunk",
                "content": delta.content
            })
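To try the forwarding loop without an API key or a browser, it can be run against stand-ins. In the sketch below, `FakeWebSocket`, `make_chunk`, `fake_stream`, and `forward` are all hypothetical names: the fake chunks only mimic the `choices[0].delta.content` shape that the loop above inspects, and say nothing about what else the real stream may contain.

```python
import asyncio
from types import SimpleNamespace


class FakeWebSocket:
    """Stand-in for a FastAPI WebSocket that records what would be sent."""

    def __init__(self) -> None:
        self.sent: list[dict] = []

    async def send_json(self, data: dict) -> None:
        self.sent.append(data)


def make_chunk(content):
    """Build an object shaped like the chunks the loop above inspects."""
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


async def fake_stream():
    """Hypothetical stream: text, an empty delta, a chunk with no choices, text."""
    yield make_chunk("Hel")
    yield make_chunk(None)
    yield SimpleNamespace(choices=[])
    yield make_chunk("lo")


async def forward(stream, websocket) -> str:
    """Forward text deltas to the client and return the accumulated reply."""
    full_response = ""
    async for chunk in stream:
        if hasattr(chunk, "choices") and chunk.choices:
            delta = chunk.choices[0].delta
            if hasattr(delta, "content") and delta.content:
                full_response += delta.content
                await websocket.send_json({"type": "chunk", "content": delta.content})
    return full_response


ws = FakeWebSocket()
reply = asyncio.run(forward(fake_stream(), ws))
```

Only chunks that carry text reach the client; the two guards skip the empty delta and the chunk without choices, and the accumulated string is what gets saved as the assistant message.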

Complete example

A full-stack chat application with a minimal UI:
"""
FastAPI chat server with UI
===========================
A full-stack chat application with model and MCP server selection.

Run: uv run --python 3.13 cookbook/02_chat_server.py
Then open: http://localhost:8000
"""

import asyncio
import json
from contextlib import asynccontextmanager

from dotenv import load_dotenv
from fastapi import FastAPI, WebSocket, WebSocketDisconnect
from fastapi.responses import HTMLResponse
import uvicorn

from dedalus_labs import AsyncDedalus, DedalusRunner

load_dotenv()

# In-memory session storage (use Redis/DB in production)
sessions: dict[str, list[dict]] = {}


@asynccontextmanager
async def lifespan(app: FastAPI):
    print("\n" + "=" * 50)
    print("  Dedalus Chat Server")
    print("  Open http://localhost:8000")
    print("=" * 50 + "\n")
    yield


app = FastAPI(lifespan=lifespan)


HTML_PAGE = """
<!DOCTYPE html>
<html>
<head>
    <title>Dedalus Chat</title>
    <style>
        * { box-sizing: border-box; margin: 0; padding: 0; }
        body {
            font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
            background: #fff; color: #111; height: 100vh;
            display: flex; flex-direction: column;
        }
        .header {
            padding: 12px 24px;
            border-bottom: 1px solid #e5e5e5;
            display: flex; gap: 24px; align-items: center;
        }
        .header h1 { font-size: 16px; font-weight: 600; margin-right: auto; }
        .header select, .header input {
            padding: 8px 12px; border-radius: 6px; border: 1px solid #d1d1d1;
            background: #fff; color: #111; font-size: 13px;
        }
        .header select:focus, .header input:focus { outline: none; border-color: #111; }
        .config { display: flex; align-items: center; gap: 8px; }
        .config label { font-size: 12px; color: #666; }

        .chat-container {
            flex: 1; overflow-y: auto; padding: 24px;
            max-width: 800px; margin: 0 auto; width: 100%;
        }
        .message { margin-bottom: 24px; line-height: 1.6; }
        .message .role { font-size: 12px; font-weight: 600; margin-bottom: 4px; text-transform: uppercase; color: #666; }
        .message .content { white-space: pre-wrap; }
        .message.user .role { color: #111; }
        .message.assistant .content { color: #333; }
        .message.system { text-align: center; color: #999; font-size: 13px; }

        .input-container {
            padding: 16px 24px; border-top: 1px solid #e5e5e5;
            max-width: 800px; margin: 0 auto; width: 100%;
            display: flex; gap: 12px;
        }
        .input-container input {
            flex: 1; padding: 12px 16px; border-radius: 8px;
            border: 1px solid #d1d1d1; font-size: 15px;
        }
        .input-container input:focus { outline: none; border-color: #111; }
        .input-container button {
            padding: 12px 24px; border-radius: 8px; border: 1px solid #111;
            background: #111; color: #fff; font-size: 14px;
            cursor: pointer; font-weight: 500;
        }
        .input-container button:hover { background: #333; }
        .input-container button:disabled { background: #999; border-color: #999; cursor: not-allowed; }
        .typing .content::after { content: '...'; animation: dots 1s infinite; }
        @keyframes dots { 0%,20%{content:'.'} 40%{content:'..'} 60%,100%{content:'...'} }
    </style>
</head>
<body>
    <div class="header">
        <h1>Dedalus</h1>
        <div class="config">
            <label>Model</label>
            <select id="model">
                <option value="openai/gpt-5.1">GPT-5.1</option>
                <option value="anthropic/claude-opus-4-5-20251101">Opus 4.5</option>
                <option value="google/gemini-3-pro-preview">Gemini 3</option>
            </select>
        </div>
        <div class="config">
            <label>MCP</label>
            <input type="text" id="mcp" placeholder="server slug or URL" style="width:200px">
        </div>
    </div>

    <div class="chat-container" id="chat"></div>

    <div class="input-container">
        <input type="text" id="input" placeholder="Message..." autofocus>
        <button id="send">Send</button>
    </div>

    <script>
        const chat = document.getElementById('chat');
        const input = document.getElementById('input');
        const sendBtn = document.getElementById('send');
        const modelSelect = document.getElementById('model');
        const mcpInput = document.getElementById('mcp');

        let ws = null;
        let sessionId = 'session_' + Date.now();

        function connect() {
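            // Use wss:// when the page itself is served over HTTPS
            const scheme = location.protocol === 'https:' ? 'wss' : 'ws';
            ws = new WebSocket(`${scheme}://${location.host}/ws/${sessionId}`);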

            ws.onmessage = (event) => {
                const data = JSON.parse(event.data);

                if (data.type === 'start') {
                    const msg = document.createElement('div');
                    msg.className = 'message assistant typing';
                    msg.id = 'typing';
                    msg.innerHTML = '<div class="role">Assistant</div><div class="content"></div>';
                    chat.appendChild(msg);
                } else if (data.type === 'chunk') {
                    const typing = document.getElementById('typing');
                    if (typing) {
                        typing.classList.remove('typing');
                        typing.querySelector('.content').textContent += data.content;
                    }
                } else if (data.type === 'done') {
                    const typing = document.getElementById('typing');
                    if (typing) typing.removeAttribute('id');
                    sendBtn.disabled = false;
                    input.focus();
                } else if (data.type === 'error') {
                    addMessage('Error: ' + data.message, 'system');
                    sendBtn.disabled = false;
                }
                chat.scrollTop = chat.scrollHeight;
            };

            ws.onclose = () => setTimeout(connect, 1000);
        }

        function addMessage(text, role) {
            const msg = document.createElement('div');
            msg.className = `message ${role}`;
            if (role === 'system') {
                msg.textContent = text;
            } else {
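                msg.innerHTML = '<div class="role"></div><div class="content"></div>';
                msg.querySelector('.role').textContent = role === 'user' ? 'You' : 'Assistant';
                // textContent keeps user input from being interpreted as HTML
                msg.querySelector('.content').textContent = text;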
            }
            chat.appendChild(msg);
            chat.scrollTop = chat.scrollHeight;
        }

        function send() {
            const text = input.value.trim();
            if (!text || !ws || ws.readyState !== WebSocket.OPEN) return;

            addMessage(text, 'user');
            input.value = '';
            sendBtn.disabled = true;

            ws.send(JSON.stringify({
                message: text,
                model: modelSelect.value,
                mcp_servers: mcpInput.value ? [mcpInput.value] : []
            }));
        }

        sendBtn.onclick = send;
        input.onkeydown = (e) => { if (e.key === 'Enter') send(); };
        connect();
    </script>
</body>
</html>
"""


@app.get("/")
async def get_ui():
    return HTMLResponse(HTML_PAGE)


@app.websocket("/ws/{session_id}")
async def websocket_chat(websocket: WebSocket, session_id: str):
    await websocket.accept()

    if session_id not in sessions:
        sessions[session_id] = []

    client = AsyncDedalus()
    runner = DedalusRunner(client)

    try:
        while True:
            data = await websocket.receive_json()
            message = data.get("message", "")
            model = data.get("model", "openai/gpt-5.1")
            mcp_servers = data.get("mcp_servers", [])

            await websocket.send_json({"type": "start"})

            try:
                # Append user message to history first
                sessions[session_id].append({"role": "user", "content": message})
                history = sessions[session_id]

                kwargs = {
                    "messages": history,
                    "model": model,
                    "stream": True,
                }
                if mcp_servers:
                    kwargs["mcp_servers"] = mcp_servers

                response_stream = runner.run(**kwargs)

                full_response = ""
                async for chunk in response_stream:
                    if hasattr(chunk, "choices") and chunk.choices:
                        delta = chunk.choices[0].delta
                        if hasattr(delta, "content") and delta.content:
                            full_response += delta.content
                            await websocket.send_json({
                                "type": "chunk",
                                "content": delta.content
                            })

                # Save the assistant response to the session
                sessions[session_id].append({"role": "assistant", "content": full_response})

                await websocket.send_json({"type": "done"})

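            except WebSocketDisconnect:
                # The client left mid-response; let the outer handler deal with it
                raise
            except Exception as e:
                await websocket.send_json({"type": "error", "message": str(e)})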

    except WebSocketDisconnect:
        pass


if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)

Running the server

# Install dependencies
pip install fastapi uvicorn websockets python-dotenv dedalus-labs

# Run the server
python chat_server.py
Then open http://localhost:8000 in your browser.

Production considerations

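| Concern | Solution |
| --- | --- |
| Session storage | Replace the in-memory dict with Redis or PostgreSQL |
| Authentication | Add JWT/OAuth middleware to the WebSocket handshake |
| Rate limiting | Implement per-user request rate limiting |
| Error handling | Add retry logic and graceful degradation |
| Scalability | Use Redis pub/sub to support multi-instance deployments |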
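
As one possible starting point for per-user rate limiting, the sketch below implements a sliding-window limiter with the standard library only. The class name `SlidingWindowLimiter`, the limit of 2 requests per 60 seconds, and the user IDs are all assumptions chosen for illustration. The state lives in process memory, so this only works for a single instance; a multi-instance deployment would need a shared store.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each user."""

    def __init__(self, limit: int, window: float) -> None:
        self.limit = limit
        self.window = window
        self._hits: dict[str, deque] = defaultdict(deque)

    def allow(self, user_id: str, now: Optional[float] = None) -> bool:
        """Record a request and return True, or return False if over the limit."""
        now = time.monotonic() if now is None else now
        hits = self._hits[user_id]
        # Drop timestamps that have fallen out of the window
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


# Explicit timestamps make the behavior reproducible.
limiter = SlidingWindowLimiter(limit=2, window=60.0)
results = [limiter.allow("alice", now=t) for t in (0.0, 1.0, 2.0, 61.0)]
other_user = limiter.allow("bob", now=2.0)
```

In the WebSocket handler, a check like `limiter.allow(user_id)` could run before each model call, sending an `error` event instead of starting a response when it returns False.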