
# Streaming

> Display responses as they're generated

Streaming delivers output incrementally, as the model generates it, instead of making users wait for the complete response. Users see progress immediately, which matters most for long outputs and interactive applications.

## Stream in one line

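Set `stream=True` (`stream: true` in TypeScript) on `runner.run` so users see progress as the agent works. The Python example prints the stream with the `stream_async` helper; the TypeScript example iterates over the chunks directly.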

<CodeGroup>
  ```python Python
  import asyncio
  from dedalus_labs import AsyncDedalus, DedalusRunner
  from dedalus_labs.utils.stream import stream_async
  from dotenv import load_dotenv

  load_dotenv()

  async def main():
      client = AsyncDedalus()
      runner = DedalusRunner(client)

      stream = runner.run(
          input="Find me the nearest basketball games in January in San Francisco (stream your work).",
          model="anthropic/claude-opus-4-5",
          mcp_servers=["tsion/exa"],  # Web search via Exa
          stream=True,
      )

      await stream_async(stream)

  if __name__ == "__main__":
      asyncio.run(main())
  ```

  ```typescript TypeScript
  import Dedalus from "dedalus-labs";
  import { DedalusRunner } from "dedalus-labs";

  const client = new Dedalus();
  const runner = new DedalusRunner(client, true);

  async function main() {
  	const result = await runner.run({
  		input: "Find me the nearest basketball games in January in San Francisco (stream your work).",
  		model: "anthropic/claude-opus-4-5",
  		mcpServers: ["tsion/exa"], // Web search via Exa
  		stream: true,
  	});

  	if (Symbol.asyncIterator in result) {
  		for await (const chunk of result) {
  			if (chunk.choices?.[0]?.delta?.content) {
  				process.stdout.write(chunk.choices[0].delta.content);
  			}
  		}
  	}
  }

  main();
  ```
</CodeGroup>
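
The TypeScript example iterates over chunks by hand, while the Python example delegates printing to `stream_async`. If you need the text itself in Python (for example, to render it in your own UI), a manual loop might look like the sketch below. It assumes Python chunks expose the same `choices[0].delta.content` shape as the TypeScript chunks. `make_chunk`, `fake_stream`, and `collect_text` are hypothetical names, and `fake_stream` only stands in for the stream returned by `runner.run(..., stream=True)`.

```python
import asyncio
from types import SimpleNamespace


def make_chunk(text):
    """Build a stand-in chunk shaped like chunk.choices[0].delta.content."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


async def fake_stream():
    """Hypothetical stand-in for the stream from runner.run(..., stream=True)."""
    for piece in ["Warriors ", "vs. ", "Lakers", None]:
        await asyncio.sleep(0)  # yield control, as a network read would
        yield make_chunk(piece)


async def collect_text(stream, on_text=None):
    """Forward each text delta to on_text and return the accumulated text."""
    parts = []
    async for chunk in stream:
        if not chunk.choices:
            continue  # assumed: some chunks may carry no choices
        text = chunk.choices[0].delta.content
        if text:  # skip empty or None deltas
            parts.append(text)
            if on_text is not None:
                on_text(text)
    return "".join(parts)


if __name__ == "__main__":
    # Print deltas as they arrive, then finish the line.
    asyncio.run(collect_text(fake_stream(), lambda t: print(t, end="", flush=True)))
    print()
```

Passing a callback keeps rendering separate from accumulation, so the same loop can feed a terminal, a web socket, or a test.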

## Streaming with tools

Streaming works with tool-calling workflows. You can stream while the agent calls **local tools**, **MCP servers**, or both.

<CodeGroup>
  ```python Python
  import asyncio

  from dedalus_labs import AsyncDedalus, DedalusRunner
  from dedalus_labs.utils.stream import stream_async
  from dotenv import load_dotenv

  load_dotenv()

  def summarize_headlines(headlines: list[str]) -> str:
      """Format headlines as a short bullet list."""
      return "\n".join(f"• {h}" for h in headlines[:3])

  async def main():
      client = AsyncDedalus()
      runner = DedalusRunner(client)

      stream = runner.run(
          input=(
              "Search for AI news. Extract 3 headlines. "
              "Then call summarize_headlines(headlines) and stream your final answer."
          ),
          model="openai/gpt-5.2",
          mcp_servers=["windsor/brave-search-mcp"],  # Web search via Brave Search MCP
          tools=[summarize_headlines],
          stream=True,
      )

      await stream_async(stream)

  if __name__ == "__main__":
      asyncio.run(main())
  ```

  ```typescript TypeScript
  import Dedalus from "dedalus-labs";
  import { DedalusRunner } from "dedalus-labs";

  function summarizeHeadlines(headlines: string[]): string {
  	return headlines
  		.slice(0, 3)
  		.map((h) => `• ${h}`)
  		.join("\n");
  }

  const client = new Dedalus();
  const runner = new DedalusRunner(client, true);

  async function main() {
  	const result = await runner.run({
  		input:
  			"Search for AI news. Extract 3 headlines. Then call summarizeHeadlines(headlines) and stream your final answer.",
  		model: "openai/gpt-5.2",
  		mcpServers: ["windsor/brave-search-mcp"], // Web search via Brave Search MCP
  		tools: [summarizeHeadlines],
  		stream: true,
  	});

  	if (Symbol.asyncIterator in result) {
  		for await (const chunk of result) {
  			if (chunk.choices?.[0]?.delta?.content) {
  				process.stdout.write(chunk.choices[0].delta.content);
  			}
  		}
  	}
  }

  main();
  ```
</CodeGroup>
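
If you render the stream yourself, you may want to show a status line when a tool call starts. The sketch below is one way to separate status events from text. It rests on an unverified assumption: that tool calls surface on the chunk as `delta.tool_calls[i].function.name`, alongside the `delta.content` field used in the TypeScript examples. Check the chunk shape your SDK version actually emits before relying on it. `text_chunk`, `tool_chunk`, and `render_events` are hypothetical helpers.

```python
from types import SimpleNamespace


def text_chunk(text):
    """Stand-in chunk that carries only a text delta."""
    delta = SimpleNamespace(content=text, tool_calls=None)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


def tool_chunk(name):
    """Stand-in chunk that carries a tool call (assumed shape)."""
    call = SimpleNamespace(function=SimpleNamespace(name=name))
    delta = SimpleNamespace(content=None, tool_calls=[call])
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


def render_events(chunks):
    """Turn chunks into display events: ("status", ...) or ("text", ...)."""
    events = []
    for chunk in chunks:
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta
        for call in getattr(delta, "tool_calls", None) or []:
            name = getattr(call.function, "name", None)
            if name:  # skip fragments that carry no tool name
                events.append(("status", f"Calling {name}..."))
        if getattr(delta, "content", None):
            events.append(("text", delta.content))
    return events


if __name__ == "__main__":
    demo = [tool_chunk("brave_search"), text_chunk("Top headline"), text_chunk(None)]
    for kind, payload in render_events(demo):
        print(f"[{kind}] {payload}")
```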

## Compare: non-streaming vs streaming (same scenario)

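Each pair of snippets below runs the same scenario. The streaming version sets `stream=True` (`stream: true` in TypeScript) **and iterates over the stream**; the non-streaming version waits for the run to finish and reads the final output.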

<Note>
  In Python, **non-streaming** refers to `stream=False`, not “sync”. If you use `AsyncDedalus`,
  you’ll still write async code and use `asyncio.run(...)`. If you prefer fully synchronous code,
  use the `Dedalus` client (example below).
</Note>

### Python

<CodeGroup>
  ```python Non-streaming (AsyncDedalus)
  import asyncio
  from dedalus_labs import AsyncDedalus, DedalusRunner
  from dotenv import load_dotenv

  load_dotenv()

  async def main():
      client = AsyncDedalus()
      runner = DedalusRunner(client)

      result = await runner.run(
          input="Find me the nearest basketball games in January in San Francisco.",
          model="anthropic/claude-opus-4-5",
          mcp_servers=["tsion/exa"],  # Web search via Exa
      )

      # You only see output after the full run completes.
      print(result.final_output)

  if __name__ == "__main__":
      asyncio.run(main())
  ```

  ```python Streaming (AsyncDedalus)
  import asyncio
  from dedalus_labs import AsyncDedalus, DedalusRunner
  from dedalus_labs.utils.stream import stream_async
  from dotenv import load_dotenv

  load_dotenv()

  async def main():
      client = AsyncDedalus()
      runner = DedalusRunner(client)

      stream = runner.run(
          input="Find me the nearest basketball games in January in San Francisco.",
          model="anthropic/claude-opus-4-5",
          mcp_servers=["tsion/exa"],  # Web search via Exa
          stream=True,
      )

      # You see output as the model generates it.
      await stream_async(stream)

  if __name__ == "__main__":
      asyncio.run(main())
  ```
</CodeGroup>

### Python (sync client)

<CodeGroup>
  ```python Non-streaming (Dedalus)
  from dedalus_labs import Dedalus, DedalusRunner
  from dotenv import load_dotenv

  load_dotenv()

  def main():
      client = Dedalus()
      runner = DedalusRunner(client)

      result = runner.run(
          input="Find me the nearest basketball games in January in San Francisco.",
          model="anthropic/claude-opus-4-5",
          mcp_servers=["tsion/exa"],  # Web search via Exa
      )

      print(result.final_output)

  if __name__ == "__main__":
      main()
  ```

  ```python Streaming (Dedalus)
  from dedalus_labs import Dedalus, DedalusRunner
  from dedalus_labs.utils.stream import stream_sync
  from dotenv import load_dotenv

  load_dotenv()

  def main():
      client = Dedalus()
      runner = DedalusRunner(client)

      stream = runner.run(
          input="Find me the nearest basketball games in January in San Francisco.",
          model="anthropic/claude-opus-4-5",
          mcp_servers=["tsion/exa"],  # Web search via Exa
          stream=True,
      )

      stream_sync(stream)

  if __name__ == "__main__":
      main()
  ```
</CodeGroup>

### TypeScript

<CodeGroup>
  ```typescript Non-streaming
  import Dedalus from "dedalus-labs";
  import { DedalusRunner } from "dedalus-labs";

  const client = new Dedalus();
  const runner = new DedalusRunner(client, true);

  async function main() {
  	const result = await runner.run({
  		input: "Find me the nearest basketball games in January in San Francisco.",
  		model: "anthropic/claude-opus-4-5",
  		mcpServers: ["tsion/exa"], // Web search via Exa
  	});

  	console.log(result.finalOutput);
  }

  main();
  ```

  ```typescript Streaming
  import Dedalus from "dedalus-labs";
  import { DedalusRunner } from "dedalus-labs";

  const client = new Dedalus();
  const runner = new DedalusRunner(client, true);

  async function main() {
  	const result = await runner.run({
  		input: "Find me the nearest basketball games in January in San Francisco.",
  		model: "anthropic/claude-opus-4-5",
  		mcpServers: ["tsion/exa"], // Web search via Exa
  		stream: true,
  	});

  	if (Symbol.asyncIterator in result) {
  		for await (const chunk of result) {
  			if (chunk.choices?.[0]?.delta?.content) {
  				process.stdout.write(chunk.choices[0].delta.content);
  			}
  		}
  	}
  }

  main();
  ```
</CodeGroup>

## How the user experience differs

* **Progressive rendering**: you can display text as it arrives (“typing”), instead of waiting for a complete response.
* **Visible work**: in tool/MCP workflows, you can show status updates (e.g., “Searching Exa…”) while the agent is calling tools.
* **Interruptibility**: you can stop early (client-side) if the user already has what they need, instead of paying for a full completion.
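
Stopping early on the client side can be sketched as follows. The example uses a plain Python async generator as a stand-in for the SDK stream, so `aclose()` is the standard async-generator method, not a documented SDK call; a real stream object may expose a different way to close itself. `fake_tokens` and `read_until` are hypothetical names.

```python
import asyncio


async def fake_tokens(log):
    """Hypothetical stand-in for a text stream; records cleanup in log."""
    try:
        for word in ["one ", "two ", "three ", "four ", "five "]:
            await asyncio.sleep(0)  # yield control, as a network read would
            yield word
    finally:
        log.append("closed")


async def read_until(stream, max_chars):
    """Consume text until max_chars is reached, then close the stream."""
    received = ""
    try:
        async for text in stream:
            received += text
            if len(received) >= max_chars:
                break  # stop reading; later items are never requested
    finally:
        await stream.aclose()  # run the generator's cleanup promptly
    return received


if __name__ == "__main__":
    log = []
    print(asyncio.run(read_until(fake_tokens(log), 8)))
    print(log)
```

Closing in a `finally` block means cleanup runs whether the loop ends by `break`, by exhaustion, or by an exception.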

## When to stream

Stream when:

* Building chat interfaces where perceived latency matters
* Generating long-form content (articles, code, analysis)
* Running in terminals or logs where progress feedback helps

Don’t stream when:

* You need to parse the complete response before displaying it
* You use structured outputs with `.parse()`
* Responses are already fast enough that incremental display adds little
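
The first two cases share a constraint: a partial response is usually not parseable. The sketch below illustrates this with JSON split across hypothetical stream fragments. The fragment contents and the helper names `parses` and `parse_when_complete` are made up for illustration and are not part of the SDK.

```python
import json


def parses(text):
    """Return True if text is a complete JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def parse_when_complete(fragments):
    """Join streamed fragments and parse JSON only after the last one."""
    buffer = ""
    for fragment in fragments:
        buffer += fragment
    return json.loads(buffer)


# Hypothetical fragments of one JSON response, split mid-string.
fragments = ['{"team": "War', 'riors", "games"', ': 3}']

if __name__ == "__main__":
    print(parses(fragments[0]))            # the first fragment alone is not valid JSON
    print(parse_when_complete(fragments))  # the joined fragments parse cleanly
```

If you have to buffer everything before you can use it, streaming gives no display benefit, and a non-streaming call is simpler.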

## Next steps

* **Route across models**: [Handoffs](/sdk/handoffs) — Use fast/strong models by phase
* **Add images last**: [Images & Vision](/sdk/images) — Add multimodality when your text workflow is solid
* **See patterns**: [Use Cases](/sdk/use-cases/web-search-agent) — More streaming agent examples

