Create a chat completion. Supports streaming, tool calling, and MCP server integration across all providers.
API key authentication using Bearer token
Chat completion request (OpenAI-compatible).
Stateless chat completion endpoint. For stateful conversations with threads, use the Responses API instead.
Model identifier string (e.g., 'openai/gpt-5', 'anthropic/claude-3-5-sonnet').
"openai/gpt-4o"
Conversation history. Accepts either a list of message objects or a string, which is treated as a single user message.
[
{
"content": "Hello, how are you?",
"role": "user"
}
]
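For example, a minimal request through the OpenAI Python SDK, which is wire-compatible with this endpoint (the base URL below is a placeholder; substitute your deployment's URL and API key):

from openai import OpenAI

# Placeholder base URL and key; any OpenAI-compatible client works here.
client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)
print(response.choices[0].message.content)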
Convenience alias for Responses-style input. Used when messages is omitted to provide the user prompt directly.
"Translate this paragraph into French."
What sampling temperature to use, between 0 and 2. Higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic. We generally recommend altering this or 'top_p' but not both.
0 <= x <= 2
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or 'temperature' but not both.
0 <= x <= 1
0.1
The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API. This value is now deprecated in favor of 'max_completion_tokens' and is not compatible with o-series models.
x >= 1
100
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
-2 <= x <= 2
-0.5
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
-2 <= x <= 2
-0.5
Modify the likelihood of specified tokens appearing in the completion. Accepts a JSON object mapping token IDs (as strings) to bias values from -100 to 100. The bias is added to the logits before sampling; values between -1 and 1 nudge selection probability, while values like -100 or 100 effectively ban or require a token.
{ "50256": -100 }
Not supported with latest reasoning models 'o3' and 'o4-mini'.
Up to 4 sequences where the API will stop generating further tokens; the returned text will not contain the stop sequence.
["\n", "END"]
Extended thinking configuration (Anthropic pass-through). Fields: 'type' (e.g., 'enabled') and 'budget_tokens' (token budget for the model's internal reasoning).
{ "budget_tokens": 2048, "type": "enabled" }
Top-k sampling. Anthropic: pass-through. Google: injected into generationConfig.topK.
x >= 0
40
System prompt/instructions. Anthropic: pass-through. Google: converted to systemInstruction. OpenAI: extracted from messages.
"You are a helpful assistant."
Convenience alias for Responses-style instructions. Takes precedence over system and over system-role messages when provided.
"You are a concise assistant."
Google generationConfig object. Merged with auto-generated config. Use for Google-specific params (candidateCount, responseMimeType, etc.).
{
"candidateCount": 2,
"responseMimeType": "application/json"
}
Google safety settings (harm categories and thresholds).
[
{
"category": "HARM_CATEGORY_HARASSMENT",
"threshold": "BLOCK_NONE"
}
]
Google tool configuration (function calling mode, etc.).
{
"function_calling_config": { "mode": "ANY" }
}
Google-only flag to disable the SDK's automatic function execution. When true, the model returns function calls for the client to execute manually.
true
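These Google-specific fields are not part of the standard OpenAI client signature, so they can be forwarded via extra_body. The snake_case field names below are inferred from the descriptions above and should be verified against the request schema; the model identifier is illustrative:

response = client.chat.completions.create(
    model="google/gemini-1.5-pro",  # illustrative model identifier
    messages=[{"role": "user", "content": "Return a JSON summary of this text."}],
    extra_body={
        # Assumed field names; confirm in the request schema.
        "generation_config": {"candidateCount": 2, "responseMimeType": "application/json"},
        "safety_settings": [
            {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_NONE"}
        ],
        "tool_config": {"function_calling_config": {"mode": "ANY"}},
    },
)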
If specified, system will make a best effort to sample deterministically. Determinism is not guaranteed for the same seed across different models or API versions.
42
Stable identifier for your end-users. Helps OpenAI detect and prevent abuse and may boost cache hit rates. This field is being replaced by 'safety_identifier' and 'prompt_cache_key'.
"user-123"
How many chat completion choices to generate for each input message. Keep 'n' as 1 to minimize costs.
1 <= x <= 128
1
If true, the model response data is streamed to the client as it is generated using Server-Sent Events.
true
false
Options for streaming responses. Only set when 'stream' is true (supports 'include_usage' and 'include_obfuscation').
{ "include_usage": true }
An object specifying the format that the model must output. Use {'type': 'json_schema', 'json_schema': {...}} for structured outputs or {'type': 'json_object'} for the legacy JSON mode. Currently only OpenAI-prefixed models honour this field; Anthropic and Google requests will return an invalid_request_error if it is supplied.
{ "type": "text" }
A list of tools the model may call. Supports OpenAI function tools and custom tools; use 'mcp_servers' for Dedalus-managed server-side tools.
[
{
"function": {
"description": "Get current weather for a location",
"name": "get_weather",
"parameters": {
"properties": {
"location": {
"description": "City name",
"type": "string"
}
},
"required": ["location"],
"type": "object"
}
},
"type": "function"
}
]
Controls which (if any) tool is called by the model. 'none' stops tool calling, 'auto' lets the model decide, and 'required' forces at least one tool invocation. Specific tool payloads force that tool.
"auto"
Whether to enable parallel function calling during tool use.
true
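Putting tools, tool_choice, and client-side execution together, a round trip might look like the following sketch, with tools as defined in the example above and get_weather standing in for your own implementation:

import json

def get_weather(location: str) -> dict:
    # Stand-in implementation; replace with a real lookup.
    return {"location": location, "forecast": "sunny", "temp_c": 21}

messages = [{"role": "user", "content": "What's the weather in Paris?"}]
response = client.chat.completions.create(
    model="openai/gpt-4o", messages=messages, tools=tools, tool_choice="auto"
)
message = response.choices[0].message
if message.tool_calls:
    messages.append(message)  # keep the assistant turn that requested the calls
    for call in message.tool_calls:
        result = get_weather(**json.loads(call.function.arguments))
        messages.append(
            {"role": "tool", "tool_call_id": call.id, "content": json.dumps(result)}
        )
    # A second request lets the model read the tool results and answer.
    final = client.chat.completions.create(
        model="openai/gpt-4o", messages=messages, tools=tools
    )
    print(final.choices[0].message.content)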
Deprecated in favor of 'tools'. Legacy list of function definitions the model may generate JSON inputs for.
Deprecated in favor of 'tool_choice'. Controls which function is called by the model (none, auto, or specific name).
Whether to return log probabilities of the output tokens. If true, returns the log probabilities for each token in the response content.
true
An integer between 0 and 20 specifying how many of the most likely tokens to return at each position, with log probabilities. Requires 'logprobs' to be true.
0 <= x <= 20
5
An upper bound for the number of tokens that can be generated for a completion, including visible output and reasoning tokens.
x >= 1
1000
Constrains effort on reasoning for supported reasoning models. Higher values use more compute, potentially improving reasoning quality at the cost of latency and tokens.
low, medium, high
"medium"
Parameters for audio output. Required when requesting audio responses (for example, modalities including 'audio').
{ "format": "mp3", "voice": "alloy" }
Output types you would like the model to generate. Most models default to ['text']; some support ['text', 'audio'].
["text"]
Configuration for predicted outputs. Improves response times when you already know large portions of the response content.
Set of up to 16 key-value string pairs that can be attached to the request for structured metadata.
{ "session": "abc", "user_id": "123" }
Whether to store the output of this chat completion request for OpenAI model distillation or eval products. Image inputs over 8MB are dropped if storage is enabled.
true
Specifies the processing tier used for the request. 'auto' uses project defaults, while 'default' forces standard pricing and performance.
auto, default
"auto"
Used by OpenAI to cache responses for similar requests and optimize cache hit rates. Replaces the legacy 'user' field for caching.
Stable identifier used to help detect users who might violate OpenAI usage policies. Consider hashing end-user identifiers before sending.
Constrains the verbosity of the model's response. Lower values produce concise answers, higher values allow more detail.
low, medium, high
Configuration for OpenAI's web search tool. Learn more at https://platform.openai.com/docs/guides/tools-web-search?api-mode=chat.
xAI-specific parameter for configuring web search data acquisition. If not set, no data will be acquired by the model.
xAI-specific parameter. If set to true, the request returns a request_id for async completion retrieval via GET /v1/chat/deferred-completion/{request_id}.
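A rough sketch of the deferred flow over plain HTTP. Only the retrieval path is documented above; the request flag name 'deferred' is an assumption based on xAI's API, and the model identifier is illustrative:

import requests

BASE = "https://api.example.com/v1"  # placeholder base URL
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

# 'deferred' is an assumed flag name; verify against the request schema.
submitted = requests.post(
    f"{BASE}/chat/completions",
    headers=HEADERS,
    json={
        "model": "xai/grok-3",  # illustrative model identifier
        "messages": [{"role": "user", "content": "Hello"}],
        "deferred": True,
    },
)
request_id = submitted.json()["request_id"]

# Poll the documented retrieval endpoint until the completion is ready.
result = requests.get(f"{BASE}/chat/deferred-completion/{request_id}", headers=HEADERS)
print(result.json())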
MCP (Model Context Protocol) server addresses to make available for server-side tool execution. Entries can be URLs (e.g., 'https://mcp.example.com'), slugs (e.g., 'dedalus-labs/brave-search'), or structured objects specifying slug/version/url. MCP tools are executed server-side and billed separately.
[
"dedalus-labs/brave-search",
"dedalus-labs/github-api"
]
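Because mcp_servers is a Dedalus extension rather than a standard OpenAI parameter, the OpenAI SDK can forward it via extra_body (a dedicated Dedalus SDK, if you use one, may accept it directly):

response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Find recent news about MCP."}],
    extra_body={"mcp_servers": ["dedalus-labs/brave-search"]},
)
# MCP tools execute server-side; the reply already reflects their results.
print(response.choices[0].message.content)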
Guardrails to apply to the agent for input/output validation and safety checks. Reserved for future use; the guardrails configuration format is not yet finalized.
Configuration for multi-model handoffs and agent orchestration. Reserved for future use; the handoff configuration format is not yet finalized.
Attributes for individual models used in routing decisions during multi-model execution. Format: {'model_name': {'attribute': value}}, where values range from 0.0 to 1.0. Common attributes: 'intelligence', 'speed', 'cost', 'creativity', 'accuracy'. Used by the agent to select the optimal model for the task at hand.
{
"anthropic/claude-3-5-sonnet": {
"cost": 0.7,
"creativity": 0.8,
"intelligence": 0.95
},
"openai/gpt-4": {
"cost": 0.8,
"intelligence": 0.9,
"speed": 0.6
},
"openai/gpt-4o-mini": {
"cost": 0.2,
"intelligence": 0.7,
"speed": 0.9
}
}
Attributes for the agent itself, influencing behavior and model selection. Format: {'attribute': value}, where values are 0.0-1.0. Common attributes: 'complexity', 'accuracy', 'efficiency', 'creativity', 'friendliness'. Higher values indicate stronger preference for that characteristic.
{
"accuracy": 0.9,
"complexity": 0.8,
"efficiency": 0.7
}
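The exact request field names for these two maps are not shown above; assuming they are 'model_attributes' and 'agent_attributes' (verify against the schema), a routing-aware request could look like this sketch:

response = client.chat.completions.create(
    model="openai/gpt-4o",  # starting model; routing may hand off from here
    messages=[{"role": "user", "content": "Summarize this contract clause."}],
    extra_body={
        # Assumed field names; confirm in the request schema.
        "model_attributes": {
            "openai/gpt-4o-mini": {"cost": 0.2, "speed": 0.9, "intelligence": 0.7},
            "anthropic/claude-3-5-sonnet": {"intelligence": 0.95, "cost": 0.7},
        },
        "agent_attributes": {"accuracy": 0.9, "efficiency": 0.7},
    },
)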
Maximum number of turns for agent execution before terminating (default: 10). Each turn represents one model inference cycle. Higher values allow more complex reasoning but increase cost and latency.
1 <= x <= 100
5
When false, the server skips tool execution and returns raw OpenAI-style tool_calls in the response for the client to execute.
true
false
A JSON response, or an SSE stream of ChatCompletionChunk events when 'stream' is true.
Chat completion response for Dedalus API.
OpenAI-compatible chat completion response with Dedalus extensions. Maintains full compatibility with OpenAI API while providing additional features like server-side tool execution tracking and MCP error reporting.
A unique identifier for the chat completion.
A list of chat completion choices. Can be more than one if n is greater than 1.
The Unix timestamp (in seconds) of when the chat completion was created.
The model used for the chat completion.
The object type, which is always chat.completion.
"chat.completion"Specifies the processing type used for serving the request.
When the service_tier parameter is set, the response body will include the service_tier value based on the processing mode actually used to serve the request. This response value may be different from the value set in the parameter.
auto, default, flex, scale, priority
This fingerprint represents the backend configuration that the model runs with.
Can be used in conjunction with the seed request parameter to understand when backend changes have been made that might impact determinism.
Usage statistics for the completion request.
List of tool names that were executed server-side (e.g., MCP tools). Only present when tools were executed on the server rather than returned for client-side execution.
Information about MCP server failures, if any occurred during the request. Contains details about which servers failed and why, along with recommendations for the user. Only present when MCP server failures occurred.