Create Speech
Overview
Generate audio from text using text-to-speech models. This endpoint currently supports only OpenAI’s TTS models, which offer multiple voice options.
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
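As a minimal sketch of the header form described above, the helper below builds the header from a token. The `Authorization` header name is the standard one for Bearer authentication. The function name `auth_headers` is hypothetical and not part of any SDK.

```python
from typing import Dict


def auth_headers(token: str) -> Dict[str, str]:
    """Return request headers carrying the token as 'Bearer <token>'.

    `auth_headers` is an illustrative helper, not an official API.
    """
    if not token:
        raise ValueError("an auth token is required")
    return {"Authorization": f"Bearer {token}"}
```

Keeping the token out of source code (for example, reading it from an environment variable of your choosing) is usually the safer option.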
Body
Schema for SpeechRequest.
Fields:
- model (required): str | Literal["tts-1", "tts-1-hd", "gpt-4o-mini-tts", "gpt-4o-mini-tts-2025-12-15"]
- input (required): Annotated[str, StringConstraints(max_length=4096)]
- instructions (optional): Annotated[str, StringConstraints(max_length=4096)]
- voice (required): VoiceIdsOrCustomVoice
- response_format (optional): Literal["mp3", "opus", "aac", "flac", "wav", "pcm"]
- speed (optional): float
- stream_format (optional): Literal["sse", "audio"]
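The constraints in this schema can be checked on the client before a request is sent. The sketch below assumes the body is sent as a JSON object keyed by the field names above. The function name `build_speech_request` is hypothetical. Rejecting `instructions` and `sse` for tts-1 and tts-1-hd is a design choice of this sketch, since this page does not say how the server responds in those cases.

```python
from typing import Any, Dict, Optional, Union

MAX_TEXT_LENGTH = 4096  # max_length for `input` and `instructions`
RESPONSE_FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}
STREAM_FORMATS = {"sse", "audio"}
# Models documented as not supporting `instructions` or the `sse` stream format.
LIMITED_MODELS = {"tts-1", "tts-1-hd"}


def build_speech_request(
    model: str,
    text: str,
    voice: Union[str, Dict[str, str]],
    *,
    instructions: Optional[str] = None,
    response_format: Optional[str] = None,
    speed: Optional[float] = None,
    stream_format: Optional[str] = None,
) -> Dict[str, Any]:
    """Validate the documented constraints and return a SpeechRequest body."""
    if len(text) > MAX_TEXT_LENGTH:
        raise ValueError("input exceeds 4096 characters")
    body: Dict[str, Any] = {"model": model, "input": text, "voice": voice}

    if instructions is not None:
        if len(instructions) > MAX_TEXT_LENGTH:
            raise ValueError("instructions exceed 4096 characters")
        if model in LIMITED_MODELS:
            raise ValueError(f"instructions do not work with {model}")
        body["instructions"] = instructions

    if response_format is not None:
        if response_format not in RESPONSE_FORMATS:
            raise ValueError(f"unsupported response_format: {response_format}")
        body["response_format"] = response_format

    if speed is not None:
        if not 0.25 <= speed <= 4.0:
            raise ValueError("speed must be between 0.25 and 4.0")
        body["speed"] = speed

    if stream_format is not None:
        if stream_format not in STREAM_FORMATS:
            raise ValueError(f"unsupported stream_format: {stream_format}")
        if stream_format == "sse" and model in LIMITED_MODELS:
            raise ValueError(f"sse is not supported for {model}")
        body["stream_format"] = stream_format

    return body
```

Optional fields are left out of the body when unset, so any server-side defaults still apply. The `model` value is not checked against the listed names because the schema also accepts an arbitrary string.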
Field descriptions:
- model: One of the available TTS models: tts-1, tts-1-hd, gpt-4o-mini-tts, or gpt-4o-mini-tts-2025-12-15.
- input: The text to generate audio for. The maximum length is 4096 characters.
- voice: The voice to use when generating the audio. Supported built-in voices are alloy, ash, ballad, coral, echo, fable, onyx, nova, sage, shimmer, verse, marin, and cedar. You may also provide a custom voice object with an id, for example { "id": "voice_1234" }. Previews of the voices are available in the Text to speech guide.
- instructions: Additional instructions that control the voice of the generated audio. The maximum length is 4096 characters. Does not work with tts-1 or tts-1-hd.
- response_format: The format of the returned audio. Supported formats are mp3, opus, aac, flac, wav, and pcm.
- speed: The speed of the generated audio. Select a value from 0.25 to 4.0. The default is 1.0.
- stream_format: The format to stream the audio in. Supported formats are sse and audio. sse is not supported for tts-1 or tts-1-hd.
Response
Audio file stream
The response is of type file.
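Because the response is a stream of audio bytes, it can be written to disk in chunks instead of being held in memory. The sketch below uses only the Python standard library. It assumes a JSON request body and a POST request, neither of which is stated on this page. The endpoint URL is also not given here, so it is a required argument. The names `make_request` and `save_audio` are hypothetical.

```python
import json
import urllib.request
from typing import Any, BinaryIO, Dict


def make_request(url: str, token: str, body: Dict[str, Any]) -> urllib.request.Request:
    """Build (but do not send) the HTTP request for a speech body.

    Assumptions: POST method and a JSON-encoded body.
    """
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def save_audio(stream: BinaryIO, out: BinaryIO, chunk_size: int = 8192) -> int:
    """Copy the audio stream to `out` chunk by chunk; return bytes written."""
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        out.write(chunk)
        total += len(chunk)
    return total


# Example wiring (not executed here; supply your own URL and token):
# req = make_request(url, token, {"model": "tts-1", "input": "Hi", "voice": "alloy"})
# with urllib.request.urlopen(req) as resp, open("speech.mp3", "wb") as f:
#     save_audio(resp, f)
```

The output file extension should match the requested response_format. The sketch covers the audio stream format only and does not parse sse events.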
