Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
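As a small illustration of the header format described above, the following sketch builds the header from a token. The helper name `bearer_header` is hypothetical and not part of any SDK.

```python
def bearer_header(token: str) -> dict:
    """Return an Authorization header of the form 'Bearer <token>'.

    `bearer_header` is an illustrative helper, not an official API.
    """
    token = token.strip()
    if not token:
        raise ValueError("auth token must not be empty")
    return {"Authorization": f"Bearer {token}"}
```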
Schema for SpeechRequest.
Fields:
One of the available TTS models: tts-1, tts-1-hd, gpt-4o-mini-tts, or gpt-4o-mini-tts-2025-12-15.
The text to generate audio for. The maximum length is 4096 characters.
The voice to use when generating the audio. Supported built-in voices are alloy, ash, ballad, coral, echo, fable, onyx, nova, sage, shimmer, verse, marin, and cedar. You may also provide a custom voice object with an id, for example { "id": "voice_1234" }. Previews of the voices are available in the Text to speech guide.
Control the voice of your generated audio with additional instructions. Does not work with tts-1 or tts-1-hd. The maximum length is 4096 characters.
The format of the generated audio. Supported formats are mp3, opus, aac, flac, wav, and pcm.
The speed of the generated audio. Select a value from 0.25 to 4.0. 1.0 is the default.
The format to stream the audio in. Supported formats are sse and audio. sse is not supported for tts-1 or tts-1-hd.
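The constraints listed for these fields can be checked on the client before a request is sent. The sketch below is one possible way to do that. The field descriptions above do not give the JSON key names, so the keys used here (`model`, `input`, `voice`, `instructions`, `response_format`, `speed`, `stream_format`) are assumptions, and `build_speech_request` is a hypothetical helper.

```python
from typing import Optional, Union

LEGACY_MODELS = {"tts-1", "tts-1-hd"}
MODELS = LEGACY_MODELS | {"gpt-4o-mini-tts", "gpt-4o-mini-tts-2025-12-15"}
VOICES = {
    "alloy", "ash", "ballad", "coral", "echo", "fable", "onyx",
    "nova", "sage", "shimmer", "verse", "marin", "cedar",
}
AUDIO_FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}
STREAM_FORMATS = {"sse", "audio"}
MAX_INPUT_CHARS = 4096


def build_speech_request(
    model: str,
    text: str,
    voice: Union[str, dict],
    instructions: Optional[str] = None,
    audio_format: Optional[str] = None,
    speed: float = 1.0,
    stream_format: Optional[str] = None,
) -> dict:
    """Validate the documented constraints and return a request body.

    The JSON key names are assumed; adjust them to the real schema.
    """
    if model not in MODELS:
        raise ValueError(f"unknown model: {model!r}")
    if len(text) > MAX_INPUT_CHARS:
        raise ValueError(f"text exceeds {MAX_INPUT_CHARS} characters")
    if isinstance(voice, dict):
        # Custom voice object, e.g. {"id": "voice_1234"}
        if "id" not in voice:
            raise ValueError("custom voice object needs an 'id'")
    elif voice not in VOICES:
        raise ValueError(f"unknown built-in voice: {voice!r}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")

    body = {"model": model, "input": text, "voice": voice, "speed": speed}
    if instructions is not None:
        if model in LEGACY_MODELS:
            raise ValueError("instructions do not work with tts-1 or tts-1-hd")
        body["instructions"] = instructions
    if audio_format is not None:
        if audio_format not in AUDIO_FORMATS:
            raise ValueError(f"unsupported audio format: {audio_format!r}")
        body["response_format"] = audio_format
    if stream_format is not None:
        if stream_format not in STREAM_FORMATS:
            raise ValueError(f"unsupported stream format: {stream_format!r}")
        if stream_format == "sse" and model in LEGACY_MODELS:
            raise ValueError("sse is not supported for tts-1 or tts-1-hd")
        body["stream_format"] = stream_format
    return body
```

Optional fields are left out of the body when not supplied, so the server's own defaults apply to them.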
Audio file stream
The response is of type file.
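Because the response is a file rather than JSON, a client would typically write the returned bytes straight to disk. The following is a minimal sketch using only the Python standard library. The endpoint URL, the JSON key names in the usage comment, and the helper names `make_request` and `save_speech` are assumptions, since none of them are stated above.

```python
import json
import urllib.request

# Assumed endpoint URL; it is not given in the reference text above.
SPEECH_URL = "https://api.openai.com/v1/audio/speech"


def make_request(token: str, body: dict, url: str = SPEECH_URL) -> urllib.request.Request:
    """Build a POST request with a Bearer header and a JSON body."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def save_speech(request: urllib.request.Request, path: str, chunk_size: int = 8192) -> int:
    """Send the request and write the audio bytes to `path` in chunks.

    Returns the number of bytes written.
    """
    written = 0
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            written += len(chunk)
    return written


# Example usage (performs a real network call, so it is left commented out):
# req = make_request("YOUR_TOKEN", {"model": "tts-1", "input": "Hi", "voice": "alloy"})
# save_speech(req, "speech.mp3")
```

Reading in chunks keeps memory use flat for long audio. It does not parse sse events, so this sketch suits the plain audio response only.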