Build voice workflows with one SDK: transcribe audio, translate it to English, and synthesize speech.
Use openai/gpt-4o-transcribe for transcription and openai/gpt-4o-mini-tts for speech generation. The translation example uses openai/gpt-4o-mini-transcribe.

Speech to Text

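import asyncio

from dedalus_labs import AsyncDedalus


async def main() -> None:
    client = AsyncDedalus()

    # Open the audio in binary mode and send it for transcription.
    with open("audio.mp3", "rb") as audio_file:
        transcription = await client.audio.transcriptions.create(
            model="openai/gpt-4o-transcribe",
            file=audio_file,
        )

    print(transcription.text)


asyncio.run(main())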
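
When there are several recordings to transcribe, the async client makes it natural to run the requests concurrently. The following is a minimal sketch of that pattern, not part of the SDK: `transcribe_many`, `max_concurrency`, and `fake_transcribe` are hypothetical names introduced here, and the per-file coroutine is passed in so the helper can be tried without network access.

```python
import asyncio
from typing import Awaitable, Callable, Iterable


async def transcribe_many(
    paths: Iterable[str],
    transcribe_one: Callable[[str], Awaitable[str]],
    max_concurrency: int = 4,
) -> dict[str, str]:
    """Run transcribe_one over paths, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(path: str) -> tuple[str, str]:
        # The semaphore caps how many requests are in flight at once.
        async with semaphore:
            return path, await transcribe_one(path)

    results = await asyncio.gather(*(worker(p) for p in paths))
    return dict(results)


async def fake_transcribe(path: str) -> str:
    # Hypothetical stand-in for a real transcription request.
    await asyncio.sleep(0)
    return f"transcript of {path}"
```

In real use, `transcribe_one` would be a coroutine that opens the file, awaits `client.audio.transcriptions.create(...)` as in the example above, and returns `transcription.text`. A suitable concurrency limit depends on your account's rate limits, which this sketch does not know about.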

Audio Translation

import asyncio

from dedalus_labs import AsyncDedalus


async def main() -> None:
    client = AsyncDedalus()

    # Send non-English audio; the returned text is in English.
    with open("spanish-audio.mp3", "rb") as audio_file:
        translation = await client.audio.translations.create(
            model="openai/gpt-4o-mini-transcribe",
            file=audio_file,
        )

    print(translation.text)


asyncio.run(main())

Text to Speech

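import asyncio

from dedalus_labs import AsyncDedalus


async def main() -> None:
    client = AsyncDedalus()

    # Synthesize speech from text with the chosen model and voice.
    speech = await client.audio.speech.create(
        model="openai/gpt-4o-mini-tts",
        voice="alloy",
        input="Hello from Dedalus",
    )

    # Write the returned audio to disk.
    speech.stream_to_file("speech.mp3")


asyncio.run(main())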

API endpoints

Speech

POST /v1/audio/speech

Transcriptions

POST /v1/audio/transcriptions

Translations

POST /v1/audio/translations
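
For callers who are not using the SDK, the speech endpoint can in principle be reached with a plain HTTP request. The sketch below only builds such a request and does not send it. Several things in it are assumptions rather than documented facts: the base URL `https://api.example.com` is a placeholder, the bearer-token `Authorization` header is a guess at the auth scheme, and the JSON body is assumed to mirror the SDK parameters (`model`, `voice`, `input`). `build_speech_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Placeholder only: substitute the real API base URL for your account.
BASE_URL = "https://api.example.com"


def build_speech_request(
    api_key: str,
    text: str,
    model: str = "openai/gpt-4o-mini-tts",
    voice: str = "alloy",
    base_url: str = BASE_URL,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for /v1/audio/speech."""
    # Assumed body shape: the same fields the SDK example passes.
    payload = {"model": model, "voice": voice, "input": text}
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumed auth scheme: bearer token.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. The transcription and translation endpoints take a file upload, so they would need a multipart body rather than JSON, which this sketch does not cover.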
Last modified on April 9, 2026