Build voice workflows with one SDK: transcribe audio, translate it to English, and synthesize speech.
Use openai/gpt-4o-transcribe for transcription and openai/gpt-4o-mini-tts for speech
generation.
Speech to Text
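import asyncio
from dedalus_labs import AsyncDedalus

async def main():
    client = AsyncDedalus()
    # Open the audio in binary mode and pass the file handle to the API.
    with open("audio.mp3", "rb") as audio_file:
        transcription = await client.audio.transcriptions.create(
            model="openai/gpt-4o-transcribe",
            file=audio_file,
        )
    print(transcription.text)

asyncio.run(main())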
Audio Translation
import asyncio
from dedalus_labs import AsyncDedalus

async def main():
    client = AsyncDedalus()
    # The source audio is non-English; the result text is in English.
    with open("spanish-audio.mp3", "rb") as audio_file:
        translation = await client.audio.translations.create(
            model="openai/gpt-4o-mini-transcribe",
            file=audio_file,
        )
    print(translation.text)

asyncio.run(main())
Text to Speech
import asyncio
from dedalus_labs import AsyncDedalus

async def main():
    client = AsyncDedalus()
    speech = await client.audio.speech.create(
        model="openai/gpt-4o-mini-tts",
        voice="alloy",
        input="Hello from Dedalus",
    )
    # Write the generated audio to disk.
    speech.stream_to_file("speech.mp3")

asyncio.run(main())
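The transcription and speech calls shown above can be chained into a single voice workflow. The following is a minimal sketch, not an official helper: the function name `transcribe_and_speak` is hypothetical, and it assumes only the `client.audio.transcriptions.create` and `client.audio.speech.create` calls and model names already used on this page. The client is passed in as a parameter so the function can be exercised with a stub.

```python
async def transcribe_and_speak(client, audio_path, voice="alloy"):
    """Transcribe an audio file, then synthesize the transcript as speech.

    `client` is expected to expose the same audio methods as AsyncDedalus.
    Returns the transcript text and the speech response object.
    """
    # Step 1: speech to text.
    with open(audio_path, "rb") as audio_file:
        transcription = await client.audio.transcriptions.create(
            model="openai/gpt-4o-transcribe",
            file=audio_file,
        )

    # Step 2: text to speech, using the transcript as input.
    speech = await client.audio.speech.create(
        model="openai/gpt-4o-mini-tts",
        voice=voice,
        input=transcription.text,
    )
    return transcription.text, speech
```

Inside an async function, this might be called as `text, speech = await transcribe_and_speak(AsyncDedalus(), "audio.mp3")`, after which the speech response can be saved the same way as in the Text to Speech example.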
API endpoints
Speech: POST /v1/audio/speech
Transcriptions: POST /v1/audio/transcriptions
Translations: POST /v1/audio/translations
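For callers who are not using the SDK, the endpoint paths above can be targeted directly. The sketch below only builds a request object for the speech endpoint and does not send it. Several details are assumptions rather than documented behavior: the helper name `build_speech_request` is hypothetical, the base URL is a placeholder supplied by the caller, the JSON body is assumed to mirror the SDK keyword arguments (`model`, `voice`, `input`), and bearer-token authentication is assumed.

```python
import json
import urllib.request

# Paths copied from the endpoint list above.
AUDIO_ENDPOINTS = {
    "speech": "/v1/audio/speech",
    "transcriptions": "/v1/audio/transcriptions",
    "translations": "/v1/audio/translations",
}


def build_speech_request(base_url, api_key, text,
                         voice="alloy", model="openai/gpt-4o-mini-tts"):
    """Build (but do not send) a POST request for the speech endpoint."""
    # Assumption: the body fields match the SDK's keyword arguments.
    body = json.dumps({"model": model, "voice": voice, "input": text})
    return urllib.request.Request(
        url=base_url.rstrip("/") + AUDIO_ENDPOINTS["speech"],
        data=body.encode("utf-8"),
        method="POST",
        headers={
            # Assumption: bearer-token authentication.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

The resulting object could be passed to `urllib.request.urlopen` once the real base URL and authentication scheme are confirmed against the API reference.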