Transcribe audio into text.
Transcribes audio files using OpenAI’s Whisper model. Supports multiple audio formats including mp3, mp4, mpeg, mpga, m4a, wav, and webm. Maximum file size is 25 MB.
Args:
    file: Audio file to transcribe (required)
    model: Model ID to use (e.g., “openai/whisper-1”)
    language: ISO-639-1 language code (e.g., “en”, “es”); supplying it improves accuracy
    prompt: Optional text to guide the model’s style
    response_format: Format of the output (json, text, srt, verbose_json, vtt)
    temperature: Sampling temperature between 0 and 1
Returns: Transcription object with the transcribed text
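The constraints listed above (supported formats, the 25 MB limit, the allowed response formats, and the temperature range) can be checked on the client before anything is uploaded. The following is a minimal sketch, not part of the documented API: `build_transcription_fields` is a hypothetical helper, and reading "25 MB" as 25 × 1024 × 1024 bytes is an assumption.

```python
import os

# Values taken from the description above.
SUPPORTED_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}
RESPONSE_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}
MAX_FILE_BYTES = 25 * 1024 * 1024  # assumption: "25 MB" interpreted as 25 MiB


def build_transcription_fields(path, model, language=None, prompt=None,
                               response_format=None, temperature=None):
    """Hypothetical helper: validate arguments and return the non-file form fields."""
    ext = os.path.splitext(path)[1].lstrip(".").lower()
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported audio format: {ext!r}")
    if os.path.getsize(path) > MAX_FILE_BYTES:
        raise ValueError("file exceeds the 25 MB limit")

    fields = {"model": model}
    if language is not None:
        # ISO-639-1 codes are two letters, e.g. "en" or "es".
        if len(language) != 2 or not language.isalpha():
            raise ValueError("language must be a two-letter ISO-639-1 code")
        fields["language"] = language.lower()
    if prompt is not None:
        fields["prompt"] = prompt
    if response_format is not None:
        if response_format not in RESPONSE_FORMATS:
            raise ValueError(f"unknown response_format: {response_format!r}")
        fields["response_format"] = response_format
    if temperature is not None:
        if not 0 <= temperature <= 1:
            raise ValueError("temperature must be between 0 and 1")
        fields["temperature"] = temperature
    return fields
```

The returned dictionary holds only the optional fields that were actually set, so it can be passed alongside the file as multipart form data by whichever HTTP client you use.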
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
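As a small illustration of the header format, the sketch below builds the header dictionary from a token. `bearer_header` is a hypothetical helper name; the `Authorization` header name is the standard one for Bearer authentication and is assumed to apply here.

```python
def bearer_header(token):
    """Hypothetical helper: return the header dict for Bearer authentication."""
    token = token.strip()
    if not token:
        raise ValueError("auth token must not be empty")
    # Standard Bearer scheme: "Authorization: Bearer <token>".
    return {"Authorization": f"Bearer {token}"}
```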
Successful Response
Represents a verbose JSON transcription response returned by the model, based on the provided input.
Fields:
The language of the input audio.
The duration of the input audio.
The transcribed text.
Extracted words and their corresponding timestamps.
Segments of the transcribed text and their corresponding details.
Usage statistics for models billed by audio input duration.
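To show how such a response might be consumed, here is a minimal sketch that turns the segments into timestamped lines. The key names (`segments`, `start`, `end`, `text`) are assumptions modeled on the common verbose_json layout and are not confirmed by this page; `format_segments` is a hypothetical helper, and times are assumed to be in seconds.

```python
def _timestamp(seconds):
    """Format a time in seconds as MM:SS.cc, rounded to centiseconds."""
    centis = round(seconds * 100)
    minutes, rem = divmod(centis, 6000)
    return f"{minutes:02d}:{rem // 100:02d}.{rem % 100:02d}"


def format_segments(response):
    """Hypothetical helper: one "[start - end] text" line per segment.

    Assumes `response` is the parsed JSON body as a dict, with a
    `segments` list whose items carry `start`, `end`, and `text`.
    """
    lines = []
    for seg in response.get("segments") or []:
        start = _timestamp(seg["start"])
        end = _timestamp(seg["end"])
        lines.append(f"[{start} - {end}] {seg['text'].strip()}")
    return lines
```

If the response carries no segments, the helper returns an empty list, so callers can fall back to the plain transcribed text.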