Transcribes audio into text in the language of the input audio.
Authentication: This endpoint accepts either a Bearer API key or an X-Sign-In-With-X header for x402 wallet-based authentication. When using x402, a 402 Payment Required response indicates insufficient balance and includes top-up instructions.
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
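The two authentication options above can be sketched as a small helper that builds the request headers. This is only an illustration: the function names `build_auth_headers` and `needs_top_up` are hypothetical, the Bearer token is assumed to travel in the standard `Authorization` header, and the format of the X-Sign-In-With-X value is not described here, so it is passed through unchanged.

```python
from typing import Dict, Optional


def build_auth_headers(api_key: Optional[str] = None,
                       siwx_payload: Optional[str] = None) -> Dict[str, str]:
    """Return headers for exactly one of the two documented auth schemes.

    Hypothetical helper; requiring exactly one credential is a choice made
    for this sketch, not a documented rule of the endpoint.
    """
    if (api_key is None) == (siwx_payload is None):
        raise ValueError("provide exactly one of api_key or siwx_payload")
    if api_key is not None:
        # Documented form: "Bearer <token>" (Authorization header assumed).
        return {"Authorization": f"Bearer {api_key}"}
    # x402 wallet-based authentication; payload format is an assumption.
    return {"X-Sign-In-With-X": siwx_payload}


def needs_top_up(status_code: int) -> bool:
    """With x402, 402 Payment Required indicates insufficient balance."""
    return status_code == 402
```

A caller would check `needs_top_up(response_status)` after the request and, if it returns true, read the top-up instructions from the response.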
Request to transcribe audio to text.
The audio file object (not a base64 string). Supported formats: WAV, WAVE, FLAC, M4A, AAC, MP4, MP3, OGG, WEBM.
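A client may want to reject unsupported files before uploading. The sketch below checks the file extension against the formats listed above; the helper name `has_supported_extension` is hypothetical, and treating the extension as a proxy for the actual container format is an assumption (the server may inspect the file contents instead).

```python
from pathlib import Path

# Extensions derived from the supported formats listed above.
SUPPORTED_EXTENSIONS = {
    ".wav", ".wave", ".flac", ".m4a", ".aac", ".mp4", ".mp3", ".ogg", ".webm",
}


def has_supported_extension(filename: str) -> bool:
    """Return True if the filename ends in a listed format (case-insensitive)."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```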
The model to use for transcription. See https://docs.venice.ai/models/overview for more information.
Options: nvidia/parakeet-tdt-0.6b-v3, openai/whisper-large-v3. Example: "nvidia/parakeet-tdt-0.6b-v3"
The format of the transcript output: json or text.
Options: json, text. Example: "json"
Whether to include timestamps in the response.
Example: false
ISO 639-1 language code (e.g., "en", "es", "fr"). Optional; if not provided, the model auto-detects the language. Only certain models (e.g., Whisper) support this parameter; models that do not support language hints ignore it.
Example: "en"
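The parameters above can be assembled into an upload roughly as follows. This is a sketch under several assumptions: the field names `file`, `model`, `response_format`, `timestamps`, and `language` are not given in this reference and are hypothetical, the request is assumed to be multipart/form-data (suggested by the requirement for a file object rather than a base64 string), and booleans are assumed to serialize as "true"/"false". The function defaults reuse the values shown in this reference, which may or may not be the server's defaults.

```python
import uuid
from typing import Dict, Optional, Tuple

MODELS = ("nvidia/parakeet-tdt-0.6b-v3", "openai/whisper-large-v3")
RESPONSE_FORMATS = ("json", "text")


def build_form_fields(model: str = MODELS[0],
                      response_format: str = "json",
                      timestamps: bool = False,
                      language: Optional[str] = None) -> Dict[str, str]:
    """Validate the options and return text form fields (names assumed)."""
    if model not in MODELS:
        raise ValueError(f"unknown model: {model}")
    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unknown response format: {response_format}")
    fields = {
        "model": model,
        "response_format": response_format,
        "timestamps": "true" if timestamps else "false",  # encoding assumed
    }
    if language is not None:
        # Omitting the field leaves language detection to the model.
        fields["language"] = language
    return fields


def encode_multipart(fields: Dict[str, str], filename: str,
                     audio: bytes) -> Tuple[bytes, str]:
    """Encode text fields plus one file part; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'.encode("utf-8")
    )
    parts.append(audio)  # raw bytes, not base64
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"
```

The returned body and content type can then be sent with any HTTP client, for example `urllib.request.Request(url, data=body, headers={"Content-Type": content_type, ...}, method="POST")`, where `url` is the endpoint address (not stated in this section) and the remaining headers carry the authentication described earlier.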