Generate Speech
Generates audio from the input text.
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Headers
Optional metadata for the request
Body
Text to convert to speech and speech generation options
The text to generate audio for. The maximum length is 4096 characters.
The TTS model to use (e.g. tts-1, tts-1-hd, gpt-4o-mini-tts).
The voice to use for single-speaker TTS. Can be a string (OpenAI format: alloy, ash, ballad, coral, echo, fable, onyx, nova, sage, shimmer, verse) or an object with name and languageCode (Vertex AI format).
Additional instructions that control the voice of the generated audio. Does not work with tts-1 or tts-1-hd.
The format of the generated audio. Supported formats are mp3, opus, aac, flac, wav, and pcm. Default: mp3.
The speed of the generated audio. Select a value from 0.25 to 4.0. Default: 1.0.
The format to stream the audio in. Supported formats are sse and audio. sse is not supported for tts-1 or tts-1-hd. Default: audio.
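The body constraints above can be checked on the client before a request is sent. The following is a minimal sketch, not the service's own validation: the JSON field names (input, model, voice, instructions, response_format, speed, stream_format) are assumptions modeled on the OpenAI speech API, since this reference does not name them, and build_speech_body is a hypothetical helper.

```python
# Hypothetical client-side check of the documented body constraints.
# Field names are ASSUMED (OpenAI-style); this reference does not name them.

MAX_INPUT_CHARS = 4096
AUDIO_FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}
STREAM_FORMATS = {"sse", "audio"}
# Models documented as not supporting instructions or sse streaming.
LIMITED_MODELS = {"tts-1", "tts-1-hd"}


def build_speech_body(text, model, voice, instructions=None,
                      response_format="mp3", speed=1.0,
                      stream_format="audio"):
    """Return a request body dict, raising ValueError on a documented limit."""
    if len(text) > MAX_INPUT_CHARS:
        raise ValueError(f"text exceeds {MAX_INPUT_CHARS} characters")
    if response_format not in AUDIO_FORMATS:
        raise ValueError(f"unsupported audio format: {response_format}")
    if stream_format not in STREAM_FORMATS:
        raise ValueError(f"unsupported stream format: {stream_format}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    if model in LIMITED_MODELS:
        if instructions is not None:
            raise ValueError(f"{model} does not support instructions")
        if stream_format == "sse":
            raise ValueError(f"{model} does not support sse streaming")

    body = {
        "input": text,
        "model": model,
        # voice may be a string or a {"name", "languageCode"} object.
        "voice": voice,
        "response_format": response_format,
        "speed": speed,
        "stream_format": stream_format,
    }
    if instructions is not None:
        body["instructions"] = instructions
    return body
```

For example, build_speech_body("Hello", "gpt-4o-mini-tts", "alloy", instructions="Speak calmly.") returns a body that includes the instructions, while the same call with model "tts-1" raises ValueError.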
Response
Audio generated successfully
The audio file content
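As an illustration of the request and response flow, the sketch below prepares an authenticated POST using only the Python standard library and writes the returned audio bytes to a file. The endpoint URL is a placeholder and the JSON body keys are assumptions, because this reference gives neither; build_speech_request and save_audio are hypothetical helper names.

```python
import json
import urllib.request


def build_speech_request(url, token, body):
    """Prepare (but do not send) a POST request with Bearer authentication."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def save_audio(audio_bytes, path):
    """Write the raw audio file content from the response to disk."""
    with open(path, "wb") as f:
        f.write(audio_bytes)


# Usage sketch (placeholder URL and assumed body keys, not from this reference):
# req = build_speech_request(
#     "https://api.example.com/speech",
#     "YOUR_TOKEN",
#     {"input": "Hello", "model": "tts-1", "voice": "alloy"},
# )
# with urllib.request.urlopen(req) as resp:
#     save_audio(resp.read(), "speech.mp3")
```

With the default stream format (audio), the response body is the audio file content itself, so it can be written to disk unchanged. An sse stream would need to be parsed event by event instead, which this sketch does not cover.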