Text-to-Speech

Portkey's AI gateway currently supports text-to-speech models on OpenAI and Azure OpenAI.

Usage

We follow the OpenAI signature where you can send the input text and the voice option as a part of the API request. All the output formats mp3, opus, aac, flac, and pcm are supported. Portkey also supports real time audio streaming for TTS models.

Here's an example:

import fs from "fs";
import path from "path";
import OpenAI from "openai";
import { PORTKEY_GATEWAY_URL, createHeaders } from 'portkey-ai'

const openai = new OpenAI({
  baseURL: PORTKEY_GATEWAY_URL,
  defaultHeaders: createHeaders({
    apiKey: "PORTKEY_API_KEY",
    virtualKey: "OPENAI_VIRTUAL_KEY"
  })
});

const speechFile = path.resolve("./speech.mp3");

async function main() {
  const mp3 = await openai.audio.speech.create({
    model: "tts-1",
    voice: "alloy",
    input: "Today is a wonderful day to build something people love!",
  });
  const buffer = Buffer.from(await mp3.arrayBuffer());
  await fs.promises.writeFile(speechFile, buffer);
}

main();

from pathlib import Path
from openai import OpenAI
from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders

client = OpenAI(
    base_url=PORTKEY_GATEWAY_URL,
    default_headers=createHeaders(
        api_key="PORTKEY_API_KEY",
        virtual_key="OPENAI_VIRTUAL_KEY"
    )
)

speech_file_path = Path(__file__).parent / "speech.mp3"

response = client.audio.speech.create(
  model="tts-1",
  voice="alloy",
  input="Today is a wonderful day to build something people love!"
)

f = open(speech_file_path, "wb")
f.write(response.content)
f.close()

curl "https://api.portkey.ai/v1/audio/speech" \
  -H "Content-Type: application/json" \
  -H "x-portkey-api-key: $PORTKEY_API_KEY" \
  -H "x-portkey-provider: openai" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "tts-1",
    "input": "Today is a wonderful day to build something people love!",
    "voice": "alloy"
  }' \
  --output speech.mp3

from pathlib import Path
from portkey_ai import Portkey

# Initialize the Portkey client
portkey = Portkey(
    api_key="PORTKEY_API_KEY",  # Replace with your Portkey API key
    virtual_key="VIRTUAL_KEY"   # Add your provider's virtual key
)

speech_file_path = Path(__file__).parent / "speech.mp3"

response = portkey.audio.speech.create(
  model="tts-1",
  voice="alloy",
  input="Today is a wonderful day to build something people love!"
)

f = open(speech_file_path, "wb")
f.write(response.content)
f.close()

On completion, the request will get logged in the logs UI and show the cost and latency incurred.

Supported Providers and Models

The following providers are supported for text-to-speech with more providers getting added soon. Please raise a request or a PR to add model or provider to the AI gateway.

Provider Models

Provider	Models
OpenAI	`tts-1 tts-1-hd`
Azure OpenAI	`tts-1 tts-1-hd`
Deepgram (Coming Soon)
ElevanLabs (Coming Soon)

OpenAI

tts-1 tts-1-hd

Azure OpenAI

tts-1 tts-1-hd

Deepgram (Coming Soon)

ElevanLabs (Coming Soon)

PreviousSpeech-to-Text NextCache (Simple & Semantic)

Last updated 16 days ago