Text-to-Speech

Portkey's AI gateway currently supports text-to-speech models like tts-1 and tts-1-hd by OpenAI.

Usage

We follow the OpenAI signature where you can send the input text and the voice option as a part of the API request. All the output formats mp3, opus, aac, flac, and pcm are supported. Portkey also supports real time audio streaming for TTS models.

Here's an example:

import fs from "fs";
import path from "path";
import OpenAI from "openai";
import { PORTKEY_GATEWAY_URL, createHeaders } from 'portkey-ai'

const openai = new OpenAI({
  baseURL: PORTKEY_GATEWAY_URL,
  defaultHeaders: createHeaders({
    apiKey: "PORTKEY_API_KEY",
    virtualKey: "OPENAI_VIRTUAL_KEY"
  })
});

const speechFile = path.resolve("./speech.mp3");

async function main() {
  const mp3 = await openai.audio.speech.create({
    model: "tts-1",
    voice: "alloy",
    input: "Today is a wonderful day to build something people love!",
  });
  const buffer = Buffer.from(await mp3.arrayBuffer());
  await fs.promises.writeFile(speechFile, buffer);
}

main();

from pathlib import Path
from openai import OpenAI
from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders

client = OpenAI(
    base_url=PORTKEY_GATEWAY_URL,
    default_headers=createHeaders(
        api_key="PORTKEY_API_KEY",
        virtual_key="OPENAI_VIRTUAL_KEY"
    )
)

speech_file_path = Path(__file__).parent / "speech.mp3"

response = client.audio.speech.create(
  model="tts-1",
  voice="alloy",
  input="Today is a wonderful day to build something people love!"
)

f = open(speech_file_path, "wb")
f.write(response.content)
f.close()

curl "https://api.portkey.ai/v1/audio/speech" \
  -H "Content-Type: application/json" \
  -H "x-portkey-api-key: $PORTKEY_API_KEY" \
  -H "x-portkey-provider: openai" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "tts-1",
    "input": "Today is a wonderful day to build something people love!",
    "voice": "alloy"
  }' \
  --output speech.mp3

On completion, the request will get logged in the logs UI and show the cost and latency incurred.

Supported Providers and Models

The following providers are supported for text-to-speech with more providers getting added soon. Please raise a request or a PR to add model or provider to the AI gateway.

Provider Models

Provider	Models
OpenAI	`tts-1 tts-1-hd`
Deepgram (Coming Soon)
ElevanLabs (Coming Soon)

OpenAI

tts-1 tts-1-hd

Deepgram (Coming Soon)

ElevanLabs (Coming Soon)

PreviousSpeech-to-Text NextCache (Simple & Semantic)

Last updated 3 months ago