Streaming Speech with ElevenLabs

Generate and stream speech through Supabase Edge Functions. Store speech in Supabase Storage and cache responses via the built-in CDN.

Introduction

In this tutorial, you will learn how to build an edge API that generates, streams, stores, and caches speech using Supabase Edge Functions, Supabase Storage, and the ElevenLabs text-to-speech API.

Find the example project on GitHub.

Requirements

An ElevenLabs account with an API key.
A Supabase account (you can sign up for a free account via database.new).
The Supabase CLI installed on your machine.
The Deno runtime installed on your machine and, optionally, set up in your favourite IDE.

Setup

Create a Supabase project locally

After installing the Supabase CLI, run the following command to create a new Supabase project locally:

supabase init

Configure the storage bucket

You can configure the Supabase CLI to automatically generate a storage bucket by adding this configuration in the config.toml file:

[storage.buckets.audio]
public = false
file_size_limit = "50MiB"
allowed_mime_types = ["audio/mp3"]
objects_path = "./audio"

When you run supabase start, this creates a new storage bucket in your local Supabase project. To push the bucket to your hosted Supabase project, run supabase seed buckets --linked.

Configure background tasks for Supabase Edge Functions

To use background tasks in Supabase Edge Functions when developing locally, you need to add the following configuration in the config.toml file:

[edge_runtime]
policy = "per_worker"

When running with the per_worker policy, the function won't auto-reload on edits. You will need to restart it manually by running supabase functions serve.

Create a Supabase Edge Function for speech generation

Create a new Edge Function by running the following command:

supabase functions new text-to-speech

If you're using VS Code or Cursor, select y when the CLI prompts "Generate VS Code settings for Deno? [y/N]"!

Set up the environment variables

Within the supabase/functions directory, create a new .env file and add the following variables:

# Find / create an API key at https://elevenlabs.io/app/settings/api-keys
ELEVENLABS_API_KEY=your_api_key
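The function reads this key at runtime with Deno.env.get, which returns undefined when a variable is not set. A minimal sketch of a fail-fast lookup is shown below. The requireEnv helper and the EnvLookup type are hypothetical names introduced for illustration (they are not part of the Supabase or ElevenLabs SDKs), and the helper takes a lookup function so the sketch can also run outside Deno.

```typescript
// Hypothetical helper: look up a required variable and fail with a clear message.
// `lookup` stands in for Deno.env.get so the sketch also runs outside Deno.
type EnvLookup = (name: string) => string | undefined

function requireEnv(lookup: EnvLookup, name: string): string {
  const value = lookup(name)
  if (value === undefined || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`)
  }
  return value
}

// Example with an in-memory lookup. Inside an Edge Function you could pass
// (name) => Deno.env.get(name) instead.
const fakeEnv: Record<string, string | undefined> = {
  ELEVENLABS_API_KEY: 'your_api_key',
}
const apiKey = requireEnv((name) => fakeEnv[name], 'ELEVENLABS_API_KEY')
console.log(`API key loaded (${apiKey.length} characters)`)
```

Failing at startup with the variable name in the message tends to be easier to debug than an authentication error returned later by the text-to-speech API.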

Dependencies

The project uses a couple of dependencies:

The @supabase/supabase-js library to interact with the Supabase database.
The ElevenLabs JavaScript SDK to interact with the text-to-speech API.
The open-source object-hash to generate a hash from the request parameters.

Since Supabase Edge Functions use the Deno runtime, you don't need to install the dependencies. Instead, you can import them via the npm: prefix.
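To illustrate why the request parameters are hashed, here is a minimal sketch of a deterministic cache key: the same text and voice always map to the same file name, so a repeated request can be served from storage. Note that this sketch swaps in FNV-1a, a simple non-cryptographic hash, for the object-hash library so that it has no dependencies. The names SpeechParams, stableStringify, fnv1a, and cacheKey are assumptions made for this example.

```typescript
// Sketch of a deterministic cache key. FNV-1a is used here only to keep the
// example dependency-free; the tutorial itself uses the object-hash library.
type SpeechParams = { text: string | null; voiceId: string }

// Serialize with sorted keys so property order does not change the key.
function stableStringify(params: SpeechParams): string {
  const sorted = Object.keys(params)
    .sort()
    .map((key) => [key, params[key as keyof SpeechParams]])
  return JSON.stringify(sorted)
}

// 32-bit FNV-1a over the UTF-8 bytes, returned as 8 hex characters.
function fnv1a(input: string): string {
  const bytes = new TextEncoder().encode(input)
  let hash = 0x811c9dc5
  for (let i = 0; i < bytes.length; i++) {
    hash ^= bytes[i]
    hash = Math.imul(hash, 0x01000193) >>> 0
  }
  return ('00000000' + hash.toString(16)).slice(-8)
}

function cacheKey(params: SpeechParams): string {
  return fnv1a(stableStringify(params))
}

const keyA = cacheKey({ text: 'hello', voiceId: 'JBFqnCBsd6RMkjVDRZzb' })
const keyB = cacheKey({ voiceId: 'JBFqnCBsd6RMkjVDRZzb', text: 'hello' })
console.log(keyA === keyB) // true: property order does not matter
```

A 32-bit hash is fine for a demonstration but collides far more easily than MD5, so a real cache would likely want a longer digest.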

Code the Supabase Edge Function

In your newly created supabase/functions/text-to-speech/index.ts file, add the following code:

// Setup type definitions for built-in Supabase Runtime APIs
import 'jsr:@supabase/functions-js/edge-runtime.d.ts'
import { createClient } from 'npm:@supabase/supabase-js@2'
import { ElevenLabsClient } from 'npm:elevenlabs@1.52.0'
import * as hash from 'npm:object-hash'

const supabase = createClient(
  Deno.env.get('SUPABASE_URL')!,
  Deno.env.get('SUPABASE_SERVICE_ROLE_KEY')!
)

const client = new ElevenLabsClient({
  apiKey: Deno.env.get('ELEVENLABS_API_KEY'),
})

// Upload audio to Supabase Storage in a background task
async function uploadAudioToStorage(stream: ReadableStream, requestHash: string) {
  const { data, error } = await supabase.storage
    .from('audio')
    .upload(`${requestHash}.mp3`, stream, {
      contentType: 'audio/mp3',
    })

  console.log('Storage upload result', { data, error })
}

Deno.serve(async (req) => {
  // To secure your function for production, you can for example validate the request origin,
  // or append a user access token and validate it with Supabase Auth.
  console.log('Request origin', req.headers.get('host'))
  const url = new URL(req.url)
  const params = new URLSearchParams(url.search)
  const text = params.get('text')
  const voiceId = params.get('voiceId') ?? 'JBFqnCBsd6RMkjVDRZzb'

  // The hash of the request parameters doubles as the file name in storage
  const requestHash = hash.MD5({ text, voiceId })
  console.log('Request hash', requestHash)

  // Check storage for existing audio file
  const { data } = await supabase.storage.from('audio').createSignedUrl(`${requestHash}.mp3`, 60)

  if (data) {
    console.log('Audio file found in storage', data)
    const storageRes = await fetch(data.signedUrl)
    if (storageRes.ok) return storageRes
  }

  if (!text) {
    return new Response(JSON.stringify({ error: 'Text parameter is required' }), {
      status: 400,
      headers: { 'Content-Type': 'application/json' },
    })
  }

  try {
    console.log('ElevenLabs API call')
    const response = await client.textToSpeech.convertAsStream(voiceId, {
      output_format: 'mp3_44100_128',
      model_id: 'eleven_multilingual_v2',
      text,
    })

    const stream = new ReadableStream({
      async start(controller) {
        for await (const chunk of response) {
          controller.enqueue(chunk)
        }
        controller.close()
      },
    })

    // Branch stream to Supabase Storage
    const [browserStream, storageStream] = stream.tee()

    // Upload to Supabase Storage in the background
    EdgeRuntime.waitUntil(uploadAudioToStorage(storageStream, requestHash))

    // Return the streaming response immediately
    return new Response(browserStream, {
      headers: {
        'Content-Type': 'audio/mpeg',
      },
    })
  } catch (error) {
    console.log('error', { error })
    // The caught value is not guaranteed to be an Error instance
    const message = error instanceof Error ? error.message : String(error)
    return new Response(JSON.stringify({ error: message }), {
      status: 500,
      headers: { 'Content-Type': 'application/json' },
    })
  }
})
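The function branches a single audio stream so that one copy goes to the client and the other to storage. The sketch below isolates that step using the standard ReadableStream.tee() method, with in-memory text chunks standing in for the ElevenLabs audio. The helper names streamFromChunks, collect, and teeDemo are hypothetical and exist only for this illustration.

```typescript
// Sketch: branch one stream into two with tee() and read both branches.
// In the tutorial, one branch feeds the HTTP response and the other the upload;
// here both are simply collected into strings.

// Build a small stream from in-memory chunks (stand-in for the audio stream).
function streamFromChunks(chunks: string[]): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder()
  return new ReadableStream<Uint8Array>({
    start(controller) {
      for (const chunk of chunks) controller.enqueue(encoder.encode(chunk))
      controller.close()
    },
  })
}

// Drain a stream and decode its bytes as UTF-8 text.
async function collect(stream: ReadableStream<Uint8Array>): Promise<string> {
  const reader = stream.getReader()
  const decoder = new TextDecoder()
  let result = ''
  while (true) {
    const { done, value } = await reader.read()
    if (done) break
    result += decoder.decode(value, { stream: true })
  }
  return result + decoder.decode()
}

async function teeDemo(): Promise<[string, string]> {
  const [responseBranch, storageBranch] = streamFromChunks(['hello ', 'world']).tee()
  // Read both branches concurrently; tee() queues chunks for whichever
  // branch is being read more slowly.
  return Promise.all([collect(responseBranch), collect(storageBranch)])
}

teeDemo().then(([forResponse, forStorage]) => {
  console.log(forResponse === forStorage) // both branches carry the same data
})
```

Because each branch receives every chunk, the response can start playing while the upload is still in progress.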

Run locally

To run the function locally, first start the local Supabase stack:

supabase start

Once the local Supabase stack is up and running, run the following command to start the function and observe the logs:

supabase functions serve

Try it out

Navigate to http://127.0.0.1:54321/functions/v1/text-to-speech?text=hello%20world to hear the function in action.
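The query string in that URL is what the function parses: text is required and voiceId falls back to a default. The sketch below shows one way to isolate that parsing, with validation done up front. The names parseSpeechRequest, ParsedRequest, and DEFAULT_VOICE_ID are assumptions for this example; the default voice ID and the error message are taken from the function code.

```typescript
// Sketch of the query-string handling, with validation done before other work.
const DEFAULT_VOICE_ID = 'JBFqnCBsd6RMkjVDRZzb' // default voice used in the tutorial code

type ParsedRequest =
  | { ok: true; text: string; voiceId: string }
  | { ok: false; status: number; error: string }

// Hypothetical helper: extract and validate `text` and `voiceId` from a request URL.
function parseSpeechRequest(rawUrl: string): ParsedRequest {
  const params = new URL(rawUrl).searchParams
  const text = params.get('text')
  const voiceId = params.get('voiceId') ?? DEFAULT_VOICE_ID
  if (!text || text.trim() === '') {
    return { ok: false, status: 400, error: 'Text parameter is required' }
  }
  return { ok: true, text, voiceId }
}

const parsed = parseSpeechRequest(
  'http://127.0.0.1:54321/functions/v1/text-to-speech?text=hello%20world'
)
console.log(parsed)
```

Returning a plain result object keeps the parsing step easy to test separately from the HTTP handler.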

Afterwards, navigate to http://127.0.0.1:54323/project/default/storage/buckets/audio to see the audio file in your local Supabase Storage bucket.

Deploy to Supabase

If you haven't already, create a new Supabase account at database.new and link the local project to your Supabase account:

supabase link

Once done, run the following command to deploy the function:

supabase functions deploy

Set the function secrets

Now that you have all your secrets set locally, you can run the following command to set the secrets in your Supabase project:

supabase secrets set --env-file supabase/functions/.env

Test the function

The function is designed so that its URL can be used directly as the source of an <audio> element.

<audio
  src="https://${SUPABASE_PROJECT_REF}.supabase.co/functions/v1/text-to-speech?text=Hello%2C%20world!&voiceId=JBFqnCBsd6RMkjVDRZzb"
  controls
/>
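When the text comes from user input, it needs to be URL-encoded before it is placed in the src attribute. A minimal sketch using the standard URL API follows. The buildSpeechUrl helper is hypothetical, and 'your-project-ref' is a placeholder rather than a real project reference.

```typescript
// Hypothetical helper: build the <audio> src URL with correctly encoded parameters.
function buildSpeechUrl(projectRef: string, text: string, voiceId?: string): string {
  const url = new URL(`https://${projectRef}.supabase.co/functions/v1/text-to-speech`)
  url.searchParams.set('text', text)
  // voiceId is optional; the function falls back to its default voice when omitted.
  if (voiceId) url.searchParams.set('voiceId', voiceId)
  return url.toString()
}

// 'your-project-ref' is a placeholder, not a real project reference.
const src = buildSpeechUrl('your-project-ref', 'Hello, world!', 'JBFqnCBsd6RMkjVDRZzb')
console.log(src)
```

URLSearchParams encodes a space as + rather than %20. Both forms decode to the same value when the function reads the parameter through URLSearchParams.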

You can find an example frontend implementation in the complete code example on GitHub.
