Repo · OpenAI · published Sep 30, 2024 · seen 6d

openai/openai-realtime-api-beta

JavaScript

Description: Node.js + JavaScript reference client for the Realtime API (beta)

Language: JavaScript

License: MIT

Stars: 1016

Forks: 293

Open issues: 92

Created: 2024-09-30T21:11:38Z

Pushed: 2024-11-07T22:51:02Z

Default branch: main

Fork: no

Archived: no

README:

Reference Client: Realtime API (beta)

This repository contains a reference client (a sample library) for connecting to OpenAI's Realtime API. This library is in beta and should not be treated as a final implementation. You can use it to easily prototype conversational apps.

The easiest way to start playing with the API right away is to use the **Realtime Console**, which uses the reference client to deliver a fully functional API inspector with examples of voice visualization and more.

Quickstart

This library is built to be used both server-side (Node.js) and in browser (React, Vue), in both JavaScript and TypeScript codebases. While in beta, to install the library you will need to npm install directly from the GitHub repository.

$ npm i openai/openai-realtime-api-beta --save

import { RealtimeClient } from '@openai/realtime-api-beta';

const client = new RealtimeClient({ apiKey: process.env.OPENAI_API_KEY });

// Can set parameters ahead of connecting, either separately or all at once
client.updateSession({ instructions: 'You are a great, upbeat friend.' });
client.updateSession({ voice: 'alloy' });
client.updateSession({
  turn_detection: { type: 'none' }, // or 'server_vad'
  input_audio_transcription: { model: 'whisper-1' },
});

// Set up event handling
client.on('conversation.updated', (event) => {
  const { item, delta } = event;
  const items = client.conversation.getItems();
  /**
   * item is the current item being updated
   * delta can be null or populated
   * you can fetch a full list of items at any time
   */
});

// Connect to Realtime API
await client.connect();

// Send an item and trigger a generation
client.sendUserMessageContent([{ type: 'input_text', text: `How are you?` }]);
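
The conversation.updated handler above receives an item and an optional delta. As a minimal sketch of what such a handler might do, the hypothetical createTranscriptTracker helper below accumulates streamed text per item. It assumes each item has an id and that delta.transcript, when present, is a string fragment; neither detail is confirmed by the quickstart, so check the API specification before relying on them.

```javascript
// Hypothetical helper (not part of the library): accumulate streamed
// transcript text per conversation item.
// Assumption: item.id exists and delta.transcript is a string fragment.
function createTranscriptTracker() {
  const transcripts = new Map(); // item.id -> text received so far

  return {
    // Pass the event object from 'conversation.updated' straight in.
    handle({ item, delta }) {
      if (!item || !delta || typeof delta.transcript !== 'string') {
        // Nothing new to append; return what we already have.
        return transcripts.get(item?.id) ?? '';
      }
      const text = (transcripts.get(item.id) ?? '') + delta.transcript;
      transcripts.set(item.id, text);
      return text;
    },
    get(id) {
      return transcripts.get(id) ?? '';
    },
  };
}

// Usage, assuming `client` was created as in the quickstart:
// const tracker = createTranscriptTracker();
// client.on('conversation.updated', (event) => tracker.handle(event));
```

Keeping the accumulation in a small pure object makes it possible to exercise the handler with fake events, without opening a connection.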

Browser (front-end) quickstart

You can use this client directly from the browser, e.g. in React or Vue apps. We do not recommend this, because your API keys are at risk if you connect to OpenAI directly from the browser. To instantiate the client in a browser environment, use:

import { RealtimeClient } from '@openai/realtime-api-beta';

const client = new RealtimeClient({
  apiKey: process.env.OPENAI_API_KEY,
  dangerouslyAllowAPIKeyInBrowser: true,
});

If you are running your own relay server, e.g. with the Realtime Console, you can instead connect to the relay server URL like so:

const client = new RealtimeClient({ url: RELAY_SERVER_URL });
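
The two ways of constructing the client shown above can be folded into one place. The sketch below uses a hypothetical buildClientOptions helper (not part of the library) that prefers a relay URL when one is available and only falls back to a direct API key. The option names apiKey, dangerouslyAllowAPIKeyInBrowser and url come from the examples above; the helper name, its parameters, and the relay address in the usage comment are assumptions for illustration.

```javascript
// Hypothetical helper (not part of the library): choose RealtimeClient
// constructor options based on where the code runs.
function buildClientOptions({ relayUrl, apiKey, inBrowser = false }) {
  if (relayUrl) {
    // Preferred in browsers: the relay server holds the API key.
    return { url: relayUrl };
  }
  if (!apiKey) {
    throw new Error('Provide either relayUrl or apiKey');
  }
  // Direct connection. In a browser, opt in explicitly to exposing the key.
  return inBrowser
    ? { apiKey, dangerouslyAllowAPIKeyInBrowser: true }
    : { apiKey };
}

// Usage, assuming RealtimeClient is imported as in the quickstart and
// 'ws://localhost:8081' is a placeholder for your own relay address:
// const client = new RealtimeClient(
//   buildClientOptions({ relayUrl: 'ws://localhost:8081' }),
// );
```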

Table of contents

1. [Project structure](#project-structure)
1. [Using the reference client](#using-the-reference-client)
1. [Sending messages](#sending-messages)
1. [Sending streaming audio](#sending-streaming-audio)
1. [Adding and using tools](#adding-and-using-tools)
1. [Manually using tools](#manually-using-tools)
1. [Interrupting the model](#interrupting-the-model)
1. [Client events](#client-events)
1. [Reference client utility events](#reference-client-utility-events)
1. [Server events](#server-events)
1. [Running tests](#running-tests)
1. [Acknowledgements and contact](#acknowledgements-and-contact)

Project structure

In this library, there are three primitives for interfacing with the Realtime API. We recommend starting with the RealtimeClient, but more advanced users may be more comfortable working closer to the metal.

1. [RealtimeClient](./lib/client.js)

  • Primary abstraction for interfacing with the Realtime API
  • Enables rapid application development with a simplified control flow
  • Has custom conversation.updated, conversation.item.appended, conversation.item.completed, conversation.interrupted and realtime.event events
  • These events send item deltas and conversation history

1. [RealtimeAPI](./lib/api.js)

  • Exists on client instance as client.realtime
  • Thin wrapper over WebSocket
  • Use this for connecting to the API, authenticating, and sending items
  • There is no item validation; you will have to rely on the API specification directly
  • Dispatches events received from the server as server.{event_name} and events sent by the client as client.{event_name}

1. [RealtimeConversation](./lib/conversation.js)

  • Exists on client instance as client.conversation
  • Stores a client-side cache of your current conversation
  • Has event validation: it validates incoming events to make sure it can cache them properly
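
To make the server.{event_name} and client.{event_name} naming concrete, here is a tiny stand-alone dispatcher that mimics the convention. It is an illustration only, not the library's RealtimeAPI implementation, and the class name PrefixedDispatcher is made up for this sketch. The event types used in the usage note (response.created, conversation.item.create) are taken from the Realtime API's event list rather than from this README.

```javascript
// Illustration only: mimics the event naming convention described above.
// This is NOT how the library's RealtimeAPI class is implemented.
class PrefixedDispatcher {
  constructor() {
    this.handlers = new Map(); // event name -> array of handlers
  }

  on(name, handler) {
    const list = this.handlers.get(name) ?? [];
    list.push(handler);
    this.handlers.set(name, list);
  }

  dispatch(name, event) {
    for (const handler of this.handlers.get(name) ?? []) {
      handler(event);
    }
  }

  // Call for every event received from the server.
  receive(event) {
    this.dispatch(`server.${event.type}`, event);
  }

  // Call for every event sent by the client.
  send(type, payload = {}) {
    const event = { type, ...payload };
    this.dispatch(`client.${type}`, event);
    return event;
  }
}
```

With the real client, the same names would be used on client.realtime, for example client.realtime.on('server.response.created', handler), assuming that event type is emitted by the API version you are targeting.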

Using the reference client

The client comes packaged with some basic utilities that make it easy to build realtime apps quickly.

Sending messages

To send a message from the user to the server, call .sendUserMessageContent() with an array of content parts.

client.sendUserMessageContent([{ type: 'input_text', text: `How are you?` }]);
// or (empty audio)
client.sendUserMessageContent([
  { type: 'input_audio', audio: new Int16Array(0) },
]);

Sending streaming audio

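To send streaming audio, use the .appendInputAudio() method. If turn detection is disabled (turn_detection: { type: 'none' }, as in the quickstart), the server does not decide when the user has finished speaking, so you need to call .createResponse() to tell the model to respond.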
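
Audio captured in the browser typically arrives as Float32 samples in the range [-1, 1], so it may need converting before it is passed to .appendInputAudio(). The sketch below shows one way to convert to 16-bit PCM and split the result into 0.1 s chunks at 24,000 Hz. The function names floatTo16BitPCM and chunkPCM are hypothetical helpers for this example, not functions exported by the library.

```javascript
const SAMPLE_RATE = 24000; // Hz, the default pcm16 sample rate
const CHUNK_SECONDS = 0.1;

// Convert Float32 samples in [-1, 1] to 16-bit signed PCM.
// Out-of-range samples are clamped rather than wrapped.
function floatTo16BitPCM(float32) {
  const pcm = new Int16Array(float32.length);
  for (let i = 0; i < float32.length; i++) {
    const s = Math.max(-1, Math.min(1, float32[i]));
    pcm[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return pcm;
}

// Split PCM data into fixed-size chunks (the last one may be shorter).
// slice() copies, so each chunk owns its own ArrayBuffer.
function chunkPCM(pcm, chunkSize = Math.round(SAMPLE_RATE * CHUNK_SECONDS)) {
  const chunks = [];
  for (let start = 0; start < pcm.length; start += chunkSize) {
    chunks.push(pcm.slice(start, start + chunkSize));
  }
  return chunks;
}

// Usage, assuming `client` is connected and `samples` is a Float32Array:
// for (const chunk of chunkPCM(floatTo16BitPCM(samples))) {
//   client.appendInputAudio(chunk);
// }
```

Copying with slice() instead of taking a view with subarray() is deliberate: code that reads chunk.buffer would otherwise see the whole recording rather than one chunk.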

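// Send user audio, must be Int16Array or ArrayBuffer
// Default audio format is pcm16 with sample rate of 24,000 Hz
// This populates 1s of noise in 0.1s chunks
for (let i = 0; i < 10; i++) {
  const data = new Int16Array(2400);
  for (let n = 0; n < 2400; n++) {
    data[n] = Math.floor((Math.random() * 2 - 1) * 0x8000);
  }
  client.appendInputAudio(data);
}
// Pending audio is committed and the model is asked to generate
client.createResponse();

Adding and using tools

Working with tools is easy. Just call .addTool() and set a callback as the second parameter. The callback will be executed with the parameters for the tool, and the result will be automatically sent back to the model.

// We can add tools as well, with callbacks specified
client.addTool(
  {
    name: 'get_weather',
    description:
      'Retrieves the weather for a given lat, lng coordinate pair. Specify a label for the location.',
    parameters: {
      type: 'object',
      properties: {
        lat: { type: 'number', description: 'Latitude' },
        lng: { type: 'number', description: 'Longitude' },
        location: { type: 'string', description: 'Name of the location' },
      },
      required: ['lat', 'lng', 'location'],
    },
  },
  async ({ lat, lng, location }) => {
    const result = await fetch(
      `https://api.open-meteo.com/v1/forecast?latitude=${lat}&longitude=${lng}&current=temperature_2m,wind_speed_10m`,
    );
    const json = await result.json();
    return json;
  },
);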

Manually using tools

The .addTool() method automatically runs a tool handler and triggers a response on handler completion. Sometimes you may not want that, for example when you use tools to generate a schema that you use for other purposes.

In this case, pass the tool definitions in the tools field of updateSession(). Unlike with .addTool(), you must specify type: 'function' on each tool.

Note: Tools added with .addTool() will not be overridden when updating sessions manually like this, but every updateSession() change will override previous updateSession() changes. Tools added via .addTool() are persisted and appended to anything set manually here.
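
As a minimal sketch of the manual approach, the hypothetical toSessionTool helper below adds the required type: 'function' field to a tool definition so it can be passed to updateSession(). The tool name and schema are illustrative, and the item.type === 'function_call' check in the usage comment is an assumption about the item shape that should be verified against the API specification.

```javascript
// Hypothetical helper (not part of the library): add the `type` field
// that updateSession({ tools }) requires and .addTool() does not.
function toSessionTool(definition) {
  return { ...definition, type: 'function' };
}

// Illustrative tool definition; the name and schema are examples only.
const weatherTool = toSessionTool({
  name: 'get_weather',
  description: 'Retrieves the weather for a given lat, lng coordinate pair.',
  parameters: {
    type: 'object',
    properties: {
      lat: { type: 'number', description: 'Latitude' },
      lng: { type: 'number', description: 'Longitude' },
    },
    required: ['lat', 'lng'],
  },
});

// Usage, assuming `client` was created as in the quickstart:
// client.updateSession({ tools: [weatherTool] });
//
// No handler runs automatically, so react to the call yourself:
// client.on('conversation.item.completed', ({ item }) => {
//   if (item.type === 'function_call') {
//     // The function call is complete; run your own code here.
//   }
// });
```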

Excerpt shown — open the source for the full document.

Notability

notability 7.0/10

New beta API from OpenAI, solid traction.