openai/openai-realtime-api-beta
JavaScript
Description: Node.js + JavaScript reference client for the Realtime API (beta)
Language: JavaScript
License: MIT
Stars: 1016
Forks: 293
Open issues: 92
Created: 2024-09-30T21:11:38Z
Pushed: 2024-11-07T22:51:02Z
Default branch: main
Fork: no
Archived: no
README:
Reference Client: Realtime API (beta)
This repository contains a reference client aka sample library for connecting to OpenAI's Realtime API. This library is in beta and should not be treated as a final implementation. You can use it to easily prototype conversational apps.
The easiest way to start experimenting with the API is the **Realtime Console**, which uses the reference client to deliver a fully functional API inspector with examples of voice visualization and more.
Quickstart
This library is built to be used both server-side (Node.js) and in browser (React, Vue), in both JavaScript and TypeScript codebases. While in beta, to install the library you will need to npm install directly from the GitHub repository.
$ npm i openai/openai-realtime-api-beta --save
import { RealtimeClient } from '@openai/realtime-api-beta';
const client = new RealtimeClient({ apiKey: process.env.OPENAI_API_KEY });
// Can set parameters ahead of connecting, either separately or all at once
client.updateSession({ instructions: 'You are a great, upbeat friend.' });
client.updateSession({ voice: 'alloy' });
client.updateSession({
  turn_detection: { type: 'none' }, // or 'server_vad'
  input_audio_transcription: { model: 'whisper-1' },
});
// Set up event handling
client.on('conversation.updated', (event) => {
  const { item, delta } = event;
  const items = client.conversation.getItems();
  /**
   * item is the current item being updated
   * delta can be null or populated
   * you can fetch a full list of items at any time
   */
});
// Connect to Realtime API
await client.connect();
// Send an item and trigger a generation
client.sendUserMessageContent([{ type: 'input_text', text: `How are you?` }]);
Browser (front-end) quickstart
You can use this client directly from the browser, e.g. in React or Vue apps. We do not recommend this: your API keys are at risk if you connect to OpenAI directly from the browser. To instantiate the client in a browser environment, use:
import { RealtimeClient } from '@openai/realtime-api-beta';
const client = new RealtimeClient({
  apiKey: process.env.OPENAI_API_KEY,
  dangerouslyAllowAPIKeyInBrowser: true,
});
If you are running your own relay server, e.g. with the Realtime Console, you can instead connect to the relay server URL like so:
const client = new RealtimeClient({ url: RELAY_SERVER_URL });
Table of contents
1. [Project structure](#project-structure)
1. [Using the reference client](#using-the-reference-client)
1. [Sending messages](#sending-messages)
1. [Sending streaming audio](#sending-streaming-audio)
1. [Adding and using tools](#adding-and-using-tools)
1. [Manually using tools](#manually-using-tools)
1. [Interrupting the model](#interrupting-the-model)
1. [Client events](#client-events)
1. [Reference client utility events](#reference-client-utility-events)
1. [Server events](#server-events)
1. [Running tests](#running-tests)
1. [Acknowledgements and contact](#acknowledgements-and-contact)
Project structure
In this library, there are three primitives for interfacing with the Realtime API. We recommend starting with the RealtimeClient, but more advanced users may be more comfortable working closer to the metal.
1. [RealtimeClient](./lib/client.js)
   - Primary abstraction for interfacing with the Realtime API
   - Enables rapid application development with a simplified control flow
   - Has custom conversation.updated, conversation.item.appended, conversation.item.completed, conversation.interrupted and realtime.event events
   - These events send item deltas and conversation history
1. [RealtimeAPI](./lib/api.js)
   - Exists on client instance as client.realtime
   - Thin wrapper over WebSocket
   - Use this for connecting to the API, authenticating, and sending items
   - There is no item validation; you will have to rely on the API specification directly
   - Dispatches events as server.{event_name} and client.{event_name}, respectively
1. [RealtimeConversation](./lib/conversation.js)
   - Exists on client instance as client.conversation
   - Stores a client-side cache of your current conversation
   - Has event validation; it validates incoming events to make sure it can cache them properly
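To illustrate how the item deltas sent by conversation.updated might be consumed, here is a minimal sketch that accumulates streamed audio per item. It assumes that delta.audio, when present, is an Int16Array and that each item carries an id; neither detail is spelled out above. createAudioAccumulator is a hypothetical helper, not part of the library.

```javascript
// Hypothetical helper (not part of the library): collects audio deltas
// per conversation item so the full audio can be read back at any time.
function createAudioAccumulator() {
  const buffers = new Map(); // item id -> Int16Array of all audio so far

  return {
    // Call from a 'conversation.updated' handler with the event payload.
    handle({ item, delta }) {
      // delta can be null, and (assumption) not every delta carries audio.
      if (!delta || !(delta.audio instanceof Int16Array)) return;
      const previous = buffers.get(item.id) || new Int16Array(0);
      const merged = new Int16Array(previous.length + delta.audio.length);
      merged.set(previous, 0);
      merged.set(delta.audio, previous.length);
      buffers.set(item.id, merged);
    },
    // Returns the audio gathered so far for one item (empty if none).
    get(itemId) {
      return buffers.get(itemId) || new Int16Array(0);
    },
  };
}
```

With a connected client this could be wired up as client.on('conversation.updated', (event) => audio.handle(event)), where audio is the object returned by createAudioAccumulator().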
Using the reference client
The client comes packaged with some basic utilities that make it easy to build realtime apps quickly.
Sending messages
Sending messages to the server from the user is easy.
client.sendUserMessageContent([{ type: 'input_text', text: `How are you?` }]);
// or (empty audio)
client.sendUserMessageContent([
  { type: 'input_audio', audio: new Int16Array(0) },
]);
Sending streaming audio
To send streaming audio, use the .appendInputAudio() method. If you're in turn_detection: 'disabled' mode, then you need to use .createResponse() to tell the model to respond.
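The chunked upload described above can be sketched as a small reusable helper. It assumes pcm16 audio at 24,000 Hz, so a 0.1 s chunk is 2,400 samples. streamInChunks is a hypothetical name, and client is any object exposing .appendInputAudio() and .createResponse().

```javascript
const SAMPLE_RATE = 24000; // assumed pcm16 sample rate, in Hz

// Hypothetical helper: sends `samples` (an Int16Array) in fixed-duration
// chunks and optionally asks the model to respond afterwards.
function streamInChunks(client, samples, chunkSeconds = 0.1, respond = false) {
  const chunkSize = Math.round(SAMPLE_RATE * chunkSeconds);
  let sent = 0;
  for (let start = 0; start < samples.length; start += chunkSize) {
    // slice() copies the range; the last chunk may be shorter than chunkSize.
    client.appendInputAudio(samples.slice(start, start + chunkSize));
    sent += 1;
  }
  // Needed when turn detection is disabled, as noted above.
  if (respond) client.createResponse();
  return sent; // number of chunks appended
}
```

Passing respond = true matches the case where turn detection is disabled and the model must be told explicitly to answer.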
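// Send user audio, must be Int16Array or ArrayBuffer
// Default audio format is pcm16 with sample rate of 24,000 Hz
// This populates 1s of noise in 0.1s chunks
for (let i = 0; i < 10; i++) {
  const data = new Int16Array(2400);
  for (let n = 0; n < 2400; n++) {
    data[n] = Math.floor((Math.random() * 2 - 1) * 0x8000);
  }
  client.appendInputAudio(data);
}
Adding and using tools
client.addTool(
  // ... (tool definition elided in this excerpt) ...
  async ({ lat, lng }) => {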
    const result = await fetch(
      `https://api.open-meteo.com/v1/forecast?latitude=${lat}&longitude=${lng}&current=temperature_2m,wind_speed_10m`,
    );
    const json = await result.json();
    return json;
  },
);
Manually using tools
The .addTool() method automatically runs a tool handler and triggers a response on handler completion. Sometimes you may not want that, for example when using tools to generate a schema that you use for other purposes.
In this case, we can use the tools item with updateSession. Here you must specify type: 'function', which is not required for .addTool().
Note: Tools added with .addTool() will not be overridden when updating sessions manually like this, but every updateSession() change will override previous updateSession() changes. Tools added via .addTool() are persisted and appended to anything set manually here.
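As a sketch of the manual approach, the helper below builds a tools array suitable for updateSession, adding the type: 'function' field to each definition. The get_weather definition, its parameter schema, and the toSessionTools helper are illustrative assumptions rather than part of the library.

```javascript
// Hypothetical helper: marks each definition as a function tool
// without mutating the original definition objects.
function toSessionTools(definitions) {
  return definitions.map((definition) => ({ ...definition, type: 'function' }));
}

// Illustrative tool definition (name, description and schema are assumptions).
const weatherTool = {
  name: 'get_weather',
  description: 'Retrieves the weather for a given lat, lng coordinate pair.',
  parameters: {
    type: 'object',
    properties: {
      lat: { type: 'number', description: 'Latitude' },
      lng: { type: 'number', description: 'Longitude' },
    },
    required: ['lat', 'lng'],
  },
};

const sessionUpdate = { tools: toSessionTools([weatherTool]) };
// With a connected client: client.updateSession(sessionUpdate);
```

Because every updateSession() change overrides previous updateSession() changes, a helper like this would be called with the complete list of manually managed tools each time.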
Notability
Notability 7.0/10. New beta API from OpenAI, solid traction.