togethercomputer/together-python

Description: The Official Python Client for Together's API

Language: Python

License: Apache-2.0

Stars: 81

Forks: 26

Open issues: 28

Created: 2023-04-05T19:19:13Z

Pushed: 2026-05-19T17:46:11Z

Default branch: main

Fork: no

Archived: no

README:

> [!NOTE]
> ## 🚀 Together Python SDK 2.0 is now available!
>
> V1 is now deprecated and will be kept in maintenance mode. All new features and development will occur in the 2.0 SDK.
>
> Check out the new SDK: [together-py](https://github.com/togethercomputer/together-py)
>
> 📖 Migration Guide: https://docs.together.ai/docs/pythonv2-migration-guide
>
> ### Upgrade
>
> **Using uv (Recommended):**
> ```bash
> uv sync --upgrade-package together
> ```
>
> **Using pip:**
> ```bash
> pip install --upgrade together
> ```

Together V1

[Discord](https://discord.com/invite/9Rk6sSeWEG)

> Note: You are looking at the codebase for Together Python V1. The latest Together Python SDK can be found [here](https://github.com/togethercomputer/together-py).

The Together Python API Library is the official Python client for Together's API platform. It provides a convenient way to interact with the REST APIs and integrates easily with Python 3.10+ applications through easy-to-use synchronous and asynchronous clients.

Installation

To install the Together Python Library from PyPI, run:

pip install together

Setting up API Key

> 🚧 You will need to create an account with Together.ai to obtain a Together API Key.

Once logged in to the Together Playground, you can find your available API keys on the settings page.

Setting environment variable

export TOGETHER_API_KEY=xxxxx

Using the client

from together import Together

client = Together(api_key="xxxxx")
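To fail with a clear message when the environment variable is missing, the key can be checked before the client is created. The sketch below uses only the standard library; `require_api_key` is a hypothetical helper introduced for illustration and is not part of the `together` package.

```python
import os


def require_api_key(env_var: str = "TOGETHER_API_KEY") -> str:
    """Return the API key from the environment or raise a descriptive error.

    Hypothetical helper for illustration; not part of the together package.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; export it before creating the client."
        )
    return key


# Usage (assumes the together package is installed):
#   from together import Together
#   client = Together(api_key=require_api_key())
```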

This repo contains both a Python Library and a CLI. We'll demonstrate how to use both below.

Usage – Python Client

Chat Completions

from together import Together

client = Together()

# Simple text message
response = client.chat.completions.create(
    model="meta-llama/Llama-4-Scout-17B-16E-Instruct",
    messages=[{"role": "user", "content": "tell me about new york"}],
)
print(response.choices[0].message.content)

# Multi-modal message with text and image
response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo",
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "What's in this image?"
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png"
                }
            }
        ]
    }]
)
print(response.choices[0].message.content)

# Multi-modal message with multiple images
response = client.chat.completions.create(
    model="Qwen/Qwen2.5-VL-72B-Instruct",
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Compare these two images."
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png"
                }
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/slack.png"
                }
            }
        ]
    }]
)
print(response.choices[0].message.content)

# Multi-modal message with text and video
response = client.chat.completions.create(
    model="Qwen/Qwen2.5-VL-72B-Instruct",
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "What's happening in this video?"
            },
            {
                "type": "video_url",
                "video_url": {
                    "url": "http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/ForBiggerFun.mp4"
                }
            }
        ]
    }]
)
print(response.choices[0].message.content)

The chat completions API supports three types of content:

  • Plain text messages using the content field directly
  • Multi-modal messages with images using type: "image_url"
  • Multi-modal messages with videos using type: "video_url"

When using multi-modal content, the content field becomes an array of content objects, each with its own type and corresponding data.
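The content shapes described above can be assembled programmatically. The following is a minimal sketch, assuming only the dictionary layout shown in the examples; `build_content` is a hypothetical helper, not a function provided by the library.

```python
from typing import Any, Optional, Union


def build_content(
    text: str,
    image_urls: Optional[list] = None,
    video_urls: Optional[list] = None,
) -> Union[str, list]:
    """Build the `content` value for one chat message.

    Hypothetical helper: returns the plain string when there is no media,
    otherwise a list of typed content objects in the shape shown above.
    """
    image_urls = image_urls or []
    video_urls = video_urls or []
    if not image_urls and not video_urls:
        return text

    parts: list = [{"type": "text", "text": text}]
    parts += [{"type": "image_url", "image_url": {"url": u}} for u in image_urls]
    parts += [{"type": "video_url", "video_url": {"url": u}} for u in video_urls]
    return parts


message: dict = {
    "role": "user",
    "content": build_content(
        "Compare these two images.",
        image_urls=[
            "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png",
            "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/slack.png",
        ],
    ),
}
print([part["type"] for part in message["content"]])  # ['text', 'image_url', 'image_url']
```

The resulting `message` dictionary can be passed as an element of the `messages` list in the calls shown earlier.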

Streaming

import os
from together import Together

client = Together()
stream = client.chat.completions.create(
    model="meta-llama/Llama-4-Scout-17B-16E-Instruct",
    messages=[{"role": "user", "content": "tell me about new york"}],
    stream=True,
)

for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
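The loop above prints each delta as it arrives. When the full text is also needed afterwards, the deltas can be collected into one string. This is a sketch that assumes only the `chunk.choices[0].delta.content` attribute path used above; `collect_stream` and `fake_chunk` are hypothetical names, and the stand-in chunks let the example run without network access.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(stream: Iterable) -> str:
    """Concatenate the text deltas from a stream of chat-completion chunks."""
    pieces = []
    for chunk in stream:
        if not chunk.choices:  # defensively skip chunks with no choices
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # the loop above treats a missing delta as ""
            pieces.append(delta)
    return "".join(pieces)


def fake_chunk(text):
    """Stand-in object exposing the same attribute path the loop reads."""
    return SimpleNamespace(
        choices=[SimpleNamespace(delta=SimpleNamespace(content=text))]
    )


full_text = collect_stream([fake_chunk("New "), fake_chunk(None), fake_chunk("York")])
print(full_text)  # New York
```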

Async usage

import asyncio
from together import AsyncTogether

async_client = AsyncTogether()
messages = [
    "What are the top things to do in San Francisco?",
    "What country is Paris in?",
]

async def async_chat_completion(messages):
    async_client = AsyncTogether()
    tasks = [
        async_client.chat.completions.create(
            model="meta-llama/Llama-4-Scout-17B-16E-Instruct",
            messages=[{"role": "user", "content": message}],
        )
        for message in messages
    ]
    responses = await asyncio.gather(*tasks)

    for response in responses:
        print(response.choices[0].message.content)

asyncio.run(async_chat_completion(messages))
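`asyncio.gather` starts every request at once and returns results in input order. With a long list of prompts you may want to cap the number of requests in flight. The sketch below shows one way to do that with `asyncio.Semaphore`; `gather_with_limit` and `fake_completion` are hypothetical names, and the stand-in coroutine replaces the real API call so the example runs offline.

```python
import asyncio


async def gather_with_limit(coro_factories, limit: int = 2):
    """Run coroutines concurrently, at most `limit` at a time, keeping input order."""
    semaphore = asyncio.Semaphore(limit)

    async def run(factory):
        async with semaphore:
            return await factory()

    return await asyncio.gather(*(run(f) for f in coro_factories))


async def fake_completion(prompt: str) -> str:
    """Stand-in for a real request; it sleeps briefly and echoes the prompt."""
    await asyncio.sleep(0.01)
    return f"answer to: {prompt}"


prompts = [
    "What country is Paris in?",
    "What are the top things to do in San Francisco?",
]
results = asyncio.run(
    gather_with_limit([lambda p=p: fake_completion(p) for p in prompts], limit=1)
)
print(results)
```

In real use, each factory would wrap a call such as `async_client.chat.completions.create(...)` instead of `fake_completion`.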

Fetching logprobs

Logprobs are the logarithms of token-level generation probabilities: each value indicates how likely the generated token was given the preceding tokens in the context. They let you estimate the model's confidence in its output, which can inform how you consume that output (e.g. rejecting low-confidence outputs, retrying, or ensembling model outputs).

from together import Together

client = Together()

response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-3B-Instruct-Turbo",
    messages=[{"role": "user", "content": "tell me about new york"}],
    logprobs=1
)

response_logprobs = response.choices[0].logprobs

# Map each generated token to its log-probability
print(dict(zip(response_logprobs.tokens, response_logprobs.token_logprobs)))
# {'New': -2.384e-07, ' York': 0.0, ',': 0.0, ' also': -0.20703125, ' known': -0.20214844, ' as': -8.34465e-07, ... }

More details about using logprobs are available in Together's API documentation.
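As a rough illustration of the confidence estimate described above, log-probabilities can be exponentiated back into probabilities and averaged. This sketch works on plain lists like the `tokens` and `token_logprobs` fields printed in the example; the helper names and the 0.9 threshold are assumptions chosen for illustration, not recommendations from the library.

```python
import math


def token_probabilities(tokens, token_logprobs):
    """Pair each token with exp(logprob), i.e. its generation probability."""
    return [(tok, math.exp(lp)) for tok, lp in zip(tokens, token_logprobs)]


def is_confident(token_logprobs, min_avg_prob: float = 0.9) -> bool:
    """Return True when the geometric-mean token probability meets the threshold.

    The threshold is an arbitrary illustrative value.
    """
    if not token_logprobs:
        return False
    avg_logprob = sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_logprob) >= min_avg_prob


# Values taken from the sample output shown earlier.
tokens = ["New", " York", ",", " also", " known", " as"]
logprobs = [-2.384e-07, 0.0, 0.0, -0.20703125, -0.20214844, -8.34465e-07]

for tok, prob in token_probabilities(tokens, logprobs):
    print(f"{tok!r}: {prob:.3f}")
print(is_confident(logprobs))
```

Averaging in log space and then exponentiating gives the geometric mean of the token probabilities, which is less dominated by a few near-certain tokens than an arithmetic mean would be.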

Completions

The Completions API is for code and language models. The example below uses a code model.

from together import…

Excerpt shown — open the source for the full document.