RepoTogether AITogether AIpublished Mar 3, 2023seen 5d

togethercomputer/OpenChatKit

Python

Open original ↗

Captured source

source ↗
published Mar 3, 2023seen 5dcaptured 8hhttp 200method plain

togethercomputer/OpenChatKit

Language: Python

License: Apache-2.0

Stars: 8983

Forks: 1002

Open issues: 93

Created: 2023-03-03T00:12:53Z

Pushed: 2024-04-09T19:09:58Z

Default branch: main

Fork: no

Archived: no

README:

OpenChatKit

OpenChatKit provides a powerful, open-source base to create both specialized and general purpose models for various applications. The kit includes an instruction-tuned language models, a moderation model, and an extensible retrieval system for including up-to-date responses from custom repositories. OpenChatKit models were trained on the OIG-43M training dataset, which was a collaboration between Together, LAION, and Ontocord.ai.

In this repo, you'll find code for:

  • Training GPT-NeoXT-Chat-Base-20B, a 20B parameter chat model (see [docs/GPT-NeoXT-Chat-Base-20B.md](docs/GPT-NeoXT-Chat-Base-20B.md))
  • Fine-tuning Llama-2-7B-32K-beta, a 7B parameter long context model
  • Training Pythia-Chat-Base-7B, a 7B parameter chat model
  • Testing inference using either of the chat models
  • Augmenting the model with additional context from a retrieval index

Contents

  • [Getting Started](#getting-started)
  • [Requirements](#requirements)
  • [Chatting with Pythia-Chat-Base-7B](#chatting-with-pythia-chat-base-7b)
  • [Fine-tuning Llama-2-7B-32K-beta](#fine-tuning-llama-2-7b-32k-beta)
  • [Downloading and converting the base model](#downloading-and-converting-the-base-model)
  • [Fine-tuning the model](#fine-tuning-the-model)
  • [Converting trained weights to Hugging Face format](#converting-trained-weights-to-hugging-face-format)
  • [Reproducing Pythia-Chat-Base-7B](#reproducing-pythia-chat-base-7b)
  • [Downloading training data and the base model](#downloading-training-data-and-the-base-model)
  • [(Optional) 8bit Adam](#optional-8bit-adam)
  • [Training the model](#training-the-model)
  • [Converting weights to Hugging Face format](#converting-weights-to-hugging-face-format)
  • [Testing the new model](#testing-the-new-model)
  • [Monitoring](#monitoring)
  • [Loguru](#loguru)
  • [Weights & Biases](#weights--biases)
  • [Experimental: Retrieval-Augmented Models](#experimental-retrieval-augmented-models)
  • [See Also](#see-also)
  • [License](#license)
  • [Citing OpenChatKit](#citing-openchatkit)
  • [Acknowledgements](#acknowledgements)

Getting Started

In this tutorial, you will download Pythia-Chat-Base-7B, an instruction-tuned language model, and run some some inference requests against it using a command-line tool.

Pythia-Chat-Base-7B is a 7B-parameter fine-tuned variant of Pythia-6.9B-deduped from Eleuther AI. Pre-trained weights for this model are available on Hugging Face as togethercomputer/Pythia-Chat-Base-7B under an Apache 2.0 license.

More details can be found on the model card for Pythia-Chat-Base-7B on Hugging Face.

Requirements

Before you begin, you need to install PyTorch and other dependencies.

1. Install Miniconda from their website.

2. Install Git LFS from their website.

3. Install the git lfs hooks.

git lfs install

4. Install mamba in the base environment so it's available in all environments.

conda install mamba -n base -c conda-forge

5. Create an environment called OpenChatKit using the environment.yml file at the root of this repo.

> Note > Use mamba to create the environment. It's much faster than using conda.

mamba env create -f environment.yml

6. Activate the new conda environment.

conda activate OpenChatKit

Chatting with Pythia-Chat-Base-7B

To help you try the model, [inference/bot.py](inference/bot.py) is a simple command-line test harness that provides a shell inferface enabling you to chat with the model. Simply enter text at the prompt and the model replies. The test harness also maintains conversation history to provide the model with context.

Start the bot by calling bot.py from the root for the repo.

python inference/bot.py --model togethercomputer/Pythia-Chat-Base-7B

Loading the model can take some time, but once it's loaded, you are greeted with a prompt. Say hello.

$ python inference/bot.py
Loading /home/csris/src/github.com/togethercomputer/OpenChatKit/inference/../huggingface_models/GPT-NeoXT-Chat-Base-20B to cuda:1...
Welcome to OpenChatKit shell. Type /help or /? to list commands.

>>> Hello.
Hello human.

>>>

Enter additional queries at the prompt, and the model replies. Under the covers, the shell is forming a prompt with all previous queries and passes that to the model to generate more text.

The shell also supports additional commands to inspect hyperparamters, the full prompt, and more. Commands are prefixed with a /.

> Note > The /quit command exits the shell.

Please see [the inference README](inference/README.md) for more details about arguments, running on multiple/specific GPUs, and running on consumer hardware.

Fine-tuning Llama-2-7B-32K-beta

Llama-2-7B-32K-beta model can be fine-tuned using various datasets. In this tutorial, we will use the multi-document natural questions dataset and BookSum dataset.

Downloading and converting the base model

To download model Llama-2-7B-32K-beta and prepare it for fine-tuning, run this command from the root of the repository.

python pretrained/Llama-2-7B-32K-beta/prepare.py

The weights for this model will be in the pretrained/Llama-2-7B-32K-beta/togethercomputer_Llama-2-7B-32K-beta directory.

Fine-tuning the model

The training/finetune_llama-2-7b-32k-mqa.sh and training/finetune_llama-2-7b-32k-booksum.sh scripts configure and run the training loop.

1. To fine-tune the multi-document natural questions dataset, run:

bash training/finetune_llama-2-7b-32k-mqa.sh

2. To fine-tune the BookSum dataset, run:

bash training/finetune_llama-2-7b-32k-booksum.sh

As the training loop runs, checkpoints are saved to the model_ckpts directory at the root of the repo.

Please see [the training README](training/README.md) for more details about customizing the training run.

Converting trained weights to Hugging Face format

Before you can use this model to perform inference, it must be converted to the Hugging Face format. Run this command…

Excerpt shown — open the source for the full document.