What does this fork signal mean?

Together AI's togethercomputer/gpt-neox is a fork of EleutherAI/gpt-neox. This fork signal points to upstream code the lab may be inspecting, patching, or building on. High-signal details: repo togethercomputer/gpt-neox · parent EleutherAI/gpt-neox. onlylabs links this event to 1 captured evidence page and 6 related fork signals.

Together AI Fork: togethercomputer/gpt-neox

Captured source

GitHub/github.com/togethercomputer/gpt-neox

togethercomputer/gpt-neox repository metadata

published Apr 24, 2023 · seen 5d · captured 8h · http 200 · method plain

togethercomputer/gpt-neox

Description: An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.

Language: Python

License: Apache-2.0

Stars: 2

Forks: 0

Open issues: 0

Created: 2023-04-24T11:26:48Z

Pushed: 2023-04-24T16:59:51Z

Default branch: main

Fork: yes

Parent repository: EleutherAI/gpt-neox

Archived: no

README:

GPT-NeoX

This repository records EleutherAI's library for training large-scale language models on GPUs. Our current framework is based on NVIDIA's Megatron Language Model and has been augmented with techniques from DeepSpeed as well as some novel optimizations. We aim to make this repo a centralized and accessible place to gather techniques for training large-scale autoregressive language models, and accelerate research into large-scale training.

For those looking for a TPU-centric codebase, we recommend Mesh Transformer JAX.

If you are not looking to train models with billions of parameters from scratch, this is likely the wrong library to use. For generic inference needs, we recommend you use the Hugging Face `transformers` library instead which supports GPT-NeoX models.

GPT-NeoX 2.0

Prior to 3/9/2023, GPT-NeoX relied on DeeperSpeed, which was based on an old version of DeepSpeed (0.3.15). In order to migrate to the latest upstream DeepSpeed version while allowing users to access the old versions of GPT-NeoX and DeeperSpeed, we have introduced two versioned releases for both libraries:

Version 1.0 of GPT-NeoX and DeeperSpeed maintain snapshots of the old stable versions that GPT-NeoX-20B and the Pythia Suite were trained on.
Version 2.0 of GPT-NeoX and DeeperSpeed are the latest versions built on the latest DeepSpeed, and will be maintained going forward.

[Quick Start](#quick-start)
[Environment and Dependencies](#environment-and-dependencies)
[Usage](#usage)
[Configuration](#configuration)
[Datasets](#datasets)
[Preconfigured Datasets](#preconfigured-datasets)
[Using Custom Data](#using-custom-data)
[Training and Finetuning](#training-and-finetuning)
[Select Pretrained Models](#pretrained-models)
[GPT-NeoX-20B](#gpt-neox-20b)
[Pythia](#pythia)
[Polyglot](#polyglot)
[Fill-in-the-Middle](#fill-in-the-middle)
[Inference](#inference)
[Evaluation](#evaluation)
[Exporting to Hugging Face](#exporting-to-hugging-face)
[Monitoring](#monitoring)
[Weights & Biases](#wandb)
[TensorBoard](#tensorboard)
[Administrative Notes](#administrative-notes)
[Citing GPT-NeoX](#citing-gpt-neox)
[Licensing](#licensing)
[Publications](#publications)
[Acknowledgements](#acknowledgements)

Quick Start

Environment and Dependencies

Host Setup

First make sure you are in an environment with Python 3.8 with an appropriate version of PyTorch 1.8 or later installed. Note: Some of the libraries that GPT-NeoX depends on have not been updated to be compatible with Python 3.10+. Python 3.9 appears to work, but this codebase has been developed and tested for Python 3.8.
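
The version constraints above can be checked before installing anything. The following is a minimal sketch, not part of GPT-NeoX: the helper names `classify_python` and `torch_version_ok` are hypothetical, and the returned labels only restate the note above (3.8 developed and tested, 3.9 appears to work, 3.10+ not yet supported by some dependencies).

```python
import importlib.util
import sys


def classify_python(version=None):
    """Classify a (major, minor) version against the note above (hypothetical helper)."""
    major, minor = tuple(version or sys.version_info)[:2]
    if (major, minor) == (3, 8):
        return "developed and tested"
    if (major, minor) == (3, 9):
        return "appears to work"
    if (major, minor) >= (3, 10):
        return "some dependencies not yet compatible"
    return "not covered by the note"


def torch_version_ok(version_string, minimum=(1, 8)):
    """Return True if a version string such as '1.13.1+cu117' is at least `minimum`."""
    release = version_string.split("+")[0]  # drop a local build suffix like '+cu117'
    major, minor = (int(part) for part in release.split(".")[:2])
    return (major, minor) >= minimum


if __name__ == "__main__":
    print("Python:", classify_python())
    if importlib.util.find_spec("torch") is None:
        print("PyTorch: not installed")
    else:
        import torch
        print("PyTorch >= 1.8:", torch_version_ok(torch.__version__))
```

Running the script in the target environment prints one line per check. It does not verify that the installed PyTorch build matches the local CUDA setup, which the phrase "an appropriate version" also implies.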

To install the remaining basic dependencies, run:

pip install -r requirements/requirements.txt
python ./megatron/fused_kernels/setup.py install # optional if not using fused kernels

from the repository root.

Warning: Our codebase relies on DeeperSpeed, our fork of the DeepSpeed library with some added changes. We strongly recommend using Anaconda, a virtual machine, or some other form of environment isolation before continuing. Failure to do so may cause other repositories that rely on DeepSpeed to break.

Flash Attention

To use Flash-Attention, install the additional dependencies in ./requirements/requirements-flashattention.txt and set the attention type in your configuration accordingly (see [configs](./configs/)). This can provide significant speed-ups over regular attention on certain GPU architectures, including Ampere GPUs (such as A100s); see the repository for more details.

Containerized Setup

We also provide a Dockerfile if you prefer to run NeoX in a container. To use this option, first build an image named gpt-neox from the repository root directory with `docker build -t gpt-neox -f Dockerfile .` (the trailing dot is the build context). We also host pre-built images on Docker Hub at `leogao2/gpt-neox`.

You can then run a container based on this image. For instance, the below snippet mounts the cloned repository (gpt-neox) directory to /gpt-neox in the container and uses nvidia-docker to make four GPUs (numbers 0-3) accessible to the container. As noted by the NCCL documentation, both --shm-size=1g and --ulimit memlock=-1 are important to prevent Docker from allocating too little shared memory.

nvidia-docker run --rm -it -e NVIDIA_VISIBLE_DEVICES=0,1,2,3 --shm-size=1g --ulimit memlock=-1 --mount type=bind,src=$PWD,dst=/gpt-neox gpt-neox
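
If the set of GPUs or the mount path varies between machines, the same command can be assembled programmatically. This is a rough sketch under stated assumptions: `nvidia_docker_command` is a hypothetical helper, not something the repository provides, and it uses only the flags shown in the command above.

```python
import os
import shlex


def nvidia_docker_command(gpu_ids, repo_dir, image="gpt-neox"):
    """Build the argument list for the container launch shown above (hypothetical helper)."""
    visible = ",".join(str(gpu) for gpu in gpu_ids)
    return [
        "nvidia-docker", "run", "--rm", "-it",
        "-e", f"NVIDIA_VISIBLE_DEVICES={visible}",
        "--shm-size=1g",            # shared memory size, as recommended above
        "--ulimit", "memlock=-1",   # unlimited locked memory, as recommended above
        "--mount", f"type=bind,src={repo_dir},dst=/gpt-neox",
        image,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without Docker.
    print(shlex.join(nvidia_docker_command(range(4), os.getcwd())))
```

Printing the command rather than executing it keeps the sketch safe to run on a machine without Docker. Passing the list to `subprocess.run` would start the container.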

Usage

All functionality, inference included, should be launched using deepy.py, a wrapper around the deepspeed launcher.

We currently offer three main functions:

1. train.py is used for training and finetuning models.
2. evaluate.py is used to evaluate a trained model using the language model evaluation harness.
3. generate.py is used to sample text from a trained model.

These can be launched with:

./deepy.py [script.py] [./path/to/config_1.yml] [./path/to/config_2.yml] ... [./path/to/config_n.yml]
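
The launch pattern above can be wrapped in a small helper when runs are started from another script. This is a minimal sketch, not part of the repository: `build_deepy_command` and `launch` are hypothetical names, the config paths in the usage example are placeholders, and requiring at least one config file is an assumption made here rather than a documented rule.

```python
import subprocess

# The three entry points listed above.
ENTRY_POINTS = ("train.py", "evaluate.py", "generate.py")


def build_deepy_command(script, config_paths):
    """Return the argv list for ./deepy.py [script.py] [config_1.yml] ... [config_n.yml]."""
    if script not in ENTRY_POINTS:
        raise ValueError(f"unknown script {script!r}; expected one of {ENTRY_POINTS}")
    configs = [str(path) for path in config_paths]
    if not configs:
        # Assumption of this sketch: a launch without any config is treated as a mistake.
        raise ValueError("expected at least one config file")
    return ["./deepy.py", script, *configs]


def launch(script, config_paths, repo_root="."):
    """Run the command from the repository root and raise if it exits with an error."""
    return subprocess.run(build_deepy_command(script, config_paths), cwd=repo_root, check=True)


if __name__ == "__main__":
    # Placeholder paths for illustration only; substitute real config files.
    print(build_deepy_command("train.py", ["configs/model.yml", "configs/setup.yml"]))
```

Keeping command construction separate from execution makes the argument list easy to inspect or log before a long training job starts.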

For example, to generate text unconditionally with the GPT-NeoX-20B model, you can use the following:

./deepy.py…

Excerpt shown — open the source for the full document.