ForkArcee AIArcee AIpublished Aug 19, 2024seen 5d

arcee-ai/chat-ui

forked from huggingface/chat-ui

Open original ↗

Captured source

source ↗
published Aug 19, 2024seen 5dcaptured 9hhttp 200method plain

arcee-ai/chat-ui

License: Apache-2.0

Stars: 1

Forks: 0

Open issues: 1

Created: 2024-08-19T14:57:12Z

Pushed: 2024-08-30T17:41:55Z

Default branch: main

Fork: yes

Parent repository: huggingface/chat-ui

Archived: no

README: --- title: chat-ui emoji: 🔥 colorFrom: purple colorTo: purple sdk: docker pinned: false license: apache-2.0 base_path: /chat app_port: 3000 failure_strategy: rollback load_balancing_strategy: random ---

Chat UI

Find the docs at [hf.co/docs/chat-ui](https://huggingface.co/docs/chat-ui/index).

!Chat UI repository thumbnail

A chat interface using open source models, eg OpenAssistant or Llama. It is a SvelteKit app and it powers the HuggingChat app on hf.co/chat.

0. [Quickstart](#quickstart) 1. [No Setup Deploy](#no-setup-deploy) 2. [Setup](#setup) 3. [Launch](#launch) 4. [Web Search](#web-search) 5. [Text Embedding Models](#text-embedding-models) 6. [Extra parameters](#extra-parameters) 7. [Common issues](#common-issues) 8. [Deploying to a HF Space](#deploying-to-a-hf-space) 9. [Building](#building)

Quickstart

You can quickly start a locally running chat-ui & LLM text-generation server thanks to chat-ui's llama.cpp server support.

Step 1 (Start llama.cpp server):

Install llama.cpp w/ brew (for Mac):

# install llama.cpp
brew install llama.cpp

or build directly from the source for your target device:

git clone https://github.com/ggerganov/llama.cpp && cd llama.cpp && make

Next, start the server with the LLM of your choice:

# start llama.cpp server (using hf.co/microsoft/Phi-3-mini-4k-instruct-gguf as an example)
llama-server --hf-repo microsoft/Phi-3-mini-4k-instruct-gguf --hf-file Phi-3-mini-4k-instruct-q4.gguf -c 4096

A local LLaMA.cpp HTTP Server will start on http://localhost:8080. Read more here.

Step 2 (tell chat-ui to use local llama.cpp server):

Add the following to your .env.local:

MODELS=`[
{
"name": "Local microsoft/Phi-3-mini-4k-instruct-gguf",
"tokenizer": "microsoft/Phi-3-mini-4k-instruct-gguf",
"preprompt": "",
"chatPromptTemplate": "{{preprompt}}{{#each messages}}{{#ifUser}}\n{{content}}\n\n{{/ifUser}}{{#ifAssistant}}{{content}}\n{{/ifAssistant}}{{/each}}",
"parameters": {
"stop": ["", "", ""],
"temperature": 0.7,
"max_new_tokens": 1024,
"truncate": 3071
},
"endpoints": [{
"type" : "llamacpp",
"baseURL": "http://localhost:8080"
}],
},
]`

Read more here.

Step 3 (make sure you have MongoDb running locally):

docker run -d -p 27017:27017 --name mongo-chatui mongo:latest

Read more [here](#database).

Step 4 (start chat-ui):

git clone https://github.com/huggingface/chat-ui
cd chat-ui
npm install
npm run dev -- --open

Read more [here](#launch).

No Setup Deploy

If you don't want to configure, setup, and launch your own Chat UI yourself, you can use this option as a fast deploy alternative.

You can deploy your own customized Chat UI instance with any supported LLM of your choice on Hugging Face Spaces. To do so, use the chat-ui template available here.

Set HF_TOKEN in Space secrets to deploy a model with gated access or a model in a private repository. It's also compatible with Inference for PROs curated list of powerful models with higher rate limits. Make sure to create your personal token first in your User Access Tokens settings.

Read the full tutorial here.

Setup

The default config for Chat UI is stored in the .env file. You will need to override some values to get Chat UI to run locally. This is done in .env.local.

Start by creating a .env.local file in the root of the repository. The bare minimum config you need to get Chat UI to run locally is the following:

MONGODB_URL=
HF_TOKEN=

Database

The chat history is stored in a MongoDB instance, and having a DB instance available is needed for Chat UI to work.

You can use a local MongoDB instance. The easiest way is to spin one up using docker:

docker run -d -p 27017:27017 --name mongo-chatui mongo:latest

In which case the url of your DB will be MONGODB_URL=mongodb://localhost:27017.

Alternatively, you can use a free MongoDB Atlas instance for this, Chat UI should fit comfortably within their free tier. After which you can set the MONGODB_URL variable in .env.local to match your instance.

Hugging Face Access Token

If you use a remote inference endpoint, you will need a Hugging Face access token to run Chat UI locally. You can get one from your Hugging Face profile.

Launch

After you're done with the .env.local file you can run Chat UI locally with:

npm install
npm run dev

Web Search

Chat UI features a powerful Web Search feature. It works by:

1. Generating an appropriate search query from the user prompt. 2. Performing web search and extracting content from webpages. 3. Creating embeddings from texts using a text embedding model. 4. From these embeddings, find the ones that are closest to the user query using a vector similarity search. Specifically, we use inner product distance. 5. Get the corresponding texts to those closest embeddings and perform Retrieval-Augmented Generation (i.e. expand user prompt by adding those texts so that an LLM can use this information).

Text Embedding Models

By default (for backward compatibility), when TEXT_EMBEDDING_MODELS environment variable is not defined,…

Excerpt shown — open the source for the full document.

Notability

notability 1.0/10

Routine fork with minimal traction