arcee-ai/chat-ui
forked from huggingface/chat-ui
Captured source
source ↗arcee-ai/chat-ui
License: Apache-2.0
Stars: 1
Forks: 0
Open issues: 1
Created: 2024-08-19T14:57:12Z
Pushed: 2024-08-30T17:41:55Z
Default branch: main
Fork: yes
Parent repository: huggingface/chat-ui
Archived: no
README: --- title: chat-ui emoji: 🔥 colorFrom: purple colorTo: purple sdk: docker pinned: false license: apache-2.0 base_path: /chat app_port: 3000 failure_strategy: rollback load_balancing_strategy: random ---
Chat UI
Find the docs at [hf.co/docs/chat-ui](https://huggingface.co/docs/chat-ui/index).
A chat interface using open source models, eg OpenAssistant or Llama. It is a SvelteKit app and it powers the HuggingChat app on hf.co/chat.
0. [Quickstart](#quickstart) 1. [No Setup Deploy](#no-setup-deploy) 2. [Setup](#setup) 3. [Launch](#launch) 4. [Web Search](#web-search) 5. [Text Embedding Models](#text-embedding-models) 6. [Extra parameters](#extra-parameters) 7. [Common issues](#common-issues) 8. [Deploying to a HF Space](#deploying-to-a-hf-space) 9. [Building](#building)
Quickstart
You can quickly start a locally running chat-ui & LLM text-generation server thanks to chat-ui's llama.cpp server support.
Step 1 (Start llama.cpp server):
Install llama.cpp w/ brew (for Mac):
# install llama.cpp brew install llama.cpp
or build directly from the source for your target device:
git clone https://github.com/ggerganov/llama.cpp && cd llama.cpp && make
Next, start the server with the LLM of your choice:
# start llama.cpp server (using hf.co/microsoft/Phi-3-mini-4k-instruct-gguf as an example) llama-server --hf-repo microsoft/Phi-3-mini-4k-instruct-gguf --hf-file Phi-3-mini-4k-instruct-q4.gguf -c 4096
A local LLaMA.cpp HTTP Server will start on http://localhost:8080. Read more here.
Step 2 (tell chat-ui to use local llama.cpp server):
Add the following to your .env.local:
MODELS=`[
{
"name": "Local microsoft/Phi-3-mini-4k-instruct-gguf",
"tokenizer": "microsoft/Phi-3-mini-4k-instruct-gguf",
"preprompt": "",
"chatPromptTemplate": "{{preprompt}}{{#each messages}}{{#ifUser}}\n{{content}}\n\n{{/ifUser}}{{#ifAssistant}}{{content}}\n{{/ifAssistant}}{{/each}}",
"parameters": {
"stop": ["", "", ""],
"temperature": 0.7,
"max_new_tokens": 1024,
"truncate": 3071
},
"endpoints": [{
"type" : "llamacpp",
"baseURL": "http://localhost:8080"
}],
},
]`Read more here.
Step 3 (make sure you have MongoDb running locally):
docker run -d -p 27017:27017 --name mongo-chatui mongo:latest
Read more [here](#database).
Step 4 (start chat-ui):
git clone https://github.com/huggingface/chat-ui cd chat-ui npm install npm run dev -- --open
Read more [here](#launch).
No Setup Deploy
If you don't want to configure, setup, and launch your own Chat UI yourself, you can use this option as a fast deploy alternative.
You can deploy your own customized Chat UI instance with any supported LLM of your choice on Hugging Face Spaces. To do so, use the chat-ui template available here.
Set HF_TOKEN in Space secrets to deploy a model with gated access or a model in a private repository. It's also compatible with Inference for PROs curated list of powerful models with higher rate limits. Make sure to create your personal token first in your User Access Tokens settings.
Read the full tutorial here.
Setup
The default config for Chat UI is stored in the .env file. You will need to override some values to get Chat UI to run locally. This is done in .env.local.
Start by creating a .env.local file in the root of the repository. The bare minimum config you need to get Chat UI to run locally is the following:
MONGODB_URL= HF_TOKEN=
Database
The chat history is stored in a MongoDB instance, and having a DB instance available is needed for Chat UI to work.
You can use a local MongoDB instance. The easiest way is to spin one up using docker:
docker run -d -p 27017:27017 --name mongo-chatui mongo:latest
In which case the url of your DB will be MONGODB_URL=mongodb://localhost:27017.
Alternatively, you can use a free MongoDB Atlas instance for this, Chat UI should fit comfortably within their free tier. After which you can set the MONGODB_URL variable in .env.local to match your instance.
Hugging Face Access Token
If you use a remote inference endpoint, you will need a Hugging Face access token to run Chat UI locally. You can get one from your Hugging Face profile.
Launch
After you're done with the .env.local file you can run Chat UI locally with:
npm install npm run dev
Web Search
Chat UI features a powerful Web Search feature. It works by:
1. Generating an appropriate search query from the user prompt. 2. Performing web search and extracting content from webpages. 3. Creating embeddings from texts using a text embedding model. 4. From these embeddings, find the ones that are closest to the user query using a vector similarity search. Specifically, we use inner product distance. 5. Get the corresponding texts to those closest embeddings and perform Retrieval-Augmented Generation (i.e. expand user prompt by adding those texts so that an LLM can use this information).
Text Embedding Models
By default (for backward compatibility), when TEXT_EMBEDDING_MODELS environment variable is not defined,…
Excerpt shown — open the source for the full document.
Notability
notability 1.0/10Routine fork with minimal traction