Release by Mistral AI, published May 24, 2024

mistralai/mistral-inference v1.1.0

v1.1.0 Add LoRA

Repository: mistralai/mistral-inference

Tag: v1.1.0

Published: 2024-05-24T18:31:03Z

Prerelease: no

Release notes: mistral-inference==1.1.0 adds support for running LoRA models trained with mistral-finetune: https://github.com/mistralai/mistral-finetune
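For readers new to LoRA, the following is a minimal sketch of the general technique: a frozen weight matrix W is adjusted by a low-rank product, giving W + scaling * (B @ A). It uses plain Python lists so it runs without any dependencies. The names `matmul`, `merge_lora`, and `scaling` are illustrative assumptions, and nothing here is taken from the mistral-inference source or describes how `load_lora` is implemented.

```python
# Conceptual sketch of a LoRA update; NOT the mistral-inference implementation.
# All names below (matmul, merge_lora, scaling) are hypothetical.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, scaling):
    """Return weight + scaling * (lora_b @ lora_a) as a new matrix."""
    delta = matmul(lora_b, lora_a)
    return [
        [w + scaling * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


weight = [[1.0, 0.0], [0.0, 1.0]]  # 2x2 frozen base weight
lora_a = [[1.0, 2.0]]              # rank-1 down-projection, shape (1, 2)
lora_b = [[0.5], [0.0]]            # rank-1 up-projection, shape (2, 1)

merged = merge_lora(weight, lora_a, lora_b, scaling=2.0)
print(merged)  # [[2.0, 2.0], [0.0, 1.0]]
```

Only the two small factors need to be stored for each adapted layer, which is why a LoRA checkpoint is much smaller than the base model it is applied to.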

After training a LoRA on the 7B base model, you can run it with mistral-inference as follows:

from mistral_inference.model import Transformer
from mistral_inference.generate import generate

from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest

MODEL_PATH = "path/to/downloaded/7B_base_dir"  # change to extracted model dir

# Load the tokenizer from the model directory.
tokenizer = MistralTokenizer.from_file(f"{MODEL_PATH}/tokenizer.model.v3")  # change to extracted tokenizer file

# Load the base model, then load the LoRA weights produced by mistral-finetune.
model = Transformer.from_folder(MODEL_PATH)
model.load_lora("/path/to/run_lora_dir/checkpoints/checkpoint_000300/consolidated/lora.safetensors")

# Build a single-turn chat request and encode it to token ids.
completion_request = ChatCompletionRequest(messages=[UserMessage(content="Explain Machine Learning to me in a nutshell.")])
tokens = tokenizer.encode_chat_completion(completion_request).tokens

# Generate at most 64 new tokens at temperature 0.0, stopping early at the end-of-sequence token.
out_tokens, _ = generate([tokens], model, max_tokens=64, temperature=0.0, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)

# Decode the generated token ids back to text.
result = tokenizer.instruct_tokenizer.tokenizer.decode(out_tokens[0])

print(result)
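The paths in the example above are placeholders, so it can help to confirm they exist before loading several gigabytes of weights. The sketch below uses only the standard library; `missing_inputs` is a hypothetical helper and not part of mistral-inference, and the tokenizer file name is taken from the example above.

```python
from pathlib import Path


def missing_inputs(model_dir, lora_file, tokenizer_name="tokenizer.model.v3"):
    """Return the expected input paths that do not exist on disk.

    Hypothetical helper for illustration; not part of mistral-inference.
    """
    model_dir = Path(model_dir)
    candidates = [model_dir, model_dir / tokenizer_name, Path(lora_file)]
    return [str(p) for p in candidates if not p.exists()]


if __name__ == "__main__":
    # Same placeholder paths as in the release example; replace with real ones.
    missing = missing_inputs(
        "path/to/downloaded/7B_base_dir",
        "/path/to/run_lora_dir/checkpoints/checkpoint_000300/consolidated/lora.safetensors",
    )
    for path in missing:
        print(f"missing: {path}")
```

An empty list means the model directory, tokenizer file, and LoRA file were all found. It does not verify that their contents are valid.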