What does this repo signal mean?

Mistral AI published mistralai/mistral-finetune (Python). This repository signal exposes tooling, eval, infrastructure, or model-adjacent work before it may appear in a launch post. High-signal details: repo mistralai/mistral-finetune · language Python · High-traction fine-tuning tool from Mistral. onlylabs links this event to 1 captured evidence page and 6 related repo signals.

Mistral AI Repo: mistralai/mistral-finetune

Captured source

source ↗

GitHub · github.com/mistralai/mistral-finetune

mistralai/mistral-finetune repository metadata

published May 24, 2024 · seen Jun 5 · captured Jun 11 · http 200 · method plain

mistralai/mistral-finetune

Language: Python

License: Apache-2.0

Stars: 3091

Forks: 318

Open issues: 49

Created: 2024-05-24T18:19:28Z

Pushed: 2025-11-21T10:27:03Z

Default branch: main

Fork: no

Archived: no

README:

Mistral-finetune

mistral-finetune is a light-weight codebase that enables memory-efficient and performant finetuning of Mistral's models. It is based on LoRA, a training paradigm where most weights are frozen and only 1-2% of additional weights in the form of low-rank matrix perturbations are trained.
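To make the "1-2% of additional weights" figure concrete, here is a minimal, dependency-free sketch of the LoRA idea for a single linear layer. It is an illustration, not code from mistral-finetune: the function names, the 4096 x 4096 layer size, the rank of 32, and the `scaling` parameter are all assumptions chosen for the example.

```python
def lora_trainable_fraction(d_in: int, d_out: int, rank: int) -> float:
    """Extra trainable weights relative to one frozen d_out x d_in matrix."""
    frozen = d_in * d_out                # original weight W, never updated
    trainable = rank * (d_in + d_out)    # A is rank x d_in, B is d_out x rank
    return trainable / frozen


def matvec(matrix, vector):
    """Plain-Python matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_forward(x, W, A, B, scaling=1.0):
    """y = W x + scaling * B (A x): frozen path plus low-rank perturbation."""
    base = matvec(W, x)                  # frozen weights
    delta = matvec(B, matvec(A, x))      # trainable low-rank update
    return [b + scaling * d for b, d in zip(base, delta)]


# Hypothetical layer size and rank, chosen only for illustration.
print(lora_trainable_fraction(4096, 4096, 32))  # 0.015625, i.e. about 1.6%
```

With B initialised to zeros, a common LoRA convention, `lora_forward` returns exactly `W x`, so training starts from the behaviour of the frozen model.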

For maximum efficiency it is recommended to use an A100 or H100 GPU. The codebase is optimized for multi-GPU, single-node training setups, but for smaller models, such as the 7B, a single GPU suffices.

Note: The goal of this repository is to provide a simple, guided entrypoint to finetune Mistral models. As such, it is fairly opinionated (especially around data formatting) and does not aim at being exhaustive across multiple model architectures or hardware types. For more generic approaches, you can check out some other great projects like torchtune.

News

13.08.2024: Mistral Large v2 is now compatible with mistral-finetune!
1. Download the 123B Instruct [here](##model-download) and set model_id_or_path to the downloaded checkpoint dir.
2. Fine-tuning Mistral-Large v2 requires significantly more memory due to a larger model size. For now set seq_len to […] =1.3.1).
3. Fine-tuning Mistral-Nemo currently requires much more memory due to a larger vocabulary size, which spikes the peak memory requirement of the CE loss (we'll soon add an improved CE loss here). For now set seq_len to […]
[…] E.g.: data.instruct_data: "/path/to/data1.jsonl:5.,/path/to/data2.jsonl:1.,/path/to/dir_of_jsonls:1."
data.data is an optional path to additional pretraining data in the format as explained above. Note that this field can be left blank.
data.eval_instruct_data is an optional path to evaluation instruction data to run cross-validation at every eval_freq steps. Cross-validation metrics are displayed as loss and `perp
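
The data.instruct_data example above packs several sources into one comma-separated string of path:weight pairs. A rough sketch of how such a string could be split apart follows. It is not the repository's own parser: the function names are hypothetical, and reading the weights as relative sampling proportions is an assumption, since the captured text only shows the format.

```python
def parse_weighted_sources(spec: str) -> list[tuple[str, float]]:
    """Split 'path:weight,path:weight,...' into (path, weight) pairs."""
    sources = []
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Split on the last colon so earlier colons stay part of the path.
        path, sep, weight = entry.rpartition(":")
        if not sep or not path:
            raise ValueError(f"expected 'path:weight', got {entry!r}")
        sources.append((path, float(weight)))  # float("5.") == 5.0
    return sources


def sampling_probabilities(sources):
    """Assumed interpretation: weights are relative sampling proportions."""
    total = sum(weight for _, weight in sources)
    return {path: weight / total for path, weight in sources}


spec = "/path/to/data1.jsonl:5.,/path/to/data2.jsonl:1.,/path/to/dir_of_jsonls:1."
sources = parse_weighted_sources(spec)
probabilities = sampling_probabilities(sources)
```

Under that assumption, the first file would account for 5/7 of the sampled examples and each of the other two sources for 1/7.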

Notability

notability 8.0/10

High-traction fine-tuning tool from Mistral.