What does this repo signal mean?

Upstage (Solar) published UpstageAI/evalverse-IFEval (Python). Repository signals like this one can surface tooling, evaluation, infrastructure, or model-adjacent work before it appears in a launch post. High-signal details: repo UpstageAI/evalverse-IFEval · language Python. onlylabs links this event to 1 captured evidence page and 6 related repo signals.

Upstage (Solar) Repo: UpstageAI/evalverse-IFEval

Captured source

GitHub · github.com/UpstageAI/evalverse-IFEval

UpstageAI/evalverse-IFEval repository metadata

published Mar 28, 2024 · seen 5d · captured 10h · http 200 · method plain

UpstageAI/evalverse-IFEval

Description: Submodule of evalverse forked from google-research/instruction_following_eval

Language: Python

Stars: 14

Forks: 4

Open issues: 2

Created: 2024-03-28T14:54:38Z

Pushed: 2024-05-04T01:50:27Z

Default branch: main

Fork: no

Archived: yes

README:

IFEval: Instruction Following Eval

This is not an officially supported Google product.

Dependencies

Please make sure that all required Python packages are installed via:

pip install -r requirements.txt

How to run

We use vLLM to generate responses for the instruction prompts via the Python file inst_eval.py:

python inst_eval.py \
--model {ckpt_path} --model_ref_id {model_ref_id} \
--output_path {ckpt_path}/eval_vllm

ckpt_path: Path to the model checkpoints, not ending with /.
model_ref_id: A shorthand name for the model. This will be used in the path to save the evaluation results.
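
As a minimal sketch of how the placeholders above fit together, the following Python helper assembles the same command line from a checkpoint path and a model reference ID. The helper name build_inst_eval_command and the example path and ID are hypothetical and not part of the repository; only the script name and the three flags come from the README. It strips any trailing / from the checkpoint path, since the README asks for a path that does not end with one.

```python
import shlex


def build_inst_eval_command(ckpt_path: str, model_ref_id: str) -> list[str]:
    """Assemble the inst_eval.py invocation shown in the README.

    Hypothetical helper for illustration; not part of the repository.
    """
    # The README asks for a checkpoint path that does not end with "/".
    ckpt = ckpt_path.rstrip("/")
    return [
        "python", "inst_eval.py",
        "--model", ckpt,
        "--model_ref_id", model_ref_id,
        # Results are written under the checkpoint directory, as in the README.
        "--output_path", f"{ckpt}/eval_vllm",
    ]


# Example values are assumptions chosen for illustration only.
cmd = build_inst_eval_command("/models/my-ckpt/", "my-model")
print(shlex.join(cmd))
# prints: python inst_eval.py --model /models/my-ckpt --model_ref_id my-model --output_path /models/my-ckpt/eval_vllm
```

The resulting list can be passed to subprocess.run(cmd, check=True) from the repository root once the dependencies are installed.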

At the moment, you can specify --devices and --gpu_per_inst_eval to set the total number of GPUs and the number of GPUs per inst_eval process (e.g. vLLM). However, results vary slightly with the total number of GPUs and the number of GPUs per inst_eval process, so using the default values of --devices and --gpu_per_inst_eval is recommended for reproducible evaluation results.