What does this repo signal mean?

Amazon (Nova) published amazon-science/Personalized-chat-interaction-autocomplete (Python). A repository signal like this one can expose tooling, evaluation, infrastructure, or model-adjacent work before it appears in a launch post. High-signal details: repo amazon-science/Personalized-chat-interaction-autocomplete · language Python · New research repo from Amazon Science. onlylabs links this event to 1 captured evidence page and 6 related repo signals.

Amazon (Nova) Repo: amazon-science/Personalized-chat-interaction-autocomplete

Captured source

source ↗

GitHub: github.com/amazon-science/Personalized-chat-interaction-autocomplete

amazon-science/Personalized-chat-interaction-autocomplete repository metadata

published Jan 4, 2026 · seen Jun 5 · captured Jun 11 · http 200 · method plain

amazon-science/Personalized-chat-interaction-autocomplete

Language: Python

License: NOASSERTION

Stars: 0

Forks: 0

Open issues: 0

Created: 2026-01-04T12:55:50Z

Pushed: 2026-01-05T13:05:17Z

Default branch: main

Fork: no

Archived: no

README: Before running experiments, get the needed packages by running:

pip install -r requirements.txt

Experiments are run in 2 stages:

1. Inference - run_baselines.py: generates completions for all prefixes in the dataset for a single model. The output is a completions .pkl file saved in ./completions. Important arguments (the full list of arguments can be found in the file):

model_id: any Hugging Face model. If you have a finetuned model, give it a meaningful name that contains the word "finetuned", and specify the S3 path of your finetuned model in the code before prepare_model is called.
dataset: the currently supported datasets appear in the choices field of this argument. Each supported dataset has a preparation file, such as src/prepare_prism.py and src/prepare_wildchat.py.
gpu_id: to simultaneously run several models on different GPUs of the same instance, each gpu_id corresponds to a different port.
Inference arguments: best_of, max_new_tokens, top_n_tokens, temperature, top_p.

Make sure to put your Hugging Face token in ./resources/hf_token.txt.

Example:

python run_baselines.py --dataset wildchat --model_id mistralai/Mistral-7B-v0.1 --gpu_id 0 --best_of 5 --temperature 1.0
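The gpu_id argument above is described as a way to run several models at once on different GPUs of the same instance. A minimal sketch of how such a batch of commands could be assembled is shown below. It uses only the flags that appear in the example command; the helper name build_inference_commands and the second model ID are illustrative assumptions, not part of the repository.

```python
import shlex


def build_inference_commands(model_ids, dataset="wildchat", best_of=5, temperature=1.0):
    """Return one argv list per model, assigning gpu_id 0, 1, 2, ... in order.

    Hypothetical helper: it only mirrors the flags shown in the README example.
    """
    commands = []
    for gpu_id, model_id in enumerate(model_ids):
        commands.append([
            "python", "run_baselines.py",
            "--dataset", dataset,
            "--model_id", model_id,
            "--gpu_id", str(gpu_id),
            "--best_of", str(best_of),
            "--temperature", str(temperature),
        ])
    return commands


if __name__ == "__main__":
    # The second model ID is an illustrative choice, not taken from the README.
    models = ["mistralai/Mistral-7B-v0.1", "mistralai/Mistral-7B-Instruct-v0.1"]
    for argv in build_inference_commands(models):
        # Print the commands rather than launching them.
        print(shlex.join(argv))
```

Each argv list could be passed to subprocess.Popen to start the runs in parallel; the sketch only prints the commands so that it can be inspected without a GPU.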

2. Metrics - metrics.py: computes metrics such as saved@k and acceptance_rate@k. The output is a .csv file saved in ./results/saved_at_k. Important arguments (the full list of arguments can be found in the file):

model_id: if None, run for all models in models_list. You can specify a single model to run the metrics on.
dataset: used to find the saved completions path, so make sure to use the appropriate name according to the dataset inference was run on.
rank_by: the confidence measure used to rank completions, e.g. perplexity or entropy. Currently log_likelihood is the best ranker for all models and datasets, so use it as the default unless you are experimenting.

Example:

python src/metrics.py --dataset wildchat --model_id mistralai/Mistral-7B-v0.1 --rank_by log_likelihood --personalization_scheme recency_turns --personalization_r 20

Other arguments relate to earlier experiments, such as word/char prefixes and single-word/partial completions. Their default values are the ones we ran our baseline experiments with.
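The rank_by option ranks candidate completions by a confidence measure, with log_likelihood recommended as the default. The README does not show how the repository computes it, so the following is only a sketch of the general idea: score each candidate by the sum of its per-token log-probabilities and keep the top k. The function names and the (text, token_logprobs) input format are assumptions made for illustration.

```python
from typing import List, Tuple

Candidate = Tuple[str, List[float]]  # (completion text, per-token log-probabilities)


def sequence_log_likelihood(token_logprobs: List[float]) -> float:
    """Log-likelihood of a sequence as the sum of its token log-probabilities."""
    return sum(token_logprobs)


def rank_by_log_likelihood(candidates: List[Candidate], k: int) -> List[Tuple[str, float]]:
    """Return the k candidates with the highest summed log-likelihood, best first."""
    scored = [(text, sequence_log_likelihood(lps)) for text, lps in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Made-up log-probabilities, for illustration only.
    candidates = [
        ("the weather today", [-0.2, -0.5, -0.3]),
        ("the whether", [-0.2, -2.5]),
        ("thanks", [-1.5]),
    ]
    for text, score in rank_by_log_likelihood(candidates, k=2):
        print(f"{score:.2f}  {text}")
```

A summed log-likelihood tends to favor shorter completions. Whether the repository normalizes by length is not stated in the README, so the sketch leaves the score unnormalized.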

Notability

notability 5.0/10

New research repo from Amazon Science.