amazon-science/Glean
Python
Language: Python
License: Apache-2.0
Stars: 9
Forks: 1
Open issues: 1
Created: 2024-10-11T22:22:48Z
Pushed: 2026-04-08T02:26:39Z
Default branch: main
Fork: no
Archived: no
README:
GLEAN: Generalized Category Discovery with Diverse and Quality-Enhanced LLM Feedback

This repository contains the implementation of the paper:
> GLEAN: Generalized Category Discovery with Diverse and Quality-Enhanced LLM Feedback [[Paper]](https://arxiv.org/abs/2502.18414)
> Henry Peng Zou, Siffi Singh, Yi Nian, Jianfeng He, Jason Cai, Saab Mansour, Hang Su
Setup
conda create -n glean python=3.9 -y
conda activate glean
# install pytorch
conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia
# install dependency
pip install -r requirements.txt
pip install faiss-gpu==1.7.2 --no-cache-dir
To reproduce our paper results, make sure the following package versions are installed: transformers==4.15.0 and pytorch==2.1.0, 2.1.1, or 2.1.2. We found that model performance may vary across package versions, particularly for the transformers package.
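Since results are sensitive to package versions, it can help to check the environment before launching a run. The following is a minimal sketch and not part of the repository: the helper names (`find_mismatches`, `installed_versions`, `base_version`) are hypothetical, and it assumes PyTorch is registered under the distribution name `torch` and may carry a local build suffix such as `+cu121`.

```python
from importlib import metadata

# Versions the README recommends for reproducing the paper results.
# Assumption: the PyTorch distribution is registered as "torch".
EXPECTED = {
    "transformers": {"4.15.0"},
    "torch": {"2.1.0", "2.1.1", "2.1.2"},
}


def base_version(version):
    """Drop a local build suffix such as '+cu121' from a version string."""
    return version.split("+", 1)[0]


def find_mismatches(installed, expected=EXPECTED):
    """Return {package: installed version or None} for every package
    whose version is missing or outside the expected set."""
    mismatches = {}
    for name, allowed in expected.items():
        found = installed.get(name)
        if found is None or base_version(found) not in allowed:
            mismatches[name] = found
    return mismatches


def installed_versions(names):
    """Look up installed versions; packages that are not installed map to None."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    problems = find_mismatches(installed_versions(EXPECTED))
    for name, found in problems.items():
        print(f"{name}: found {found!r}, expected one of {sorted(EXPECTED[name])}")
    if not problems:
        print("Package versions match the README recommendations.")
```

Running the script inside the `glean` environment prints any package whose version differs from the recommended ones. The comparison is kept separate from the lookup so it can be exercised with plain dictionaries.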
Running
First, add your OpenAI API key on line 58 of the 'run.sh' file.
Pre-train, train, and test our model with the bash script:
sh run.sh
You can also add or change parameters in run.sh (more parameters are listed in init_parameter.py).
Bugs or Questions
If you have any questions related to the code or the project, feel free to email Henry Peng Zou ([pzou3@uic.edu](mailto:pzou3@uic.edu), [penzou@amazon.com](mailto:penzou@amazon.com)). If you encounter any problems when using the code, or want to report a bug, please also feel free to reach out to us. Please describe the problem in detail so we can help you better and faster!
Acknowledgement
This repo borrows some data and code from Loop and JointMatch. We appreciate their great work!
Notability
Notability: 3.0/10. Low traction, routine new repo.