What does this repo signal mean?

LG AI Research (EXAONE) published LG-AI-EXAONE/EXAONE-Deep. This repository signal exposes tooling, eval, infrastructure, or model-adjacent work before it may appear in a launch post. High-signal details: repo LG-AI-EXAONE/EXAONE-Deep · New deep model repo from LG AI, moderate traction. onlylabs links this event to 1 captured evidence page and 6 related repo signals.

LG AI Research (EXAONE) Repo: LG-AI-EXAONE/EXAONE-Deep

Captured source

source ↗

GitHub: github.com/LG-AI-EXAONE/EXAONE-Deep

LG-AI-EXAONE/EXAONE-Deep repository metadata

published Mar 12, 2025 · seen Jun 5 · captured Jun 11 · http 200 · method plain

LG-AI-EXAONE/EXAONE-Deep

Description: Official repository for EXAONE Deep built by LG AI Research

License: NOASSERTION

Stars: 401

Forks: 27

Open issues: 6

Created: 2025-03-12T06:45:27Z

Pushed: 2025-06-02T00:22:33Z

Default branch: main

Fork: no

Archived: no

README:

EXAONE Deep

🤗 Hugging Face | 📝 Blog | 📑 Documentation

Introduction

We introduce EXAONE Deep, a family of models ranging from 2.4B to 32B parameters, developed and released by LG AI Research, which exhibits superior capabilities in various reasoning tasks including math and coding benchmarks. Evaluation results show that 1) EXAONE Deep 2.4B outperforms other models of comparable size, 2) EXAONE Deep 7.8B outperforms not only open-weight models of comparable scale but also the proprietary reasoning model OpenAI o1-mini, and 3) EXAONE Deep 32B demonstrates competitive performance against leading open-weight models.

Our documentation consists of the following sections:

[Performance](#performance): Experimental results of EXAONE Deep models.
[Quickstart](#quickstart): A basic guide to using EXAONE Deep models with Transformers.
[Quantized Models](#quantized-models): An explanation of quantized EXAONE Deep weights in AWQ and GGUF format.
[Run Locally](#run-locally): A guide to running EXAONE Deep models locally with llama.cpp and Ollama frameworks.
[Deployment](#deployment): A guide to running EXAONE Deep models with TensorRT-LLM, vLLM, and SGLang deployment frameworks.
[Usage Guideline](#usage-guideline): A guide to utilizing EXAONE Deep models to achieve the expected performance.

News

2025.03.18: We release EXAONE Deep, a series of reasoning-enhanced language models in 2.4B, 7.8B, and 32B sizes. Check out the 📑 Documentation!

Performance

Some experimental results are shown below. The full evaluation results can be found in the Documentation.

| Models | MATH-500 (pass@1) | AIME 2024 (pass@1 / cons@64) | AIME 2025 (pass@1 / cons@64) | CSAT Math 2025 (pass@1) | GPQA Diamond (pass@1) | Live Code Bench (pass@1) |
| --- | --- | --- | --- | --- | --- | --- |
| EXAONE Deep 32B | 95.7 | 72.1 / 90.0 | 65.8 / 80.0 | 94.5 | 66.1 | 59.5 |
| DeepSeek-R1-Distill-Qwen-32B | 94.3 | 72.6 / 83.3 | 55.2 / 73.3 | 84.1 | 62.1 | 57.2 |
| QwQ-32B | 95.5 | 79.5 / 86.7 | 67.1 / 76.7 | 94.4 | 63.3 | 63.4 |
| DeepSeek-R1-Distill-Llama-70B | 94.5 | 70.0 / 86.7 | 53.9 / 66.7 | 88.8 | 65.2 | 57.5 |
| DeepSeek-R1 (671B) | 97.3 | 79.8 / 86.7 | 66.8 / 80.0 | 89.9 | 71.5 | 65.9 |
| EXAONE Deep 7.8B | 94.8 | 70.0 / 83.3 | 59.6 / 76.7 | 89.9 | 62.6 | 55.2 |
| DeepSeek-R1-Distill-Qwen-7B | 92.8 | 55.5 / 83.3 | 38.5 / 56.7 | 79.7 | 49.1 | 37.6 |
| DeepSeek-R1-Distill-Llama-8B | 89.1 | 50.4 / 80.0 | 33.6 / 53.3 | 74.1 | 49.0 | 39.6 |
| OpenAI o1-mini | 90.0 | 63.6 / 80.0 | 54.8 / 66.7 | 84.4 | 60.0 | 53.8 |
| EXAONE Deep 2.4B | 92.3 | 52.5 / 76.7 | 47.9 / 73.3 | 79.2 | 54.3 | 46.6 |
| DeepSeek-R1-Distill-Qwen-1.5B | 83.9 | 28.9 / 52.7 | 23.9 / 36.7 | 65.6 | 33.8 | 16.9 |
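The excerpt does not define the pass@1 and cons@64 metrics used in the table. As a rough illustration only, the sketch below assumes the common reading: pass@1 is the fraction of sampled answers that are correct, and cons@k is whether the majority-vote answer over k samples is correct. The sample answers are hypothetical, and the actual evaluation protocol is described in the linked Documentation.

```python
from collections import Counter


def pass_at_1(samples: list[str], reference: str) -> float:
    """Fraction of sampled answers that match the reference answer."""
    return sum(s == reference for s in samples) / len(samples)


def cons_at_k(samples: list[str], reference: str) -> float:
    """1.0 if the most frequent sampled answer matches the reference, else 0.0."""
    majority, _ = Counter(samples).most_common(1)[0]
    return float(majority == reference)


# Hypothetical answers sampled for one problem whose reference answer is "33".
samples = ["33", "33", "17", "33", "25", "33", "17", "33"]
print(pass_at_1(samples, "33"))  # 5 of the 8 samples are correct
print(cons_at_k(samples, "33"))  # the majority answer is "33"
```

Under this reading, a benchmark score is the average of these per-problem values, which is why a cons@64 figure can sit well above the corresponding pass@1 figure.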

Quickstart

You need transformers>=4.43.1 to use the EXAONE Deep models. We recommend installing the latest version.
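A quick way to confirm the version requirement before loading a model is sketched below. The helper names `parse_release` and `meets_minimum` are hypothetical; the comparison only looks at the leading numeric parts of the version string, so pre-release suffixes are treated conservatively.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum transformers release stated in this README.
MIN_TRANSFORMERS = (4, 43, 1)


def parse_release(v: str) -> tuple:
    """Turn '4.43.1' or '4.44.0.dev0' into a tuple of its leading integer parts."""
    parts = []
    for piece in v.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed: str, minimum: tuple = MIN_TRANSFORMERS) -> bool:
    """True if the installed release is at least the required minimum."""
    return parse_release(installed) >= minimum


try:
    installed = version("transformers")
    print(installed, "OK" if meets_minimum(installed) else "too old")
except PackageNotFoundError:
    print("transformers is not installed")
```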

The following example shows how to use the EXAONE Deep models.

> [!Tip]
> In all the examples below, you can use another size model by changing 7.8B to 32B or 2.4B.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer
from threading import Thread

model_name = "LGAI-EXAONE/EXAONE-Deep-7.8B"
streaming = True # choose the streaming option

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Choose your prompt:
# Math example (AIME 2024)
prompt = r"""Let $x,y$ and $z$ be positive real numbers that satisfy the following system of equations:
\[\log_2\left({x \over yz}\right) = {1 \over 2}\]\[\log_2\left({y \over xz}\right) = {1 \over 3}\]\[\log_2\left({z \over xy}\right) = {1 \over 4}\]
Then the value of $\left|\log_2(x^4y^3z^2)\right|$ is $\tfrac{m}{n}$ where $m$ and $n$ are relatively prime positive integers. Find $m+n$.

Please reason step by step, and put your final answer within \boxed{}."""
# Korean MCQA example (CSAT Math 2025)
prompt = r"""Question : $a_1 = 2$인 수열 $\{a_n\}$과 $b_1 = 2$인 등차수열 $\{b_n\}$이 모든 자연수 $n$에 대하여\[\sum_{k=1}^{n} \frac{a_k}{b_{k+1}} = \frac{1}{2} n^2\]을 만족시킬 때, $\sum_{k=1}^{5} a_k$의 값을 구하여라.

Options :
A) 120
B) 125
C) 130
D) 135
E) 140

Please reason step by step, and you should write the correct option alphabet (A, B, C, D or E) within \boxed{}."""

messages = [
    {"role": "user", "content": prompt}
]
input_ids = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt"
)

if streaming:
    streamer = TextIteratorStreamer(tokenizer)
    thread = Thread(target=model.generate, kwargs=dict(
        input_ids=input_ids.to("cuda"),
        eos_token_id=tokenizer.eos_token_id,
        max_new_tokens=32768,
        do_sample=True,
        temperature=0.6,
        top_p=0.95,
        streamer=streamer
    ))
    thread.start()

    for text in streamer:
        print(text, end="", flush=True)
else:
    output = model.generate(
        input_ids.to("cuda"),
        eos_token_id=tokenizer.eos_token_id,
        max_new_tokens=32768,
        do_sample=True,
        temperature=0.6,
        top_p=0.95,
    )
    print(tokenizer.decode(output[0]))
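Both example prompts ask the model to put its final answer within \boxed{}, so a caller will usually want to pull that answer out of the generated text. The helper below is a minimal sketch, not part of the repository: `extract_boxed` is a hypothetical name, and it assumes the final answer is the last \boxed{...} in the output.

```python
from typing import Optional


def extract_boxed(text: str) -> Optional[str]:
    r"""Return the contents of the last \boxed{...} in text, or None if absent."""
    marker = r"\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    i = start + len(marker)
    depth = 1  # already inside the opening brace of \boxed{
    chars = []
    while i < len(text):
        c = text[i]
        if c == "{":
            depth += 1
        elif c == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars)
        chars.append(c)
        i += 1
    return None  # braces never closed, e.g. generation was cut off


print(extract_boxed(r"Therefore the answer is \boxed{\tfrac{3}{4}}."))
print(extract_boxed(r"The correct option is \boxed{B}"))
```

Tracking brace depth, rather than using a simple regular expression, keeps nested LaTeX such as `\tfrac{3}{4}` intact.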

> [!Important]
> The EXAONE Deep models are trained with an optimized configuration,
> so we recommend following the [Usage Guideline](#usage-guideline) section to achieve optimal performance.

Quantized Models

We introduce a series of quantized weights of EXAONE Deep models.

AWQ

We provide AWQ-quantized weights of EXAONE Deep models, quantized using the AutoAWQ library. Please refer to the EXAONE Deep collection for pre-quantized weights, and to the AutoAWQ documentation for more details.

You need to install the latest version of the AutoAWQ library (autoawq>=0.2.8) to load the AWQ-quantized versions of EXAONE Deep models.

pip install autoawq

You can load the quantized model in the same way as the original models, changing only the model name. The model's AWQ configuration is applied automatically on load. Please check the [Quickstart section](#quickstart) above for more details.
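As a sketch of "changing only the model name", the helper below builds a Hugging Face repo ID from a model size. The `-AWQ` suffix is an assumption about how the quantized repos are named, and `exaone_deep_repo_id` and `load_exaone_deep` are hypothetical helpers; check the EXAONE Deep collection for the exact IDs before relying on them.

```python
def exaone_deep_repo_id(size: str = "7.8B", awq: bool = False) -> str:
    """Build a repo ID such as 'LGAI-EXAONE/EXAONE-Deep-7.8B'."""
    if size not in {"2.4B", "7.8B", "32B"}:
        raise ValueError(f"unknown EXAONE Deep size: {size}")
    suffix = "-AWQ" if awq else ""  # assumed naming, not confirmed by this excerpt
    return f"LGAI-EXAONE/EXAONE-Deep-{size}{suffix}"


def load_exaone_deep(size: str = "7.8B", awq: bool = False):
    """Load model and tokenizer as in the Quickstart; downloads weights when called."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo_id = exaone_deep_repo_id(size, awq)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        torch_dtype="auto",  # let transformers pick the dtype stored in the checkpoint
        trust_remote_code=True,
        device_map="auto",
    )
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    return model, tokenizer


print(exaone_deep_repo_id("32B"))
print(exaone_deep_repo_id("7.8B", awq=True))
```

`load_exaone_deep` is defined but not called here, since calling it downloads the weights and, for the AWQ variant, requires autoawq to be installed.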

GGUF

We provide weights in BF16 format and quantized weights in Q8_0, Q6_K, Q5_K_M, Q4_K_M, IQ4_XS.

The example below is for the 7.8B model in BF16 format. Please refer to the [EXAONE Deep...

Excerpt shown — open the source for the full document.

Notability

notability 6.0/10

New deep model repo from LG AI, moderate traction.