What does this model signal mean?

LG AI Research (EXAONE) published LGAI-EXAONE/EXAONE-4.5-33B. This model signal is evidence of what shipped on model infrastructure and how the release is positioned. High-signal details: license other · 119K HF downloads · LG's 33B-parameter large language model.. onlylabs links this event to 1 captured evidence page and 6 related model signals.

LG AI Research (EXAONE) Model: LGAI-EXAONE/EXAONE-4.5-33B

Captured source

source ↗

Hugging Face/huggingface.co/LGAI-EXAONE/EXAONE-4.5-33B

LGAI-EXAONE/EXAONE-4.5-33B model card

Source ↗

published Apr 4, 2026seen Jun 6captured Jun 11http 200method plaintask image-text-to-textlicense otherlibrary transformersparams 34Bdownloads 119klikes 178

EXAONE 4.5

We introduce EXAONE 4.5, the first open-weight vision language model developed by LG AI Research. Integrating a dedicated visual encoder into the existing EXAONE 4.0 framework, we expand the model's capability toward multimodality. EXAONE 4.5 features 33 billion parameters in total, including 1.2 billion parameters from the vision encoder. EXAONE 4.5 achieves competitive performance in general benchmark while outperforming SOTA models of similar size in document understanding and Korean contextual reasoning, inheriting powerful language capabilities from our previous language models.

For more details, please refer to the technical report, blog and GitHub.

Model Configuration

Model Type: Causal Language Model + Vision Encoder
Number of Parameters (Language Model): 31.7B
Number of Parameters (Vision Encoder): 1.29B
Hidden Dimension: 5,120
Intermediate size: 27,392
Number of Layers: 64 Main layers + 1 MTP layers
Hybrid Attention Pattern: 16 x (3 Sliding window attention + 1 Global attention)
Reordered Norm: Apply normalization after Attention/MLP, and before residual connection
Sliding Window Attention
Number of Attention Heads: 40 Q-heads and 8 KV-heads
Head Dimension: 128 for both Q/KV
Sliding Window Size: 4096
Global Attention
Number of Attention Heads: 40 Q-heads and 8 KV-heads
Head Dimension: 128 for both Q/KV
No Rotary Positional Embedding Used (NoPE)
Vision Encoder
Grouped Query Attention (GQA)
2D RoPE for vision embeddings
Vocab Size: 153,600
Context Length: 262,144 tokens
Knowledge Cutoff: Dec 2024 (2024/12)

Evaluation Results

Vision-Language Tasks

EXAONE 4.5 33B (Reasoning) GPT-5 mini (Reasoning: high) Qwen3-VL 32B Thinking Qwen3-VL 235B Thinking Qwen3.5 27B (Reasoning)

Architecture Dense - Dense MoE Dense

Total Params 33B - 33B 236B 27B

Active Params 33B - 33B 22B 27B

STEM / Puzzle

MMMU 78.7 79.0 78.1 80.6 82.3

MMMU-Pro 68.6 67.3 68.1 69.3 75.0

MedXpertQA-MM 42.1 34.4 41.6 47.6 62.4

MathVision 75.2 71.9 70.2 74.6 86.0

MathVista (mini) 85.0 79.1 85.9 85.8 87.8

WeMath 79.1 70.3 71.6 74.8 84.0

LogicVista 73.8 70.3 70.9 72.2 77.0

BabyVision 18.8 20.9 17.4 22.2 44.6

Document Understanding

AI2D 89.0 88.2 88.9 89.2 92.9

ChartQAPro 62.2 60.9 61.4 61.2 66.8

CharXiv (RQ) 71.7 68.6 65.2 66.1 79.5

OCRBench v2 63.2 55.8 68.4 66.8 67.3

OmniDocBench v1.5 81.2 77.0 83.1 84.5 88.9

General

MMStar 74.9 74.1 79.4 78.7 81.0

BLINK 68.8 67.7 68.5 67.1 71.6

HallusionBench 63.7 63.2 67.4 66.7 70.0

Korean

KMMMU 42.7 42.6 37.8 42.1 51.7

K-Viscuit 80.1 78.5 78.5 83.9 84.0

KRETA 91.9 94.8 90.3 92.8 96.5

Language-only Tasks

EXAONE 4.5 33B (Reasoning) GPT-5 mini (Reasoning: high) K-EXAONE 236B (Reasoning) Qwen3-VL 235B Thinking Qwen3.5 27B (Reasoning)

Architecture Dense - MoE MoE Dense

Total Params 33B - 236B 236B 27B

Active Params 33B - 23B 22B 27B

Reasoning

AIME 2025 92.9 91.1 92.8 89.7 93.5

AIME 2026 92.6 92.4 92.2 89.4 90.8

GPQA-Diamond 80.5 82.3 79.1 77.1 85.5

LiveCodeBench v6 81.4 78.1 80.7 70.1 80.7

MMLU-Pro 83.3 83.3 83.8 83.8 86.1

Agentic Tool Use

τ2-Bench (Retail) 77.9 78.3 78.6 67.0 84.7

τ2-Bench (Airline) 56.5 60.0 60.4 62.0 67.5

τ2-Bench (Telecom) 73.0 74.1 73.5 44.7 99.3

Instruction Following

IFBench 62.6 74.0 67.3 59.2 76.5

IFEval 89.6 92.8 89.7 88.2 95.0

Long Context Understanding

AA-LCR 50.6 68.0 53.5 58.7 67.3

Korean

KMMLU-Pro 67.6 72.5 67.3 71.1 73.0

KoBALT 52.1 63.6 61.8 51.1 54.9

Quickstart

Serving EXAONE 4.5

For better inference speed and memory usage, it is preferred to serve the model using optimized inference engines. The EXAONE 4.5 model is supported by various frameworks, including TensorRT-LLM, vLLM, SGLang, and llama.cpp. Support will be expanded in the future.

Practically, you can serve the EXAONE 4.5 model with 256K context length on single H200 GPU, or 4x A100-40GB GPUs by using a tensor-parallelism.

TensorRT-LLM

TensorRT-LLM provides zero day support for EXAONE 4.5. Transformers library of our fork is required to utilize EXAONE 4.5 model. You can install Transformers by running the following commands:

pip install git+https://github.com/nuxlear/transformers.git@add-exaone4_5

Please refer to the official installation guide, and EXAONE documentations, and EXAONE 4.5 PR for the detail.

After you install the TensorRT-LLM, you can launch the server with the following code snippet. You can remove unnecessary arguments from the snippet.

trtllm-serve LGAI-EXAONE/EXAONE-4.5-33B \
—tp_size 2 \
—port 8000 \
—reasoning_parser qwen3

An OpenAI-compatible API server will be available at http://localhost:8000/v1.

vLLM

Both Transformers and vLLM of our forks are required to utilize EXAONE 4.5 model. You can install the requirements by running the following commands:

uv pip install git+https://github.com/lkm2835/vllm.git@add-exaone4_5
uv pip install git+https://github.com/nuxlear/transformers.git@add-exaone4_5-v5.3.0.dev0

After you install the vLLM, you can launch the server with the following code snippet. You can remove unnecessary arguments from the snippet.

vllm serve LGAI-EXAONE/EXAONE-4.5-33B \
--served-model-name EXAONE-4.5-33B \
--port 8000 \
--tensor-parallel-size 2 \
--max-model-len 262144 \
--reasoning-parser qwen3 \
--enable-auto-tool-choice \
--tool-call-parser hermes \
--limit-mm-per-prompt '{"image": 64}' \
--speculative_config '{
"method": "mtp",
"num_speculative_tokens": 3
}'

An OpenAI-compatible API server will be available at http://localhost:8000/v1.

SGLang

Both Transformers and SGLang of our forks are required to utilize EXAONE 4.5 model. You can install the requirements by running the following commands:

uv pip install 'git+https://github.com/lkm2835/sglang.git@add-exaone4_5#subdirectory=python&egg=sglang[all]'
uv pip install git+https://github.com/nuxlear/transformers.git@add-exaone4_5-v5.3.0.dev0

After you install the SGLang, you can launch the server with the following code snippet. You can remove unnecessary arguments from the snippet.

Excerpt shown — open the source for the full document.

Notability

notability 8.0/10

High downloads from a major lab, notable model