ModelLG AI Research (EXAONE)LG AI Research (EXAONE)published Apr 4, 2026seen 5d

LGAI-EXAONE/EXAONE-4.5-33B

Open original ↗

Captured source

source ↗
published Apr 4, 2026seen 5dcaptured 10hhttp 200method plaintask image-text-to-textlicense otherlibrary transformersparams 34Bdownloads 45klikes 167

EXAONE 4.5

We introduce EXAONE 4.5, the first open-weight vision language model developed by LG AI Research. Integrating a dedicated visual encoder into the existing EXAONE 4.0 framework, we expand the model's capability toward multimodality. EXAONE 4.5 features 33 billion parameters in total, including 1.2 billion parameters from the vision encoder. EXAONE 4.5 achieves competitive performance in general benchmark while outperforming SOTA models of similar size in document understanding and Korean contextual reasoning, inheriting powerful language capabilities from our previous language models.

For more details, please refer to the technical report, blog and GitHub.

Model Configuration

  • Model Type: Causal Language Model + Vision Encoder
  • Number of Parameters (Language Model): 31.7B
  • Number of Parameters (Vision Encoder): 1.29B
  • Hidden Dimension: 5,120
  • Intermediate size: 27,392
  • Number of Layers: 64 Main layers + 1 MTP layers
  • Hybrid Attention Pattern: 16 x (3 Sliding window attention + 1 Global attention)
  • Reordered Norm: Apply normalization after Attention/MLP, and before residual connection
  • Sliding Window Attention
  • Number of Attention Heads: 40 Q-heads and 8 KV-heads
  • Head Dimension: 128 for both Q/KV
  • Sliding Window Size: 4096
  • Global Attention
  • Number of Attention Heads: 40 Q-heads and 8 KV-heads
  • Head Dimension: 128 for both Q/KV
  • No Rotary Positional Embedding Used (NoPE)
  • Vision Encoder
  • Grouped Query Attention (GQA)
  • 2D RoPE for vision embeddings
  • Vocab Size: 153,600
  • Context Length: 262,144 tokens
  • Knowledge Cutoff: Dec 2024 (2024/12)

Evaluation Results

Vision-Language Tasks

EXAONE 4.5 33B (Reasoning) GPT-5 mini (Reasoning: high) Qwen3-VL 32B Thinking Qwen3-VL 235B Thinking Qwen3.5 27B (Reasoning)

Architecture Dense - Dense MoE Dense

Total Params 33B - 33B 236B 27B

Active Params 33B - 33B 22B 27B

STEM / Puzzle

MMMU 78.7 79.0 78.1 80.6 82.3

MMMU-Pro 68.6 67.3 68.1 69.3 75.0

MedXpertQA-MM 42.1 34.4 41.6 47.6 62.4

MathVision 75.2 71.9 70.2 74.6 86.0

MathVista (mini) 85.0 79.1 85.9 85.8 87.8

WeMath 79.1 70.3 71.6 74.8 84.0

LogicVista 73.8 70.3 70.9 72.2 77.0

BabyVision 18.8 20.9 17.4 22.2 44.6

Document Understanding

AI2D 89.0 88.2 88.9 89.2 92.9

ChartQAPro 62.2 60.9 61.4 61.2 66.8

CharXiv (RQ) 71.7 68.6 65.2 66.1 79.5

OCRBench v2 63.2 55.8 68.4 66.8 67.3

OmniDocBench v1.5 81.2 77.0 83.1 84.5 88.9

General

MMStar 74.9 74.1 79.4 78.7 81.0

BLINK 68.8 67.7 68.5 67.1 71.6

HallusionBench 63.7 63.2 67.4 66.7 70.0

Korean

KMMMU 42.7 42.6 37.8 42.1 51.7

K-Viscuit 80.1 78.5 78.5 83.9 84.0

KRETA 91.9 94.8 90.3 92.8 96.5

Language-only Tasks

EXAONE 4.5 33B (Reasoning) GPT-5 mini (Reasoning: high) K-EXAONE 236B (Reasoning) Qwen3-VL 235B Thinking Qwen3.5 27B (Reasoning)

Architecture Dense - MoE MoE Dense

Total Params 33B - 236B 236B 27B

Active Params 33B - 23B 22B 27B

Reasoning

AIME 2025 92.9 91.1 92.8 89.7 93.5

AIME 2026 92.6 92.4 92.2 89.4 90.8

GPQA-Diamond 80.5 82.3 79.1 77.1 85.5

LiveCodeBench v6 81.4 78.1 80.7 70.1 80.7

MMLU-Pro 83.3 83.3 83.8 83.8 86.1

Agentic Tool Use

τ2-Bench (Retail) 77.9 78.3 78.6 67.0 84.7

τ2-Bench (Airline) 56.5 60.0 60.4 62.0 67.5

τ2-Bench (Telecom) 73.0 74.1 73.5 44.7 99.3

Instruction Following

IFBench 62.6 74.0 67.3 59.2 76.5

IFEval 89.6 92.8 89.7 88.2 95.0

Long Context Understanding

AA-LCR 50.6 68.0 53.5 58.7 67.3

Korean

KMMLU-Pro 67.6 72.5 67.3 71.1 73.0

KoBALT 52.1 63.6 61.8 51.1 54.9

Quickstart

Serving EXAONE 4.5

For better inference speed and memory usage, it is preferred to serve the model using optimized inference engines. The EXAONE 4.5 model is supported by various frameworks, including TensorRT-LLM, vLLM, SGLang, and llama.cpp. Support will be expanded in the future.

Practically, you can serve the EXAONE 4.5 model with 256K context length on single H200 GPU, or 4x A100-40GB GPUs by using a tensor-parallelism.

TensorRT-LLM

TensorRT-LLM provides zero day support for EXAONE 4.5. Transformers library of our fork is required to utilize EXAONE 4.5 model. You can install Transformers by running the following commands:

pip install git+https://github.com/nuxlear/transformers.git@add-exaone4_5

Please refer to the official installation guide, and EXAONE documentations, and EXAONE 4.5 PR for the detail.

After you install the TensorRT-LLM, you can launch the server with the following code snippet. You can remove unnecessary arguments from the snippet.

trtllm-serve LGAI-EXAONE/EXAONE-4.5-33B \
—tp_size 2 \
—port 8000 \
—reasoning_parser qwen3

An OpenAI-compatible API server will be available at http://localhost:8000/v1.

vLLM

Both Transformers and vLLM of our forks are required to utilize EXAONE 4.5 model. You can install the requirements by running the following commands:

uv pip install git+https://github.com/lkm2835/vllm.git@add-exaone4_5
uv pip install git+https://github.com/nuxlear/transformers.git@add-exaone4_5-v5.3.0.dev0

After you install the vLLM, you can launch the server with the following code snippet. You can remove unnecessary arguments from the snippet.

vllm serve LGAI-EXAONE/EXAONE-4.5-33B \
--served-model-name EXAONE-4.5-33B \
--port 8000 \
--tensor-parallel-size 2 \
--max-model-len 262144 \
--reasoning-parser qwen3 \
--enable-auto-tool-choice \
--tool-call-parser hermes \
--limit-mm-per-prompt '{"image": 64}' \
--speculative_config '{
"method": "mtp",
"num_speculative_tokens": 3
}'

An OpenAI-compatible API server will be available at http://localhost:8000/v1.

SGLang

Both Transformers and SGLang of our forks are required to utilize EXAONE 4.5 model. You can install the requirements by running the following commands:

uv pip install 'git+https://github.com/lkm2835/sglang.git@add-exaone4_5#subdirectory=python&egg=sglang[all]'
uv pip install git+https://github.com/nuxlear/transformers.git@add-exaone4_5-v5.3.0.dev0

After you install the SGLang, you can launch the server with the following code snippet. You can remove unnecessary arguments from the snippet.

Excerpt shown — open the source for the full document.

Notability

notability 8.0/10

High downloads from a major lab, notable model