Repo: LG AI Research (EXAONE), published Dec 15, 2025

LG-AI-EXAONE/EXAONE-Path-2.5

Language: Python

License: NOASSERTION

Stars: 5

Forks: 0

Open issues: 0

Created: 2025-12-15T09:37:18Z

Pushed: 2026-03-10T01:53:23Z

Default branch: main

Fork: no

Archived: no

README:

EXAONE Path 2.5

[`Github`] [`Hugging Face`] [`Paper`] [[Cite](#citation)]

Introduction

EXAONE Path 2.5 is a biologically informed multimodal framework that enriches histopathology representations by aligning whole-slide images with *genomic, epigenetic, and transcriptomic data*. By enabling all-pairwise cross-modal alignment across multiple layers of tumor biology, the model captures coherent genotype-to-phenotype relationships within a unified embedding space. This domain-informed design improves resource efficiency, enabling the model to achieve competitive performance across diverse tasks while using substantially fewer training samples and parameters than existing approaches.

Quickstart

Load EXAONE Path 2.5 and extract features.

### 1. Hardware Requirements

  • NVIDIA GPU with 12GB+ VRAM
  • NVIDIA driver version >= 525.60.13 required

Note: This implementation requires an NVIDIA GPU and drivers. The provided environment setup uses CUDA-enabled PyTorch, which makes an NVIDIA GPU mandatory for running the model.
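A quick way to compare a machine against the requirements above is sketched below. It is only an illustration, not part of the repository: the helper names are hypothetical, "12GB+" is read as 12 GiB (an assumption), and the values are obtained through the standard `nvidia-smi --query-gpu` interface when that tool is on the PATH.

```python
import shutil
import subprocess

MIN_DRIVER = (525, 60, 13)   # minimum driver version stated above
MIN_VRAM_MIB = 12 * 1024     # "12GB+" read as 12 GiB (an assumption)


def parse_version(text):
    """Turn a dotted version string such as '535.104.05' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def gpu_meets_requirements(driver_version, vram_mib):
    """Compare one GPU's driver version and memory against the stated minimums."""
    return parse_version(driver_version) >= MIN_DRIVER and vram_mib >= MIN_VRAM_MIB


def query_gpus():
    """Return [(driver_version, vram_mib), ...] from nvidia-smi, or [] if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    ).stdout
    gpus = []
    for line in out.splitlines():
        if "," not in line:
            continue
        driver, mem = (field.strip() for field in line.split(",", 1))
        if mem.isdigit():  # skip entries such as "[N/A]"
            gpus.append((driver, int(mem)))
    return gpus


if __name__ == "__main__":
    for driver, mem in query_gpus():
        print(driver, mem, "OK" if gpu_meets_requirements(driver, mem) else "below minimum")
```

On a machine without `nvidia-smi`, `query_gpus()` returns an empty list, which indicates that the NVIDIA requirement is not met.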

### 2. Environment Setup

First, install Micromamba if you haven't already; installation instructions are available in the Micromamba documentation. Then clone the repository, create and activate the environment, and install the dependencies from the provided requirements.txt:

git clone https://github.com/LG-AI-EXAONE/EXAONE-Path-2.5.git
cd EXAONE-Path-2.5
micromamba create -n exaonepath python=3.12
micromamba activate exaonepath
pip install -r requirements.txt
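After installation, a short check like the following can confirm that the environment is usable. This is a sketch, not a script shipped with the repository: the module list is taken from the imports used in the snippets of this README, and whether requirements.txt installs exactly these packages is an assumption.

```python
import importlib.util
import sys

# Top-level modules imported by the snippets in this README. Whether
# requirements.txt installs exactly these is an assumption.
REQUIRED_MODULES = ["torch", "torchvision", "transformers", "h5py", "PIL"]


def find_missing(modules):
    """Return the module names that cannot be located in the active environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


def python_matches(major, minor):
    """True if the running interpreter has the given major.minor version."""
    return sys.version_info[:2] == (major, minor)


if __name__ == "__main__":
    if not python_matches(3, 12):
        print("Warning: the environment above was created with python=3.12")
    missing = find_missing(REQUIRED_MODULES)
    print("Missing modules:", missing if missing else "none")
```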

### 3. Inference Workflow Overview

EXAONE Path 2.5 inference follows a two-stage pipeline. (1) Patch-level feature extraction: extract pretrained patch embeddings from either image patches or full WSIs. (2) Slide-level feature extraction: aggregate patch embeddings into slide representations aligned with genomics data. Sections 3.1 and 3.2 describe these steps in detail.

#### 3.1. Patch Feature Extraction

You can extract the pretrained patch features (without multimodal alignment) in two ways.

  • 3.1.1 (Tensor output): for rapid prototyping or custom pipelines
  • 3.1.2 (HDF5 file output): for full WSI processing, visualization, and downstream slide encoding

##### 3.1.1. Tensor output

Assuming you have an image, you can run the following code snippet to extract pretrained patch features.

import torch
from PIL import Image
from torchvision import transforms
from transformers import AutoModel

repo_id = "LGAI-EXAONE/EXAONE-Path-2.5"
device = "cuda"

# Input
png_path = "path/to/your/sample_patch.png"

# Load patch encoder
patch_encoder_model = AutoModel.from_pretrained(
    repo_id,
    component="patch",
    trust_remote_code=True,
).to(device).eval()

# Image preprocessing (must match patch encoder training)
transform = transforms.Compose([
    transforms.Resize(224),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

img = Image.open(png_path).convert("RGB")
image_tensor = transform(img).unsqueeze(0).to(device)  # [1, 3, 224, 224]

with torch.no_grad():
    patch_encoder_embedding = patch_encoder_model(image_tensor)  # [B=1, C]

Outputs

  • patch_encoder_embedding: a tensor of shape [B=1, C] where B is the batch size, and C is the embedding dimension
  • This tensor can be passed directly to the slide encoder in Section 3.2.
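To make the preprocessing in the snippet above more concrete, the following sketch reproduces its arithmetic in plain Python. The helper names are hypothetical, and the rounding rules mirror torchvision's behaviour as commonly documented (shorter side scaled to 224, centered crop, per-channel standardization); treat it as an illustration rather than a replacement for the transforms.

```python
def resize_shorter_side(width, height, size=224):
    """Output (width, height) when the shorter side is scaled to `size`."""
    if width <= height:
        return size, int(size * height / width)
    return int(size * width / height), size


def center_crop_box(width, height, crop=224):
    """(left, top, right, bottom) of a centered crop x crop window."""
    left = int(round((width - crop) / 2.0))
    top = int(round((height - crop) / 2.0))
    return left, top, left + crop, top + crop


def normalize_pixel(rgb, mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)):
    """Scale 8-bit RGB to [0, 1] (as ToTensor does), then standardize per channel."""
    return tuple((v / 255.0 - m) / s for v, m, s in zip(rgb, mean, std))


if __name__ == "__main__":
    # A square 256 x 256 patch is resized to 224 x 224, so the crop keeps all of it.
    w, h = resize_shorter_side(256, 256)
    print((w, h), center_crop_box(w, h))
    # A non-square 512 x 256 image loses its left and right margins in the crop.
    w, h = resize_shorter_side(512, 256)
    print((w, h), center_crop_box(w, h))
```

For square patches the center crop is a no-op, so the whole patch reaches the encoder; for non-square inputs, the margins along the longer side are discarded.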

##### 3.1.2. Full WSI patch-feature pipeline (HDF5 Output)

This step is broken into smaller steps.

(1) Generate patch coordinates (and contour indices) with `patchfy`, a Python function that you can import and call directly to:

  • segment tissue regions
  • extract patch coordinates
  • (optionally) write an HDF5 file with coords + contour_index

from exaonepath.patches import patchfy

wsi_path = "path/to/your/slide.svs" # .svs/.tif/.tiff/.ndpi/.mrxs/...
out_dir = "path/to/output_dir"

h5_path, coords, contour_idx = patchfy(
    wsi=wsi_path,
    out=out_dir,
    patch_size=256,
    step_size=256,
    patch_level=0,
    save_h5=True,
    save_mask=True,
    auto_skip=True,
)

Outputs

  • If save_h5=True, the patches are saved to:
  • /patches/.h5
  • If save_mask=True, a segmentation visualization is saved to:
  • /masks/.jpg
  • If the slide is skipped due to the segmentation safety cap, a reason is written to:
  • /skipped/.txt

Note: the effective patch_size/step_size written to the HDF5 may be MPP-normalized internally (see the patchfy docstring for details).
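The relationship between patch_size, step_size, and the resulting coordinates can be illustrated with a simple grid over a rectangular region. This is a hypothetical sketch, not the `patchfy` implementation: the real function also restricts patches to segmented tissue and may rescale sizes as noted above.

```python
def grid_coords(x0, y0, width, height, patch_size=256, step_size=256):
    """Top-left (x, y) of every patch that fits fully inside the given region."""
    coords = []
    for y in range(y0, y0 + height - patch_size + 1, step_size):
        for x in range(x0, x0 + width - patch_size + 1, step_size):
            coords.append((x, y))
    return coords


if __name__ == "__main__":
    # step_size == patch_size gives non-overlapping tiles.
    print(grid_coords(0, 0, 512, 512))
    # A smaller step_size gives overlapping tiles and therefore more patches.
    print(len(grid_coords(0, 0, 512, 512, step_size=128)))
```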

The returned arrays are:

  • coords: N x 2 int array of patch coordinates (x, y) in level-0 pixel space.
  • contour_idx: int array of length N holding the tissue contour index of each patch.
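Because coords and contour_idx are parallel arrays, patches can be grouped by tissue contour with a few lines of Python. The helpers below are hypothetical and work on plain lists as well as on array-like inputs; the box helper assumes the patch size is expressed in level-0 pixels.

```python
from collections import defaultdict


def group_by_contour(coords, contour_idx):
    """Map each contour index to the list of (x, y) patch coordinates it contains."""
    if len(coords) != len(contour_idx):
        raise ValueError("coords and contour_idx must have the same length")
    groups = defaultdict(list)
    for (x, y), contour in zip(coords, contour_idx):
        groups[int(contour)].append((int(x), int(y)))
    return dict(groups)


def patch_box(x, y, patch_size=256):
    """(left, top, right, bottom) of a patch; assumes patch_size is in level-0 pixels."""
    return (x, y, x + patch_size, y + patch_size)


if __name__ == "__main__":
    demo_coords = [(0, 0), (256, 0), (4096, 2048)]
    demo_contours = [0, 0, 1]
    for contour, points in group_by_contour(demo_coords, demo_contours).items():
        print(contour, [patch_box(x, y) for x, y in points])
```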

Useful parameters

  • seg_downsample (float): extra downsampling factor for segmentation only (speed vs accuracy).
  • max_seg_pixels (float): skip very large slides at the chosen segmentation level (set …

… "/patches/.h5" \
--out_h5_path "/patches/_features.h5" \
--batch_size_per_gpu 32

##### Notes
- `coords_h5_path` must be the H5 produced by `patchfy` (`save_h5=True`). The subsequent slide encoder requires `coords`, ideally with `contour_index`.
- The output file (`out_h5_path`) will contain: `features` [N, C], `coords` [N, 2], `contour_index` [N].
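Before passing the file to the slide encoder, it can be useful to verify that the datasets agree on N. The checker below is a hypothetical sketch that operates on a plain mapping from dataset name to shape, so it does not depend on any particular file library; `contour_index` is treated as optional, in line with the note above.

```python
def check_feature_shapes(shapes):
    """Return a list of problems found in a {name: shape} mapping (empty if none)."""
    problems = []
    features = shapes.get("features")
    coords = shapes.get("coords")
    contour = shapes.get("contour_index")
    if features is None or len(features) != 2:
        problems.append("features must be a 2-D dataset [N, C]")
        return problems
    n = features[0]
    if coords is None or tuple(coords) != (n, 2):
        problems.append(f"coords must have shape ({n}, 2)")
    if contour is not None and tuple(contour) != (n,):
        problems.append(f"contour_index must have shape ({n},)")
    return problems


if __name__ == "__main__":
    # The embedding dimension 8 is an arbitrary placeholder, not the model's C.
    print(check_feature_shapes({"features": (100, 8), "coords": (100, 2),
                                "contour_index": (100,)}))
    print(check_feature_shapes({"features": (100, 8), "coords": (99, 2)}))
```

With h5py, the mapping could be built as `{name: f[name].shape for name in f}` inside a `with h5py.File(path, "r") as f:` block, assuming every top-level entry in the file is a dataset.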

#### 3.2. Slide Feature Extraction
Patch features, coordinates, and (optionally) the contour index must be available. Use the code snippet below if patch feature extraction was performed as in 3.1.2. `patch_features_h5_path` should be identical to `out_h5_path` from the previous step.

import h5py
import torch
from transformers import AutoModel

device = "cuda"

# Load slide encoder (HF)
repo_id = "LGAI-EXAONE/EXAONE-Path-2.5"
slide_encoder = AutoModel.from_pretrained(
    repo_id,
    component="slide",
    trust_remote_code=True,
).to(device).eval()

# Load patch-level features exported as an HDF5 file
# Expected keys: features [N, C], coords [N, 2], contour_index [N]
patch_features_h5_path = "/patches/_features.h5"
with h5py.File(patch_features_h5_path, "r") as f:
    patch_features = torch.from_numpy(f["features"][:]).float()  # [N, C]
    patch_coords = torch.from_numpy(f["coords"][:]).long()
…
