Repo · ByteDance (Doubao/Seed) · published May 9, 2025

ByteDance-Seed/BM-code

Python


ByteDance-Seed/BM-code

Description: [Arxiv 2025] ByteMorph: Benchmarking Instruction-Guided Image Editing with Non-Rigid Motions

Language: Python

License: NOASSERTION

Stars: 45

Forks: 1

Open issues: 1

Created: 2025-05-09T04:57:47Z

Pushed: 2025-06-11T06:14:55Z

Default branch: main

Fork: no

Archived: no

README:

ByteMorph: Benchmarking Instruction-Guided Image Editing with Non-Rigid Motions

Di Chang1,2* · Mingdeng Cao1,3* · Yichun Shi1 · Bo Liu1,4 · Shengqu Cai1,5 · Shijie Zhou6

Weilin Huang1 · Gordon Wetzstein5 · Mohammad Soleymani2 · Peng Wang1

1ByteDance Seed 2University of Southern California 3University of Tokyo

4University of California Berkeley 5Stanford University 6University of California Los Angeles

* denotes equal contribution

This repo is the official PyTorch implementation of ByteMorph, including training, inference, and evaluation.

📢 News

📜 Requirements

  • An NVIDIA GPU with CUDA support is required for inference.
  • We have tested on a single A100 GPU and on a single H100 GPU.
  • In our experiments, we used CUDA 12.4.
  • Feel free to visit Flux.1-dev for further details on the environment.
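
Before installing anything, it can help to confirm that the machine actually exposes an NVIDIA GPU and a CUDA-enabled PyTorch build. The following is a minimal sketch, not part of this repository: the function name `describe_gpu_environment` is hypothetical, and it assumes `nvidia-smi` is on the `PATH` when a driver is installed. It reports what it finds and does not fail when a tool is missing.

```python
import importlib.util
import shutil
import subprocess


def describe_gpu_environment() -> dict:
    """Collect basic facts about the local GPU setup (hypothetical helper)."""
    info = {
        "nvidia_smi_found": False,
        "gpus": [],
        "torch_installed": False,
        "torch_cuda_available": None,
        "torch_cuda_version": None,
    }

    # Ask the NVIDIA driver tool for GPU names and memory, if it is installed.
    smi = shutil.which("nvidia-smi")
    if smi is not None:
        info["nvidia_smi_found"] = True
        try:
            result = subprocess.run(
                [smi, "--query-gpu=name,memory.total", "--format=csv,noheader"],
                capture_output=True, text=True, timeout=30, check=False,
            )
            if result.returncode == 0:
                info["gpus"] = [
                    line.strip() for line in result.stdout.splitlines() if line.strip()
                ]
        except (OSError, subprocess.TimeoutExpired):
            pass  # Leave the GPU list empty if the tool cannot be run.

    # Only import torch if it is installed, so the check also works beforehand.
    if importlib.util.find_spec("torch") is not None:
        try:
            import torch
            info["torch_installed"] = True
            info["torch_cuda_available"] = torch.cuda.is_available()
            info["torch_cuda_version"] = torch.version.cuda
        except Exception:
            pass  # A broken install is reported as "not installed".

    return info


if __name__ == "__main__":
    for key, value in describe_gpu_environment().items():
        print(f"{key}: {value}")
```

If `torch_cuda_available` is reported as `False` after installation, the installed PyTorch build probably does not match the local driver or CUDA setup.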

🛠️ Dependencies and Installation

Clone the repository:

git clone https://github.com/Boese0601/ByteMorph
cd ByteMorph

Installation Guide

We provide a requirements.txt file for setting up the environment.

Run the following commands in your terminal:

# 1. Prepare conda environment
conda create -n bytemorph python=3.10

# 2. Activate the environment
conda activate bytemorph

# 3. Install dependencies
bash env_install.sh

🧱 Download Pretrained Models

We follow the implementation details in our paper and release the pretrained weights of the Diffusion Transformer in this Hugging Face repository. After downloading, please put them under the [pretrained_weights](pretrained_weights/) folder.

The Flux.1-dev VAE and DiT can be found here. The Google-T5 encoder can be found here. The CLIP encoder can be found here.

Please place them under [./pretrained_weights/](pretrained_weights/).
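
The folder names used in this README (for example `models--black-forest-labs--FLUX.1-dev`) look like Hugging Face cache folder names, so one plausible way to fetch the public base models is `huggingface_hub.snapshot_download` with `cache_dir` pointed at `pretrained_weights`. The sketch below rests on that assumption. The repo ids are inferred from the folder names rather than taken from the original links, `cache_folder_name` and `download_all` are hypothetical helpers, and the T5 encoder and the ByteMorph DiT weights are left out because their repository ids are not stated here. FLUX.1-dev is a gated model, so you would likely need to accept its license on the Hub and log in first.

```python
from pathlib import Path

WEIGHTS_DIR = Path("pretrained_weights")

# ASSUMPTION: Hub repo ids inferred from the folder names in this README.
# Value = file patterns to fetch, or None to fetch every file in the repo.
ASSUMED_REPOS = {
    "black-forest-labs/FLUX.1-dev": ["flux1-dev.safetensors", "ae.safetensors"],
    "openai/clip-vit-large-patch14": None,
}


def cache_folder_name(repo_id: str) -> str:
    """Return the folder name the Hugging Face cache uses for a model repo."""
    return "models--" + repo_id.replace("/", "--")


def download_all(weights_dir: Path = WEIGHTS_DIR) -> None:
    """Download the assumed repos into weights_dir using the Hub cache layout."""
    # Imported lazily so the helper above works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    for repo_id, patterns in ASSUMED_REPOS.items():
        snapshot_download(
            repo_id=repo_id,
            cache_dir=weights_dir,
            allow_patterns=patterns,
        )
        print(f"fetched {repo_id} into {weights_dir / cache_folder_name(repo_id)}")


# To run the download (needs network access and Hub credentials):
# download_all()
```

With this approach the files end up in a `snapshots/` subdirectory of each `models--...` folder, so check the paths in the inference config before running.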

Your file structure should look like this:

ByteMorph
|----...
|----pretrained_weights
  |----models--black-forest-labs--FLUX.1-dev
    |----flux1-dev.safetensors
    |----ae.safetensors
    |----...
  |----models--xlabs-ai--xflux
    |----...
  |----models--openai--clip-vit-large-patch14
    |----...
  |----ByteMorpher
    |----dit.safetensors
    |----...
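
A quick check that the weights are where the tree above expects them can save a failed training or inference launch. The following is a minimal sketch under stated assumptions: `find_missing` and `EXPECTED` are hypothetical names, the four model folders are assumed to sit directly under `pretrained_weights`, and the named files are searched for recursively because cache-style folders may nest them in subdirectories.

```python
from pathlib import Path

# Folder under pretrained_weights -> files expected somewhere inside it.
# ASSUMPTION: the nesting is inferred from the tree shown in this README.
EXPECTED = {
    "models--black-forest-labs--FLUX.1-dev": ["flux1-dev.safetensors", "ae.safetensors"],
    "models--xlabs-ai--xflux": [],
    "models--openai--clip-vit-large-patch14": [],
    "ByteMorpher": ["dit.safetensors"],
}


def find_missing(weights_dir) -> list:
    """Return the expected folders and files that are absent under weights_dir."""
    root = Path(weights_dir)
    missing = []
    for folder, files in EXPECTED.items():
        folder_path = root / folder
        if not folder_path.is_dir():
            missing.append(folder)
            continue
        for name in files:
            # Recursive search: the file may live in a nested subdirectory.
            if next(folder_path.rglob(name), None) is None:
                missing.append(f"{folder}/{name}")
    return missing


if __name__ == "__main__":
    problems = find_missing("pretrained_weights")
    if problems:
        print("Missing entries:")
        for entry in problems:
            print(f"  {entry}")
    else:
        print("All expected weight folders and files were found.")
```

Run it from the repository root so that the relative `pretrained_weights` path resolves correctly.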

Training and Inference

Using Command Line

cd ByteMorph
# Train
bash scripts/train/train.sh

# Inference
bash scripts/test/inference.sh

The config files for training and inference can be found in [this file](train_configs/train.yaml) and [this file](inference_configs/inference.yaml), respectively.

The DeepSpeed config file for training is [here](train_configs/deepspeed_stage2.yaml).

Evaluation

Please visit [this page](./ByteMorph-Eval/).

🔗 BibTeX Citation

If you find ByteMorph useful for your research and applications, please cite ByteMorph using this BibTeX:

@article{chang2025bytemorph,
title={ByteMorph: Benchmarking Instruction-Guided Image Editing with Non-Rigid Motions},
author={Chang, Di and Cao, Mingdeng and Shi, Yichun and Liu, Bo and Cai, Shengqu and Zhou, Shijie and Huang, Weilin and Wetzstein, Gordon and Soleymani, Mohammad and Wang, Peng},
journal={arXiv preprint arXiv:2506.03107},
year={2025}
}

License

This code is distributed under the FLUX.1-dev Non-Commercial License. See LICENSE.txt file for more information.

Acknowledgement

We would like to thank the contributors to Flux.1-dev, x-flux, and OminiControl for their open-source research.

Disclaimer

Your access to and use of this dataset are at your own risk. We do not guarantee the accuracy of this dataset. The dataset is provided “as is” and we make no warranty or representation to you with respect to it and we expressly disclaim, and hereby expressly waive, all warranties, express, implied, statutory or otherwise. This includes, without limitation, warranties of quality, performance, merchantability or fitness for a particular purpose, non-infringement, absence of latent or other defects, accuracy, or the presence or absence of errors, whether or not known or discoverable. In no event will we be liable to you on any legal theory (including, without limitation, negligence) or otherwise for any direct, special, indirect, incidental, consequential, punitive, exemplary, or other losses, costs, expenses, or damages arising out of this public license or use of the licensed material. The disclaimer of warranties and limitation of liability provided above shall be interpreted in a manner that, to the extent possible, most closely approximates an absolute disclaimer and waiver of all liability.
