Repo · Tencent Hunyuan · published May 10, 2024

Tencent-Hunyuan/HunyuanDiT

Jupyter Notebook


Description: Hunyuan-DiT : A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

Language: Jupyter Notebook

License: NOASSERTION

Stars: 4292

Forks: 360

Open issues: 125

Created: 2024-05-10T08:47:15Z

Pushed: 2025-11-27T06:37:00Z

Default branch: main

Fork: no

Archived: no

README:

Hunyuan-DiT : A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

-----

This repo contains PyTorch model definitions, pre-trained weights and inference/sampling code for our paper exploring Hunyuan-DiT. You can find more visualizations on our project page.

> **Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding**

> **DialogGen: Multi-modal Interactive Dialogue System for Multi-turn Text-to-Image Generation**

🔥🔥🔥 News!!

  • Dec 17, 2024: 🎉 Optimized LoRA training with refined gradient checkpointing and a low-bit optimizer. Pass --lowbit-opt to get started.
  • Sep 13, 2024: 🎉 IPAdapter is officially supported by HunYuanDiT. See [./ipadapter](./ipadapter) for documentation. Scaled attention now replaces flash attention on V100 GPUs.
  • Aug 26, 2024: 🎉 HunYuanDiT ControlNet and LoRA are officially supported by ComfyUI. See [./comfyui](./comfyui) for documentation.
  • Jul 15, 2024: 🚀 HunYuanDiT and Shakker.Ai have jointly launched a fine-tuning event based on the HunYuanDiT 1.2 model. By publishing a LoRA or fine-tuned model based on HunYuanDiT, you can earn a bonus of up to $230 from Shakker.Ai. See Shakker.Ai for more details.
  • Jul 15, 2024: 🎉 Updated ComfyUI to support standardized workflows and compatibility with weights from the t2i module and LoRA training for versions 1.1/1.2, as well as those trained by Kohya or the official script.
  • Jul 15, 2024: ⚡ We offer Docker environments for CUDA 11/12, allowing you to bypass complex installations and get started with a single click! See [dockers](#installation-guide-for-linux) for details.
  • Jul 08, 2024: 🎉 HYDiT-v1.2 is released. Please check HunyuanDiT-v1.2 and Distillation-v1.2 for more details.
  • Jul 03, 2024: 🎉 The Kohya-hydit version is now available for v1.1 and v1.2 models, with a GUI for inference. The official Kohya version is under review. See [kohya](./kohya_ss-hydit) for details.
  • Jun 27, 2024: 🎨 Hunyuan-Captioner is released, providing fine-grained captions for training data. See [mllm](./mllm) for details.
  • Jun 27, 2024: 🎉 LoRA and ControlNet are supported in diffusers. See [diffusers](./diffusers) for details.
  • Jun 27, 2024: 🎉 Inference scripts for GPUs with 6GB of VRAM are released. See [lite](./lite) for details.
  • Jun 19, 2024: 🎉 ControlNet is released, supporting canny, pose and depth control. See [training/inference code](#controlnet) for details.
  • Jun 13, 2024: ⚡ HYDiT-v1.1 is released, which mitigates image oversaturation and alleviates the watermark issue. Please check HunyuanDiT-v1.1 and Distillation-v1.1 for more details.
  • Jun 13, 2024: 🚚 The training code is released, offering [full-parameter training](#full-parameter-training) and [LoRA training](#lora).
  • Jun 06, 2024: 🎉 Hunyuan-DiT is now available in ComfyUI. Please check [ComfyUI](#using-comfyui) for more details.
  • Jun 06, 2024: 🚀 We introduce a Distillation version for Hunyuan-DiT acceleration, which achieves 50% acceleration on NVIDIA GPUs. Please check Distillation for more details.
  • Jun 05, 2024: 🤗 Hunyuan-DiT is now available in 🤗 Diffusers! Please check the [example](#using--diffusers) below.
  • Jun 04, 2024: 🌐 Tencent Cloud links are now supported for downloading the pretrained models! Please check the [links](#-download-pretrained-models) below.
  • May 22, 2024: 🚀 We introduce a TensorRT version for Hunyuan-DiT acceleration, which achieves 47% acceleration on NVIDIA GPUs. Please check TensorRT-libs for instructions.
  • May 22, 2024: 💬 We now support a demo for multi-turn text2image generation. Please check the [script](#using-gradio) below.
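The Jun 05, 2024 entry notes that Hunyuan-DiT is available in 🤗 Diffusers. As a rough illustration only, loading it might look like the sketch below. The `HunyuanDiTPipeline` class comes from the Diffusers library. The model id, the float16 dtype, the `cuda` device and the `generate` helper are assumptions that do not appear in this excerpt, so check them against the full README before relying on them.

```python
from typing import Any

# Assumption: Hugging Face model id for the v1.2 Diffusers weights (not stated in this excerpt).
MODEL_ID = "Tencent-Hunyuan/HunyuanDiT-v1.2-Diffusers"


def generate(prompt: str, model_id: str = MODEL_ID, device: str = "cuda") -> Any:
    """Hypothetical helper: load the pipeline and return one generated PIL image."""
    # Imported lazily so this module can be imported on machines without torch/diffusers.
    import torch
    from diffusers import HunyuanDiTPipeline

    # Assumption: half precision to reduce GPU memory use.
    pipe = HunyuanDiTPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    pipe.to(device)

    # The pipeline accepts Chinese or English prompts; .images is a list of PIL images.
    return pipe(prompt=prompt).images[0]
```

Calling `generate("画一只穿着西装的猪").save("pig_in_suit.png")` would download the weights on first use and needs a CUDA-capable GPU.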

🤖 Try it on the web

Welcome to our web-based **Tencent Hunyuan Bot**, where you can explore our innovative products! Just input the suggested prompts below or any other imaginative prompts containing drawing-related keywords to activate the Hunyuan text-to-image generation feature. Unleash your creativity and create any picture you desire, all for free!

You can use simple prompts similar to natural language text

> 画一只穿着西装的猪
>
> draw a pig in a suit
>
> 生成一幅画,赛博朋克风,跑车
>
> generate a painting, cyberpunk style, sports car

or multi-turn language interactions to create the picture.

> 画一个木制的鸟
>
> draw a wooden bird
>
> 变成玻璃的
>
> turn into glass

🤗 Community Contribution Leaderboard

1. By @TTPlanetPig

  • HunyuanDIT_v1.2 ControlNet models
      • Inpaint controlnet: https://huggingface.co/TTPlanet/HunyuanDiT_Controlnet_inpainting
      • Tile controlnet: https://huggingface.co/TTPlanet/HunyuanDiT_Controlnet_tile
      • Lineart controlnet: https://huggingface.co/TTPlanet/HunyuanDiT_Controlnet_lineart
  • HunyuanDIT_v1.2 ComfyUI nodes
      • Comfyui_TTP_CN_Preprocessor: https://github.com/TTPlanetPig/Comfyui_TTP_CN_Preprocessor
      • Comfyui_TTP_Toolset: https://github.com/TTPlanetPig/Comfyui_TTP_Toolset

2. By @sdbds (bilibili up 青龙圣者)

  • Kohya_ss-hydit train tools: https://github.com/zml-ai/HunyuanDIT-PRE/tree/main/kohya_ss-hydit

3. By @CrazyBoyM (bilibili up 飞鸟白菜)

  • ComfyUI support for HunyuanDIT_v1.2 Controlnet: https://github.com/comfyanonymous/ComfyUI/pull/4245

4. By @L_A_X

  • HunyuanDIT_v1.2 base model for anime
      • Original hf: https://huggingface.co/Laxhar/Freeway_Animation_HunYuan_Demo
      • Converted ComfyUI model:…
