Repo: StepFun (published May 13, 2025)

stepfun-ai/Step1X-3D

Python


Description: Step1X-3D: Towards High-Fidelity and Controllable Generation of Textured 3D Assets

Language: Python

License: Apache-2.0

Stars: 869

Forks: 59

Open issues: 40

Created: 2025-05-13T03:42:54Z

Pushed: 2025-09-08T22:32:12Z

Default branch: main

Fork: no

Archived: no

README:

Step1X-3D: Towards High-Fidelity and Controllable Generation of Textured 3D Assets

Step1X-3D demonstrates the capability to generate 3D assets with high-fidelity geometry and versatile texture maps, while maintaining exceptional alignment between surface geometry and texture mapping. From left to right, we sequentially present: the base geometry (untextured), followed by cartoon-style, sketch-style, and photorealistic 3D asset generation results.

🔥🔥🔥 Latest News!!

  • June 26, 2025: 👋 We released the data preprocessing code for the shape VAE and diffusion model, including an advanced watertight conversion method using depth_test and winding_number, as proposed by CraftsMan3D, in "Step1X-3D/data/watertight_and_sampling.py"!
  • June 9, 2025: 👋 We released the multi-view rendering code for training the texture generation model, in "Step1X-3D/data/ig2mv/render"!
  • May 27, 2025: 👋 We released the multi-view generation model with a texture sync module!
  • May 13, 2025: 👋 The Step1X-3D online demo is available on Hugging Face. Enjoy generating your own 3D assets! Huggingface web live
  • May 13, 2025: 👋 We released the 800K UIDs of high-quality 3D assets (excluding self-collected assets) obtained with our rigorous data curation pipeline, for training both 3D geometry generation and texture synthesis. Huggingface dataset
  • May 13, 2025: 👋 We also released the training code for both Step1X-3D geometry generation and texture synthesis.
  • May 13, 2025: 👋 We released the inference code and model weights of Step1X-3D geometry and Step1X-3D texture.
  • May 13, 2025: 👋 We released the Step1X-3D technical report as open source.

📑 Open-source Plan

  • [x] Technical report
  • [x] Inference code & model weights
  • [x] Training code
  • [x] UIDs of high-quality 3D assets
  • [x] Online demo (gradio deployed on huggingface)
  • [x] Mesh preprocessing (including watertight using depth_test and winding_number, sampling)
  • [ ] More controllable models, e.g., conditioned on multi-view images, bounding boxes, and skeletons
  • [ ] ComfyUI
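The mesh preprocessing item above mentions a winding_number test. As a rough illustration of the idea only, and not the code in "Step1X-3D/data/watertight_and_sampling.py", the following minimal sketch computes the generalized winding number of a query point with respect to a triangle mesh by summing signed solid angles. All function names and the toy tetrahedron are assumptions made for this example.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _norm(a):
    return math.sqrt(_dot(a, a))

def solid_angle(a, b, c):
    """Signed solid angle of triangle (a, b, c) as seen from the origin.

    Uses the Van Oosterom-Strackee formula:
    tan(omega / 2) = det[a b c] / (|a||b||c| + (a.b)|c| + (b.c)|a| + (c.a)|b|)
    """
    la, lb, lc = _norm(a), _norm(b), _norm(c)
    numerator = _dot(a, _cross(b, c))
    denominator = la * lb * lc + _dot(a, b) * lc + _dot(b, c) * la + _dot(c, a) * lb
    return 2.0 * math.atan2(numerator, denominator)

def winding_number(point, vertices, faces):
    """Sum of signed solid angles over all faces, divided by 4*pi."""
    total = 0.0
    for i, j, k in faces:
        # Shift the triangle so that the query point sits at the origin.
        a, b, c = (_sub(vertices[n], point) for n in (i, j, k))
        total += solid_angle(a, b, c)
    return total / (4.0 * math.pi)

# Hypothetical toy mesh: a closed tetrahedron with outward-facing triangles.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
faces = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]

w_inside = winding_number((0.1, 0.1, 0.1), vertices, faces)
w_outside = winding_number((2.0, 2.0, 2.0), vertices, faces)
is_inside = w_inside > 0.5  # a common thresholding choice, assumed here
```

For a closed, consistently oriented mesh the value is close to 1 inside and close to 0 outside. On meshes with holes or flipped faces it varies smoothly between those values, which is why thresholding it is a popular way to decide occupancy before extracting a watertight surface.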

1. Introduction

While generative artificial intelligence has advanced significantly across the text, image, audio, and video domains, 3D generation remains comparatively underdeveloped due to fundamental challenges such as data scarcity, algorithmic limitations, and ecosystem fragmentation. To this end, we present Step1X-3D, an open framework that addresses these challenges through: (1) a rigorous data curation pipeline that processes >5M assets to create a 2M high-quality dataset with standardized geometric and textural properties; (2) a two-stage 3D-native architecture combining a hybrid VAE-DiT geometry generator with an SD-XL-based texture synthesis module; and (3) the full open-source release of models, training code, and adaptation modules. For geometry generation, the hybrid VAE-DiT component produces watertight TSDF representations by employing perceiver-based latent encoding with sharp edge sampling for detail preservation. The SD-XL-based texture synthesis module then ensures cross-view consistency through geometric conditioning and latent-space synchronization. Benchmark results demonstrate state-of-the-art performance that exceeds existing open-source methods while achieving quality competitive with proprietary solutions. Notably, the framework bridges the 2D and 3D generation paradigms by supporting direct transfer of 2D control techniques (e.g., LoRA) to 3D synthesis. By simultaneously advancing data quality, algorithmic fidelity, and reproducibility, Step1X-3D aims to establish new standards for open research in controllable 3D asset generation.
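To make the term "TSDF" in the paragraph above concrete, here is a minimal sketch of a truncated signed distance function sampled on a regular grid. It uses an analytic sphere rather than a real asset, and the sign convention (negative inside), the truncation band, and all names are assumptions for illustration; it is not the representation code used by Step1X-3D. It requires Python 3.8+ for `math.dist`.

```python
import math

def sphere_sdf(point, center=(0.0, 0.0, 0.0), radius=0.5):
    """Signed distance to a sphere: negative inside, positive outside."""
    return math.dist(point, center) - radius

def truncate(distance, band=0.1):
    """Clamp a signed distance to [-band, +band], which is what makes it a TSDF."""
    return max(-band, min(band, distance))

def tsdf_grid(resolution=8, band=0.1):
    """Sample the truncated SDF on a regular grid over the cube [-1, 1]^3."""
    step = 2.0 / (resolution - 1)
    coords = [-1.0 + i * step for i in range(resolution)]
    return {
        (x, y, z): truncate(sphere_sdf((x, y, z)), band)
        for x in coords for y in coords for z in coords
    }

grid = tsdf_grid()
center_value = truncate(sphere_sdf((0.0, 0.0, 0.0)))  # deep inside: clamped to -band
corner_value = grid[(-1.0, -1.0, -1.0)]               # far outside: clamped to +band
```

Only values near the surface carry distance information; everything farther away saturates at the band limits. The surface itself is the zero level set, which for a watertight shape separates inside from outside without gaps.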

2. Model Downloads

| Model | Download link | Size | Update date |
|-----------------------------|-------------------------------|------------|------|
| Step1X-3D-geometry | 🤗 Huggingface | 1.3B | 2025-05-13 |
| Step1X-3D-geometry-label | 🤗 Huggingface | 1.3B | 2025-05-13 |
| Step1X-3D Texture | 🤗 Huggingface | 3.5B | 2025-05-13 |
| Models in ModelScope | 🤖 ModelScope | 6.1B | 2025-05-14 |

3. Open Filtered High-Quality Datasets

| Data source | Download link | Size | Update date |
|-----------------------------|-------------------------------|------------|------|
| Objaverse | 🤗 Huggingface | 320K | 2025-05-13 |
| Objaverse-XL | 🤗 Huggingface | 480K | 2025-05-13 |
| Assets for texture synthesis | 🤗 Huggingface | 30K | 2025-05-13 |
| Assets in ModelScope | 🤖 ModelScope | 830K | 2025-05-14 |

Given the high-quality 3D assets above, you can follow the methods from Dora to preprocess the data for VAE and 3D DiT training, and those from MV-Adapter for ig2mv training.

4. Dependencies and Installation

Following the instructions below sets up an environment suitable for both training and inference.

4.1 Clone the repo

git clone --depth 1 --branch main https://github.com/stepfun-ai/Step1X-3D.git
cd Step1X-3D

> A shallow clone is faster and does not require pulling the gh-pages branch.
>
> Use the git fetch --unshallow command to convert a shallow clone to a full clone.
>
> Use the git config remote.origin.fetch '+refs/heads/*:refs/remotes/origin/*' command to fetch all branches.

4.2 Create a new conda environment

conda create -n…
