amazon-science/Spherical_Diffusion_Policy

Description: [ICML 2025] Official implementation of Spherical Diffusion Policy: A SE(3) Equivariant Visuomotor Policy with Spherical Fourier Representation

Language: Python

License: MIT

Stars: 43

Forks: 7

Open issues: 3

Created: 2025-07-01T15:22:16Z

Pushed: 2025-07-08T15:51:59Z

Default branch: main

Fork: no

Archived: no

README:

Spherical Diffusion Policy

By Xupeng Zhu, Fan Wang, Robin Walters, and Jane Shi

Official implementation for **Spherical Diffusion Policy: A SE(3) Equivariant Visuomotor Policy with Spherical Fourier Representation**, to appear at ICML 2025.

Arxiv | 5min summary video | OpenReview

![](image/SDP.png)

Spherical Diffusion Policy (SDP) is a SE(3) equivariant and T(3) invariant visuomotor policy that leverages spherical Fourier representations to achieve strong 3D generalization in robotic manipulation tasks. SDP introduces three key components:

1. Spherical Fourier Representations for encoding the robot's state and actions with continuous rotational equivariance.
2. Spherical FiLM Conditioning to inject scene embeddings from the vision encoder into the denoising process in an equivariant manner.
3. Spherical Denoising Temporal Unet (SDTU) that supports spatiotemporal equivariant denoising of trajectories.
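As a rough illustration of what rotational equivariance means (this is not the SDP architecture), the sketch below checks numerically that a toy map f satisfies f(Rx) = R f(x) for a rotation R. The names `rot_z`, `rot_x`, `apply`, and `toy_policy`, and the map itself, are assumptions invented for this example; SDP's layers operate on spherical Fourier features rather than raw 3D vectors.

```python
import math

def rot_z(t):
    """Rotation by angle t (radians) about the z axis."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):
    """Rotation by angle t (radians) about the x axis."""
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(rot, v):
    """Apply a 3x3 rotation matrix to a 3-vector."""
    return [sum(rot[i][k] * v[k] for k in range(3)) for i in range(3)]

def toy_policy(v):
    """Hypothetical equivariant map: scale a vector by its rotation-invariant norm."""
    norm = math.sqrt(sum(c * c for c in v))
    return [norm * c for c in v]

R = matmul(rot_z(0.7), rot_x(-1.2))   # an arbitrary 3D rotation
x = [0.3, -1.1, 2.0]                  # an arbitrary input vector

lhs = toy_policy(apply(R, x))         # rotate the input, then apply the map
rhs = apply(R, toy_policy(x))         # apply the map, then rotate the output
equivariance_error = max(abs(a - b) for a, b in zip(lhs, rhs))
print(equivariance_error < 1e-9)
```

The check passes because the scaling factor depends only on the vector's norm, which a rotation leaves unchanged. Modulating a feature by rotation-invariant quantities is one simple way to keep a map equivariant.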

Our method generalizes well across diverse 3D scene configurations and is benchmarked on 20 simulation tasks using MimicGen and 5 physical single-arm or bimanual robot tasks, consistently outperforming strong baselines like EquiDiff, DiffPo, and ACT.

This repository includes code for:

  • Benchmarking SDP on the MimicGen suite with SE(3) randomized tasks.
  • Training and evaluation scripts for all simulation benchmarks.
  • Dataset generation and preprocessing utilities.

[//]: # (If you find this work helpful, please consider citing our paper (citation to be added upon publication).)

---

Step 1: Installation

1. Install the following apt packages for mujoco:

sudo apt install -y libosmesa6-dev libgl1-mesa-glx libglfw3 patchelf

1. Install gfortran (dependency for escnn)

sudo apt install -y gfortran

1. Install Mambaforge (recommended) or Anaconda

1. Clone this repo

git clone https://github.com/amazon-science/Spherical_Diffusion_Policy.git sdp
cd sdp

1. Install environment:

mamba env create -f conda_environment.yaml
conda activate sdp

or:

conda env create -f conda_environment.yaml
conda activate sdp

1. Force reinstall lie-learn (due to a known issue)

pip uninstall lie-learn
pip install git+https://github.com/AMLab-Amsterdam/lie_learn@07469085ac0fd4550fd26ff61cb10bb1e92cead1

1. Install mimicgen:

cd ..
git clone https://github.com/NVlabs/mimicgen_environments.git
cd mimicgen_environments
git checkout 45db4b35a5a79e82ca8a70ce1321f855498ca82c
pip install -e .
cd ../sdp

1. Make sure mujoco version is 2.3.2 (required by mimicgen)

pip list | grep mujoco
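The same version check can be done from Python inside the activated environment. This is a minimal sketch using the standard-library `importlib.metadata`; it assumes the installed distribution is named `mujoco` (the name the `grep` above searches for), and `check_version` is a helper name made up for this example.

```python
from importlib import metadata

REQUIRED_MUJOCO = "2.3.2"  # version this README states is required by mimicgen

def check_version(package, expected):
    """Return 'ok', 'mismatch', or 'missing' for an installed distribution."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return "missing"
    return "ok" if installed == expected else "mismatch"

if __name__ == "__main__":
    # Assumption: the MuJoCo bindings are installed under the name "mujoco".
    print("mujoco:", check_version("mujoco", REQUIRED_MUJOCO))
```

A result of `mismatch` or `missing` means the environment should be fixed before generating data or training.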

Step 2: Preparing Dataset

Download the datasets for 12 MimicGen tasks (with SE(2) initialization; _d0, _d1, and _d2 tasks):

# Download all datasets
python sdp/scripts/download_datasets.py --tasks stack_d1 stack_three_d1 square_d2 threading_d2 coffee_d2 three_piece_assembly_d2 hammer_cleanup_d1 mug_cleanup_d1 kitchen_d1 nut_assembly_d0 pick_place_d0 coffee_preparation_d1
# Alternatively, download one (or several) datasets of interest, e.g.,
python sdp/scripts/download_datasets.py --tasks stack_d1

[Optional] Prepare the tasks and generate datasets for 8 MimicGen tasks with SE(3) initialization (_d3 and _d4 tasks):

Clone the custom repositories:

git clone https://github.com/ZXP-S-works/robosuite.git -b se3
git clone https://github.com/ZXP-S-works/robomimic.git -b for_mimicgen
git clone https://github.com/ZXP-S-works/mimicgen.git -b for_mimicgen

Go to each folder and install all of them:

pip install -e .

To generate demonstrations with image observations, follow: https://mimicgen.github.io/docs/tutorials/reproducing_experiments.html

Generating Point Cloud and Voxel Observation

# Template
python sdp/scripts/dataset_states_to_obs.py --input data/robomimic/datasets/${dataset}/${dataset}.hdf5 --output data/robomimic/datasets/${dataset}/${dataset}_pc.hdf5 --num_workers=12
# Replace ${dataset} and the --num_workers value with your choices.
# E.g., use 24 workers to generate point cloud and voxel observation for stack_d1
python sdp/scripts/dataset_states_to_obs.py --input data/robomimic/datasets/stack_d1/stack_d1.hdf5 --output data/robomimic/datasets/stack_d1/stack_d1_pc.hdf5 --num_workers=24
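When several datasets need the same treatment, the template above can be filled in programmatically. The sketch below only builds the command lines, following the path layout and flags shown in the template; `states_to_obs_command` and `DATA_ROOT` are names invented for this example, not part of the repository.

```python
from pathlib import Path

DATA_ROOT = Path("data/robomimic/datasets")  # layout used in the template command

def states_to_obs_command(dataset, num_workers=12):
    """Build the argv list that generates point cloud and voxel observations for one dataset."""
    raw = DATA_ROOT / dataset / f"{dataset}.hdf5"
    out = DATA_ROOT / dataset / f"{dataset}_pc.hdf5"
    return [
        "python", "sdp/scripts/dataset_states_to_obs.py",
        "--input", raw.as_posix(),
        "--output", out.as_posix(),
        f"--num_workers={num_workers}",
    ]

# Print one command per dataset; nothing is executed here.
for name in ["stack_d1", "square_d2"]:
    print(" ".join(states_to_obs_command(name, num_workers=24)))
```

To actually run a command, the returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.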

Convert Action Space in Dataset

The downloaded dataset has a relative action space. To train with an absolute action space, the dataset needs to be converted accordingly.

# Template
python sdp/scripts/robomimic_dataset_conversion.py -i data/robomimic/datasets/${dataset}/${dataset}.hdf5 -o data/robomimic/datasets/${dataset}/${dataset}_abs.hdf5 -n 12
# Replace ${dataset} and the -n worker count with your choices.
# E.g., convert stack_d1_pc with 12 workers
python sdp/scripts/robomimic_dataset_conversion.py -i data/robomimic/datasets/stack_d1/stack_d1_pc.hdf5 -o data/robomimic/datasets/stack_d1/stack_d1_pc_abs.hdf5 -n 12
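To make the distinction between the two action spaces concrete, here is a toy sketch that turns relative (per-step delta) position actions into absolute target positions. It is an illustration under strong simplifying assumptions: positions only, no orientation or gripper, and every target reached exactly. How the repository's conversion script computes its targets is not shown in this README, and `relative_to_absolute` is a hypothetical helper.

```python
def relative_to_absolute(start, deltas):
    """Accumulate per-step position deltas into absolute target positions.

    Toy assumption: each target is reached exactly, so the next delta is
    applied on top of the previous target.
    """
    targets = []
    current = list(start)
    for delta in deltas:
        current = [c + d for c, d in zip(current, delta)]
        targets.append(current)
    return targets

start = [0.0, 0.0, 1.0]                     # initial end-effector position
deltas = [[1.0, 0.0, 0.0],                  # relative actions: move by this much
          [0.0, 2.0, 0.0],
          [0.0, 0.0, -1.0]]
absolute = relative_to_absolute(start, deltas)
print(absolute)  # [[1.0, 0.0, 1.0], [1.0, 2.0, 1.0], [1.0, 2.0, 0.0]]
```

A relative action says how far to move from the current pose, while an absolute action names the pose to reach in a fixed frame.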

[//]: # (python sdp/scripts/dataset_states_to_obs.py --input data/robomimic/datasets/${dataset}/${dataset}.hdf5 --output data/robomimic/datasets/${dataset}/${dataset}_pc.hdf5 --num_workers=12 && python sdp/scripts/robomimic_dataset_conversion.py -i data/robomimic/datasets/${dataset}/${dataset}.hdf5 -o data/robomimic/datasets/${dataset}/${dataset}_abs.hdf5 -n 12)

Step 3: Training SDP

Train SDP on stack_d1:

python train.py --config-name=sdp_ddpm_5layer task_name=stack_d1

To train SDP on other tasks, replace stack_d1 with stack_three_d1, square_d2, threading_d2, coffee_d2, three_piece_assembly_d2, hammer_cleanup_d1, mug_cleanup_d1, kitchen_d1, nut_assembly_d0, pick_place_d0,…
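The per-task training command can be generated for a list of tasks. The sketch below only assembles the command lines from the config name and `task_name` override shown above; `train_command` and `TASKS` are names made up for this example, and the list contains only tasks spelled out in this README.

```python
TASKS = [
    "stack_d1", "stack_three_d1", "square_d2", "threading_d2", "coffee_d2",
    "three_piece_assembly_d2", "hammer_cleanup_d1", "mug_cleanup_d1",
    "kitchen_d1", "nut_assembly_d0", "pick_place_d0",
]

def train_command(task_name, config_name="sdp_ddpm_5layer"):
    """Build the argv list for training SDP on a single task."""
    return ["python", "train.py", f"--config-name={config_name}", f"task_name={task_name}"]

# Print one training command per task; nothing is launched here.
for task in TASKS:
    print(" ".join(train_command(task)))
```

Each list can be handed to `subprocess.run(cmd, check=True)` to launch the runs one after another.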
