Repo · Replicate · published Oct 30, 2023

replicate/cog-mvdream-multiview

Python

Description: Cog wrapper for Multi-View Image Generation with MVDream

Language: Python

License: Apache-2.0

Stars: 5

Forks: 1

Open issues: 0

Created: 2023-10-30T17:04:05Z

Pushed: 2023-10-30T17:25:52Z

Default branch: main

Fork: no

Archived: no

README:

Cog MVDream Multi-View

This is an implementation of MVDream's text-to-multi-view image generation module as a Cog model. See the paper, original repository and this Replicate model.

Development

Follow the model pushing guide to push your own fork of MVDream to Replicate.

API Usage

You will need Cog and Docker installed locally to run predictions. To use MVDream, describe the scene in natural language and, optionally, set the Stable Diffusion generation parameters, the camera elevation, and the azimuth angle span. The model generates a consistent set of images of the scene from different views. The API has the following inputs:

  • prompt: What you want to generate expressed in natural language
  • image_size: Width and height of the generated images. Allowed values are 128, 256, 512 and 1024. Larger sizes give better results but are slower.
  • num_frames: Number of views to generate.
  • num_inference_steps: Number of diffusion steps. Higher values will lead to better quality, but slower generation.
  • guidance_scale: How strongly the prompt guides the generation process. Higher values produce images that follow the prompt more closely, but that may be less diverse or of lower quality.
  • camera_elevation: Elevation angle of the camera.
  • camera_azimuth: Azimuth angle of the camera in the first view.
  • camera_azimuth_span: Total span of the azimuth angle. For example if the span is kept as 360 degrees and num_frames is set to 5 then in each view azimuth angle will be incremented by 360/5=72 degrees.
  • seed: Random seed for the generation process. If not specified, a random seed will be used.
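
The azimuth arithmetic described for camera_azimuth_span can be sketched in a few lines of Python. This is only an illustration of the 360/5=72 example above, not the model's actual code: the helper name view_azimuths is hypothetical, and it assumes the first view sits exactly at camera_azimuth and that angles are not wrapped back into the 0 to 360 range.

```python
def view_azimuths(camera_azimuth: float, camera_azimuth_span: float, num_frames: int) -> list:
    """Return one azimuth angle (in degrees) per generated view.

    Hypothetical helper: assumes the first view is at camera_azimuth and
    each following view advances by camera_azimuth_span / num_frames.
    """
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    step = camera_azimuth_span / num_frames
    return [camera_azimuth + i * step for i in range(num_frames)]


# The example from the input description: a 360 degree span over 5 views.
print(view_azimuths(0, 360, 5))  # [0.0, 72.0, 144.0, 216.0, 288.0]
```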

To run a prediction:

cog predict -i prompt="an astronaut riding a horse" -i image_size=512
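
When setting several inputs at once, it can be convenient to assemble the same command from a dictionary. The following is a minimal sketch: the helper cog_predict_command is hypothetical, the input names come from the list above, and the values (including num_frames=4 and seed=42) are arbitrary examples rather than recommended settings.

```python
import shlex


def cog_predict_command(inputs: dict) -> list:
    """Build a `cog predict` argument list with one `-i name=value` pair per input."""
    cmd = ["cog", "predict"]
    for name, value in inputs.items():
        cmd += ["-i", f"{name}={value}"]
    return cmd


cmd = cog_predict_command(
    {
        "prompt": "an astronaut riding a horse",
        "image_size": 512,
        "num_frames": 4,  # arbitrary example value
        "seed": 42,       # arbitrary example value
    }
)

# Print a shell-quoted version of the command for inspection.
print(shlex.join(cmd))

# To actually run it (requires Cog and Docker):
# import subprocess; subprocess.run(cmd, check=True)
```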

To build the Cog image and launch the API locally:

cog run -p 5000 python -m cog.server.http
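
Once the server is running, predictions can be requested over HTTP. The sketch below assumes the server is reachable at http://localhost:5000 (matching the port in the command above) and uses Cog's standard POST /predictions endpoint, which takes a JSON body with an "input" object. The helper names are hypothetical, and the exact shape of the returned "output" field for this model is not specified here.

```python
import json
import urllib.request


def build_prediction_request(inputs: dict, base_url: str = "http://localhost:5000") -> urllib.request.Request:
    """Prepare a POST /predictions request for a locally running Cog server."""
    body = json.dumps({"input": inputs}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/predictions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_prediction(inputs: dict, base_url: str = "http://localhost:5000") -> dict:
    """Send the request and return the decoded JSON response (needs the server running)."""
    with urllib.request.urlopen(build_prediction_request(inputs, base_url)) as response:
        return json.load(response)


# Building the request does not contact the server.
request = build_prediction_request(
    {"prompt": "an astronaut riding a horse", "image_size": 512}
)
print(request.get_method(), request.full_url)

# With the server up:
# result = run_prediction({"prompt": "an astronaut riding a horse", "image_size": 512})
# print(result["status"])
```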

References

@article{shi2023MVDream,
author = {Shi, Yichun and Wang, Peng and Ye, Jianglong and Mai, Long and Li, Kejie and Yang, Xiao},
title = {MVDream: Multi-view Diffusion for 3D Generation},
journal = {arXiv:2308.16512},
year = {2023},
}