Repo: NVIDIA, published Jul 18, 2025

NVIDIA/3DObjectReconstruction


Description: The 3D Object Reconstruction project is a workflow that takes a set of stereo images and camera info and outputs a textured mesh (i.e., an .OBJ file). The purpose is to translate physical items into the digital world in a photorealistic way.

Language: Python

License: Apache-2.0

Stars: 178

Forks: 15

Open issues: 1

Created: 2025-07-18T17:35:14Z

Pushed: 2026-06-09T19:35:12Z

Default branch: main

Fork: no

Archived: no

README:

NVIDIA 3D Object Reconstruction Framework

🆕 Updates & News

  • June 2026: We released the commercial version of the 3D Object Reconstruction pipeline!

1. Theseus optimizers now replace BundleFusion for pose optimization.
2. Added an optional config for color fusion across views to handle varying lighting conditions. (See [Configuration](#configuration) for usage.)
3. The pipeline now uses the commercial-version FoundationStereo ONNX weights from NVIDIA TAO Toolkit, which can be enabled through the FoundationStereo [configuration](#configuration).
4. Added multi-architecture Docker support with automatic detection: deployment now detects x86_64 vs. ARM64 (tested on Jetson Orin) and uses the appropriate Dockerfile (Dockerfile or Dockerfile.aarch64). PyTorch is upgraded to 2.7.0 for better compatibility, especially with newer architectures.
5. Added USD export: the reconstructed object is now exported as USD and USDZ for further use.
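The architecture detection described in the Docker update can be pictured with a short Python sketch. This is not the repository's deployment script: the function name `select_dockerfile` and the alias table are assumptions made for illustration, and only the two Dockerfile names come from the note above.

```python
import platform
from typing import Optional

# The two file names come from the update note; the alias mapping is an assumption.
DOCKERFILES = {
    "x86_64": "Dockerfile",
    "amd64": "Dockerfile",
    "aarch64": "Dockerfile.aarch64",
    "arm64": "Dockerfile.aarch64",
}


def select_dockerfile(machine: Optional[str] = None) -> str:
    """Return the Dockerfile name for a CPU architecture string.

    When no string is given, fall back to what the host reports.
    """
    arch = (machine or platform.machine()).lower()
    if arch not in DOCKERFILES:
        raise ValueError(f"Unsupported architecture: {arch!r}")
    return DOCKERFILES[arch]
```

Calling `select_dockerfile()` with no argument uses the host's reported architecture. Passing a string explicitly, for example `select_dockerfile("aarch64")`, makes the choice easy to test on any machine.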

Framework Overview

Deploy this example to create a 3D object reconstruction workflow that transforms stereo video input into high-quality 3D assets using state-of-the-art computer vision and neural rendering techniques.

NVIDIA's 3D Object Reconstruction framework represents a significant advancement in automated 3D asset creation. Real-world tests have demonstrated the ability to generate production-ready 3D meshes with photorealistic textures in under 30 minutes, enabling rapid digital twin creation and synthetic data generation workflows.

The purpose of this framework is:

1. To provide a reference implementation of 3D object reconstruction using NVIDIA's AI stack.
2. To accelerate adoption of 3D AI workflows in computer vision, robotics, and synthetic data generation.

You can get started quickly and achieve high-quality results using your own stereo data by following the [Quickstart guide](#quick-start-recommended).

  • [NVIDIA 3D Object Reconstruction Framework](#nvidia-3d-object-reconstruction-framework)
  • [What is 3D Object Reconstruction?](#what-is-3d-object-reconstruction)
  • [How to Use This Framework](#how-to-use-this-framework)
  • [Preparing your data](#preparing-your-data)
  • [1 – Input data schema](#1--input-data-schema)
  • [2 – Camera calibration](#2--camera-calibration)
  • [3 – Data organization](#3--data-organization)
  • [Real-World Results and What to Expect](#real-world-results-and-what-to-expect)
  • [Additional Reading](#additional-reading)
  • [Quick Start (Recommended)](#quick-start-recommended)
  • [Prerequisites](#prerequisites)
  • [🎬 Complete Setup](#-complete-setup)
  • [🎯 Interactive Experience](#-interactive-experience)
  • [Technical Details](#technical-details)
  • [Software Components](#software-components)
  • [Technical Diagrams](#technical-diagrams)
  • [Minimum System Requirements](#minimum-system-requirements)
  • [Hardware Requirements](#hardware-requirements)
  • [Software Requirements](#software-requirements)
  • [Development Environment](#development-environment)
  • [BundleSDF Use Cases and Applications](#bundlesdf-use-cases-and-applications)
  • [Pipeline Overview](#pipeline-overview)
  • [Usage Methods](#usage-methods)
  • [1. Interactive Jupyter Notebook (Recommended)](#1-interactive-jupyter-notebook-recommended)
  • [2. Command Line Interface](#2-command-line-interface)
  • [Configuration](#configuration)
  • [Available Customizations](#available-customizations)
  • [FAQ / Known Issues](#faq--known-issues)
  • [Common Setup and Runtime Issues](#common-setup-and-runtime-issues)
  • [Claude Code Skills](#claude-code-skills)
  • [Citation](#citation)
  • [License](#license)
  • [Disclaimer](#disclaimer)

What is 3D Object Reconstruction?

3D Object Reconstruction is the process of creating complete three-dimensional digital representations of real-world objects from 2D image sequences. This example implements a state-of-the-art pipeline that combines stereo vision, object segmentation, bundle adjustment, and neural implicit surface reconstruction to produce high-quality 3D meshes with photorealistic textures.

The reconstruction pipeline processes stereo image pairs through four main stages: depth estimation using transformer-based FoundationStereo, object segmentation with SAM2, camera pose tracking via BundleSDF, and neural implicit surface reconstruction using NeRF. The result is production-ready 3D assets compatible with Isaac Sim, Omniverse, and game engines.
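To make the four-stage ordering concrete, here is a minimal Python sketch of how such a pipeline could be wired together. It is a sketch under stated assumptions, not the framework's code: the `Stage` class, the context keys (`stereo_pairs`, `depth`, `masks`, `poses`, `mesh`), and the data flow between stages are hypothetical, and each stage body is a placeholder rather than a real model call.

```python
from dataclasses import dataclass
from typing import Any, Dict, Sequence, Tuple


@dataclass(frozen=True)
class Stage:
    name: str                 # component named in the README
    inputs: Tuple[str, ...]   # context keys consumed (assumed data flow)
    output: str               # context key produced (hypothetical name)

    def run(self, ctx: Dict[str, Any]) -> Any:
        """Check that inputs exist, then return a placeholder result."""
        missing = [key for key in self.inputs if key not in ctx]
        if missing:
            raise KeyError(f"{self.name} is missing inputs: {missing}")
        # A real stage would invoke the model here.
        return f"{self.output}<-{'+'.join(self.inputs)}"


# Stage order follows the README; inputs and outputs are illustrative guesses.
STAGES: Tuple[Stage, ...] = (
    Stage("FoundationStereo depth estimation", ("stereo_pairs",), "depth"),
    Stage("SAM2 object segmentation", ("stereo_pairs",), "masks"),
    Stage("BundleSDF pose tracking", ("depth", "masks"), "poses"),
    Stage("NeRF surface reconstruction", ("depth", "masks", "poses"), "mesh"),
)


def run_pipeline(stereo_pairs: Any, stages: Sequence[Stage] = STAGES) -> Dict[str, Any]:
    """Run the stages in order, accumulating results in a shared context."""
    ctx: Dict[str, Any] = {"stereo_pairs": stereo_pairs}
    for stage in stages:
        ctx[stage.output] = stage.run(ctx)
    return ctx
```

Declaring each stage's inputs up front means a misordered pipeline fails immediately with a clear message instead of partway through a long reconstruction run.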

The pipeline comprises the following components:

  • FoundationStereo: Transformer-based stereo depth estimation with sub-pixel accuracy
  • SAM2: Video object segmentation for consistent mask generation
  • Pose Estimation: CUDA-accelerated pose estimation and optimization
  • Neural SDF: GPU-optimized neural implicit surface reconstruction
  • RoMa: Robust feature matching for correspondence establishment
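Sub-pixel disparity accuracy matters because depth is inversely proportional to disparity. For a rectified stereo pair, the standard relation is Z = f · B / d, with focal length f in pixels, baseline B in meters, and disparity d in pixels. The sketch below illustrates that textbook relation and its first-order error estimate. It does not call FoundationStereo, and the function names and numeric values are assumptions chosen for the example.

```python
def disparity_to_depth(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Depth in meters from disparity for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px


def depth_error(depth_m: float, focal_px: float, baseline_m: float,
                disparity_error_px: float) -> float:
    """First-order depth uncertainty: dZ ~= Z^2 / (f * B) * dd."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px


# Illustrative camera: 640 px focal length, 0.125 m baseline (assumed values).
depth = disparity_to_depth(disparity_px=40.0, focal_px=640.0, baseline_m=0.125)
error = depth_error(depth, focal_px=640.0, baseline_m=0.125, disparity_error_px=0.25)
```

With these assumed values the object sits 2 m away, and a quarter-pixel disparity error corresponds to roughly 1.25 cm of depth error. Because the error grows with the square of the depth, capturing the object closer to the cameras tightens the geometry considerably.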

How to Use This Framework

This reference implementation demonstrates proven techniques for high-quality 3D reconstruction. Key capabilities include:

  • Direct stereo video processing without extensive preprocessing
  • Automated camera pose estimation and bundle adjustment
  • Neural implicit surface representation for smooth geometry
  • Photorealistic texture generation through view synthesis
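An implicit surface representation stores geometry as a function instead of a list of triangles. A signed distance function (SDF) returns the distance from a point to the surface, and the surface itself is the set of points where that value is zero. The sketch below uses an analytic sphere as a stand-in for the learned network, so it shows the idea only and not the framework's model. The names are hypothetical, and the sign convention (negative inside, positive outside) is one common choice assumed here.

```python
import math
from typing import Callable, Tuple

Point = Tuple[float, float, float]


def sphere_sdf(p: Point, radius: float = 1.0) -> float:
    """Signed distance to a sphere at the origin: negative inside, positive outside."""
    return math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2) - radius


def surface_normal(sdf: Callable[[Point], float], p: Point, eps: float = 1e-4) -> Point:
    """Approximate the unit normal as the normalized central-difference gradient."""
    grad = []
    for axis in range(3):
        hi, lo = list(p), list(p)
        hi[axis] += eps
        lo[axis] -= eps
        grad.append((sdf((hi[0], hi[1], hi[2])) - sdf((lo[0], lo[1], lo[2]))) / (2 * eps))
    norm = math.sqrt(sum(g * g for g in grad))
    return (grad[0] / norm, grad[1] / norm, grad[2] / norm)
```

Because the function is defined everywhere and varies smoothly, normals come straight from its gradient. A mesh can then be extracted from the zero level set with an algorithm such as marching cubes.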

To effectively use this framework:

1. Learn from the reference implementation

  • Deploy the stack: Follow the Docker Compose setup to experience the complete pipeline
  • Study the notebook: Walk through the interactive Jupyter tutorial for hands-on learning
  • Understand the architecture: Review the code to see how FoundationStereo, SAM2, and BundleSDF integrate

2. Prepare your stereo data

  • Capture stereo sequences: Record synchronized left/right camera pairs of your target objects
  • Calibrate cameras: Ensure accurate intrinsic and extrinsic camera parameters
  • Organize data: Structure input…
