openai/Video-Pre-Training

Description: Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos

Language: Python

License: MIT

Stars: 1694

Forks: 167

Open issues: 17

Created: 2022-06-22T18:06:56Z

Pushed: 2025-09-03T21:51:23Z

Default branch: main

Fork: no

Archived: no

README:

Video-Pre-Training

Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos

> :page_facing_up: Read Paper \ :mega: Blog Post \ :space_invader: MineRL Environment (note version 1.0+ required) \ :checkered_flag: MineRL BASALT Competition

Running agent models

Install the prerequisites for MineRL, then install the requirements with:

pip install git+https://github.com/minerllabs/minerl
pip install -r requirements.txt

> ⚠️ Note: For reproducibility reasons, the PyTorch version is pinned as torch==1.9.0, which is incompatible with Python 3.10 or higher. If you are using Python 3.10 or higher, install a newer version of PyTorch (usually, pip install torch). However, note that this *might* subtly change model behaviour (e.g., the agent may still act mostly as expected but fall short of the reported performance).
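The version rule in the note above can be sketched as a small helper. This is an illustration only: `pick_torch_requirement` is a hypothetical name and not part of this repository. It just returns the pip requirement string that the note suggests for a given Python version.

```python
import sys


def pick_torch_requirement(version_info=None):
    """Return the pip requirement string suggested by the note above.

    Hypothetical helper for illustration; not part of this repository.
    """
    if version_info is None:
        version_info = sys.version_info
    major, minor = version_info[0], version_info[1]
    # torch==1.9.0 is the pinned version, but it does not support Python 3.10+.
    if (major, minor) >= (3, 10):
        # Unpinned install; model behaviour might differ subtly.
        return "torch"
    return "torch==1.9.0"
```

Calling `pick_torch_requirement()` with no arguments checks the running interpreter.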

To run the code, call

python run_agent.py --model [path to .model file] --weights [path to .weights file]

After loading up, you should see a window of the agent playing Minecraft.
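If you launch the agent from another Python script, the command above could be assembled as follows. This is a minimal sketch: `build_run_agent_command` is a hypothetical helper, not part of this repository, and it assumes it is called from the repository root where run_agent.py lives. Only the `--model` and `--weights` flags shown above are used.

```python
import sys
from pathlib import Path


def build_run_agent_command(model_path, weights_path, check_exists=True):
    """Build the argument list for run_agent.py.

    Hypothetical helper for illustration; not part of this repository.
    """
    model_path, weights_path = Path(model_path), Path(weights_path)
    if check_exists:
        # Fail early with a clear message instead of inside the agent loader.
        for path in (model_path, weights_path):
            if not path.is_file():
                raise FileNotFoundError(f"Missing file: {path}")
    return [
        sys.executable, "run_agent.py",
        "--model", str(model_path),
        "--weights", str(weights_path),
    ]
```

The returned list can be passed to `subprocess.run(..., check=True)`. Using a list rather than a single shell string avoids quoting problems with paths that contain spaces.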

Agent Model Zoo

Below are the model files and weights files for various pre-trained Minecraft models. The 1x, 2x and 3x model files correspond to the width of their respective model weights.

Demonstration Only - Behavioral Cloning

These models are trained on video demonstrations of humans playing Minecraft using behavioral cloning (BC) and are more general than the later models, which use reinforcement learning (RL) to further optimize the policy. Foundational models are trained across all videos in a single training run, while the house and early game models further fine-tune the foundational model of the same size using either the housebuilding contractor data or the early game video subset. See the paper linked above for more details.

Foundational Model :chart_with_upwards_trend:

Fine-Tuned from House :chart_with_upwards_trend:

Fine-Tuned from Early Game :chart_with_upwards_trend:

Models With Environment Interactions

These models further refine the above demonstration-based models with a reward function targeted at obtaining diamond pickaxes. While less general than the behavioral cloning models, these models have the benefit of interacting with the environment using a reward function and excel at progressing through the tech tree quickly. See the paper for more information on how they were trained and the exact reward schedule.

RL from Foundation :chart_with_upwards_trend:

RL from House :chart_with_upwards_trend:

RL from Early Game :chart_with_upwards_trend:

Running Inverse Dynamics Model (IDM)

The IDM aims to predict which actions the player is taking in a video recording.

Setup:

  • Install requirements: pip install -r requirements.txt
  • Download the IDM model .model :arrow_down: and .weights :arrow_down: files
  • For demonstration purposes, you can use the contractor recordings shared below. For this demo we use this .mp4 and this associated actions file (.jsonl).

To run the model with the above files placed in the root directory of this code:

python run_inverse_dynamics_model.py --weights 4x_idm.weights --model 4x_idm.model --video-path cheeky-cornflower-setter-02e496ce4abb-20220421-092639.mp4 --jsonl-path cheeky-cornflower-setter-02e496ce4abb-20220421-092639.jsonl

A window should pop up that plays the video frame by frame, with the predicted and true (recorded) actions shown side by side on the left.

Note that run_inverse_dynamics_model.py is designed to be a demo of the IDM, not code to put it into practice.
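The demo command above pairs an .mp4 recording with a .jsonl actions file that shares its base name. A minimal sketch of locating and reading such a pair is shown here. Both function names are hypothetical and not part of this repository, and the only assumption about the .jsonl file is the standard JSON Lines layout (one JSON value per line); nothing is assumed about the keys inside each record.

```python
import json
from pathlib import Path


def paired_jsonl_path(video_path):
    """Return the .jsonl path that shares the video's base name."""
    return Path(video_path).with_suffix(".jsonl")


def parse_jsonl_lines(lines):
    """Parse an iterable of text lines, one JSON value per non-empty line."""
    records = []
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines, e.g. a trailing newline at end of file
            records.append(json.loads(line))
    return records


def load_jsonl(path):
    """Read a JSON Lines file from disk into a list of parsed records."""
    with open(path, "r", encoding="utf-8") as f:
        return parse_jsonl_lines(f)
```

For example, `load_jsonl(paired_jsonl_path("cheeky-cornflower-setter-02e496ce4abb-20220421-092639.mp4"))` would read the actions file used in the demo, provided it sits next to the video.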

Using behavioural cloning to fine-tune the models

Disclaimer: This code is a rough demonstration only and not an exact recreation of what the original VPT paper did (but it contains some preprocessing steps you want to be aware of)! As such, do not expect to replicate the original experiments with this code. This code has been designed to be runnable on consumer hardware (e.g., 8GB of VRAM).

Setup:

  • Install requirements: pip install -r requirements.txt
  • Download .weights and .model file…
