What does this repo signal mean?

ByteDance (Doubao/Seed) published ByteDance-Seed/In-Place-TTT (Python). A repository signal like this one can expose tooling, eval, infrastructure, or model-adjacent work before it appears in a launch post. High-signal details: repo ByteDance-Seed/In-Place-TTT · language Python · Test-time training with in-place updates by ByteDance Seed. onlylabs links this event to 1 captured evidence page and 6 related repo signals.

ByteDance (Doubao/Seed) Repo: ByteDance-Seed/In-Place-TTT

Captured source

GitHub/github.com/ByteDance-Seed/In-Place-TTT

ByteDance-Seed/In-Place-TTT repository metadata

published Apr 7, 2026 · seen Jun 5 · captured Jun 11 · HTTP 200 · method plain

ByteDance-Seed/In-Place-TTT

Language: Python

License: Apache-2.0

Stars: 231

Forks: 24

Open issues: 5

Created: 2026-04-07T05:50:45Z

Pushed: 2026-04-21T03:50:51Z

Default branch: main

Fork: no

Archived: no

README:

In-Place Test-Time Training

Seamlessly Endowing LLMs with Test-Time Training Ability

Guhao Feng\*, Shengjie Luo\*, Kai Hua, Ge Zhang, Wenhao Huang, Di He, Tianle Cai

In-Place TTT is a drop-in test-time training method for Transformer LLMs. This repository provides the training, checkpoint conversion, inference, and evaluation stack built on VeOmni, together with recommended configs for Qwen3-8B and LLaMA-3.1-8B.

News

[2026/03] The codebase is open-sourced.

[2026/02] In-Place TTT is accepted to ICLR 2026 as an Oral presentation.

[In-Place Test-Time Training](#in-place-test-time-training)
[News](#news)
[Table of Contents](#table-of-contents)
[Introduction](#introduction)
[Getting Started](#getting-started)
[Environment Setup](#environment-setup)
[Data Preparation](#data-preparation)
[Recommended Config](#recommended-config)
[Training](#training)
[Checkpoint Conversion](#checkpoint-conversion)
[Evaluation](#evaluation)
[Features](#features)
[License](#license)
[Citation](#citation)
[About ByteDance Seed Team](#about-bytedance-seed-team)

Introduction

Current large language models follow a static "train then deploy" paradigm. Once deployed, model weights are frozen and cannot adapt to new information encountered during inference. This limits long-context reasoning, where useful information arrives progressively and the model would benefit from updating itself as it reads.

In-Place Test-Time Training (In-Place TTT) addresses this by updating a subset of model parameters, the MLP down-projection fast weights, during inference. Unlike prior TTT approaches that require architectural side modules or external memory, In-Place TTT stays inside the standard Transformer block and remains compatible with off-the-shelf autoregressive LLMs.

The method is centered around three ideas:

1. Architectural compatibility. Fast weights live in the existing MLP down-projection matrix, so no extra attention heads or memory modules are introduced.
2. LM-aligned objective. The fast-weight update is aligned with next-token prediction instead of a generic reconstruction target.
3. Chunk-wise update. Long sequences are split into chunks so updates can be computed efficiently and scaled to long contexts.
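The chunk-wise idea can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the function names (`chunk_spans`, `chunkwise_fast_weight_pass`), the tiny list-based matrix, and above all the update rule, which is a plain averaged gradient step on a squared error and not the LM-aligned objective described above. It only shows one common apply-then-update pattern, where tokens in a chunk are processed with the weights from before that chunk and the fast weights change once per chunk.

```python
def chunk_spans(seq_len, chunk_size):
    """Yield (start, end) index pairs that cover seq_len tokens in order."""
    for start in range(0, seq_len, chunk_size):
        yield start, min(start + chunk_size, seq_len)


def matvec(w, x):
    """Multiply a list-of-lists matrix by a vector."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]


def chunkwise_fast_weight_pass(w, hiddens, targets, chunk_size, lr):
    """Toy apply-then-update loop; returns (outputs, updated fast weights).

    Hypothetical stand-in for a fast-weight layer. The squared-error step
    below is NOT the repo's LM-aligned update rule.
    """
    w = [row[:] for row in w]  # work on a copy; the caller's weights stay untouched
    outputs = []
    for start, end in chunk_spans(len(hiddens), chunk_size):
        # 1) Apply: every token in this chunk sees the pre-chunk weights.
        chunk_out = [matvec(w, h) for h in hiddens[start:end]]
        outputs.extend(chunk_out)
        # 2) Update: one gradient step on 0.5 * ||t - W h||^2, averaged over the chunk.
        n = end - start
        for h, t, y in zip(hiddens[start:end], targets[start:end], chunk_out):
            for i in range(len(w)):
                err = t[i] - y[i]
                for j in range(len(h)):
                    w[i][j] += lr * err * h[j] / n
    return outputs, w


print(list(chunk_spans(10, 4)))  # [(0, 4), (4, 8), (8, 10)]

outs, w_final = chunkwise_fast_weight_pass(
    w=[[0.0]], hiddens=[[1.0]] * 4, targets=[[1.0]] * 4, chunk_size=2, lr=0.5
)
print(outs)     # [[0.0], [0.0], [0.5], [0.5]]
print(w_final)  # [[0.75]]
```

In the toy run, the second chunk already benefits from what the first chunk taught the weights, which is the behavior the chunk-wise design is meant to give at long context.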

![In-Place TTT Method Overview](assets/pipeline.png)

As used in this repo, the end-to-end workflow is:

1. Provide your own VeOmni-compatible processed dataset and base model assets.
2. Launch continual pretraining with VeOmni through train.sh and tasks/train_torch.py.
3. Export DCP checkpoints into HuggingFace format with scripts/merge_dcp_to_hf.py.
4. Run TTT-aware inference and RULER evaluation with inference_model/, eval.sh, and eval_config/.

The repository includes recommended training configs for Qwen3-8B and LLaMA-3.1-8B, checkpoint conversion utilities, and a full RULER evaluation pipeline via OpenCompass from 4K to 256K context lengths.

Getting Started

Environment Setup

Step 1. Install PyTorch and FlashAttention:

pip3 install torch==2.8.0 torchvision==0.23.0 torchaudio==2.8.0 --index-url https://download.pytorch.org/whl/cu128

wget https://github.com/Dao-AILab/flash-attention/releases/download/v2.8.3/flash_attn-2.8.3+cu12torch2.8cxx11abiTRUE-cp311-cp311-linux_x86_64.whl
pip3 install flash_attn-2.8.3+cu12torch2.8cxx11abiTRUE-cp311-cp311-linux_x86_64.whl
rm flash_attn-2.8.3+cu12torch2.8cxx11abiTRUE-cp311-cp311-linux_x86_64.whl

Step 2. Install VeOmni from the validated commit:

pip3 install "veomni @ git+https://github.com/ByteDance-Seed/VeOmni.git@9b91e164bea9e17f17ed490aab5e076c2335ca25"

Step 3. Install the remaining dependencies:

pip3 install liger-kernel
pip3 install byted-wandb torchdata blobfile datasets diffusers tiktoken timm
pip3 install transformers==4.57.3
pip3 install opt_einsum einops

pip3 uninstall -y byted-wandb wandb
pip3 install byted-wandb
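Because several packages above are pinned to exact versions, it can help to confirm what actually got installed. The following is a minimal sketch using only the standard library; the helper names (`installed_version`, `check_pins`) are hypothetical, and the distribution names in `PINS` are assumed to match the names pip registered. Local build suffixes such as `+cu128` are stripped before comparing, since wheels from the CUDA index typically report them.

```python
from importlib import metadata

# Pins taken from the install commands above.
PINS = {
    "torch": "2.8.0",
    "torchvision": "0.23.0",
    "torchaudio": "2.8.0",
    "flash_attn": "2.8.3",
    "transformers": "4.57.3",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Return {name: (wanted, found)} for every package that does not match its pin."""
    problems = {}
    for name, wanted in pins.items():
        found = lookup(name)
        # Drop a local version label such as "+cu128" before comparing.
        public = found.split("+", 1)[0] if found else None
        if public != wanted:
            problems[name] = (wanted, found)
    return problems


if __name__ == "__main__":
    for name, (wanted, found) in check_pins(PINS).items():
        print(f"{name}: expected {wanted}, found {found}")
```

An empty result means every pinned package matched; a `found` value of `None` means the package is not installed in the active environment.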

Step 4. Optionally verify the installed VeOmni source:

python3 - ...

**Tip:** Set `ttt_target: input_embed` for from-scratch pretraining, or `ttt_target: hidden_states` for continual training.

model:
  model_path: /path/to/your_base_model
  foundation:
    ttt_layers: [0, 6, 12, 18, 24, 30, 36]
    ttt_mode: true
    ttt_proj: true
    ttt_lr: 3
    ttt_chunk: 4096

data:
  train_path: /path/to/your_data
  train_size: 20000000000
  dataloader_type: native
  datasets_type: iterable
  data_type: plaintext
  max_seq_len: 65536
  text_keys: content_split
  drop_last: true

train:
  output_dir: /path/to/your_output_dir
  data_parallel_mode: fsdp2
  global_batch_size: 64
  micro_batch_size: 1
  optimizer: adamw
  lr: 5.0e-6
  lr_warmup_ratio: 0.02
  lr_decay_style: cosine
  lr_decay_ratio: 0.90
  weight_decay: 0.1
  max_grad_norm: 1.0
  max_steps: 5000
  enable_mixed_precision: true
  enable_gradient_checkpointing: true
  enable_full_shard: true
  init_device: meta
  ckpt_manager: dcp
  save_steps: 500
  save_hf_weights: true
  use_wandb: true

The corresponding recommended config files are:

- `configs/pretrain/qwen3_longct.yaml`
- `configs/pretrain/llama3_longct.yaml`
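A few quantities implied by the recommended values can be checked with simple arithmetic. The sketch below is illustrative only: the function name `derived_quantities` and the `dp_world_size` argument are hypothetical, and it assumes that every sample fills `max_seq_len`, that `lr_warmup_ratio` is a fraction of `max_steps`, and that gradient accumulation is the global batch divided by micro batch times data-parallel ranks. None of these interpretations is confirmed by the excerpt.

```python
# Values copied from the recommended config.
CONFIG = {
    "max_seq_len": 65536,
    "ttt_chunk": 4096,
    "global_batch_size": 64,
    "micro_batch_size": 1,
    "max_steps": 5000,
    "lr_warmup_ratio": 0.02,
    "save_steps": 500,
}


def derived_quantities(cfg, dp_world_size):
    """Compute sizes implied by the config under the stated assumptions."""
    tokens_per_step = cfg["global_batch_size"] * cfg["max_seq_len"]
    per_pass = cfg["micro_batch_size"] * dp_world_size
    return {
        "chunks_per_sequence": cfg["max_seq_len"] // cfg["ttt_chunk"],
        "tokens_per_step": tokens_per_step,
        "tokens_total": tokens_per_step * cfg["max_steps"],
        "warmup_steps": round(cfg["lr_warmup_ratio"] * cfg["max_steps"]),
        "checkpoints_saved": cfg["max_steps"] // cfg["save_steps"],
        "grad_accum_steps": cfg["global_batch_size"] // per_pass,
    }


# dp_world_size=8 is a made-up example, not a value from the repo.
for key, value in derived_quantities(CONFIG, dp_world_size=8).items():
    print(f"{key}: {value}")
# chunks_per_sequence: 16
# tokens_per_step: 4194304
# tokens_total: 20971520000
# warmup_steps: 100
# checkpoints_saved: 10
# grad_accum_steps: 8
```

Under these assumptions a full run touches about 2.1e10 tokens, which is in the same range as the `train_size` of 20000000000 in the data section.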

### Training

Quick smoke run:

bash train.sh tasks/train_torch.py configs/pretrain/qwen3_longct.yaml \
  --train.output_dir /path/to/your_output_dir \
  --train.max_steps 1 \
  --train.use_wandb false

Recommended Qwen config override:

bash train.sh tasks/train_torch.py configs/pretrain/qwen3_longct.yaml \
  --train.wandb_project your_wandb_project \
  --train.wandb_name your_run_name \
  --train.output_dir /path/to/your_output_dir \
  --model.foundation '{"ttt_layers":[0,6,12,18,24,30,36],"ttt_mode":true,"ttt_proj":true,"ttt_lr":3,"ttt_chunk":4096}'

Recommended LLaMA config override:

bash train.sh tasks/train_torch.py configs/pretrain/llama3_longct.yaml \
  --train.wandb_project your_wandb_project \
  --train.wandb_name your_run_name \
  --train.output_dir /path/to/your_output_dir \
  --model.foundation '{"ttt_layers":[0,6,12,18,24,30,36],"ttt_mode":true,"ttt_proj":true,"ttt_lr":3,"ttt_chunk":4096}'
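The `--model.foundation` override is a JSON string, which is easy to misquote by hand. One way to avoid that, sketched here with hypothetical helper names (`foundation_override`, `build_train_command`), is to build the string with `json.dumps` and let `shlex` handle shell quoting. The sketch only uses the script paths and flags that appear in the commands above and does not launch anything.

```python
import json
import shlex


def foundation_override(ttt_layers, ttt_lr, ttt_chunk):
    """Serialize the TTT settings as compact JSON, matching the documented override."""
    payload = {
        "ttt_layers": list(ttt_layers),
        "ttt_mode": True,
        "ttt_proj": True,
        "ttt_lr": ttt_lr,
        "ttt_chunk": ttt_chunk,
    }
    return json.dumps(payload, separators=(",", ":"))


def build_train_command(config_path, output_dir, foundation_json):
    """Return the launch command as an argv list (nothing is executed here)."""
    return [
        "bash", "train.sh", "tasks/train_torch.py", config_path,
        "--train.output_dir", output_dir,
        "--model.foundation", foundation_json,
    ]


override = foundation_override(range(0, 37, 6), ttt_lr=3, ttt_chunk=4096)
argv = build_train_command(
    "configs/pretrain/qwen3_longct.yaml", "/path/to/your_output_dir", override
)
# shlex.join quotes the JSON argument so the printed line can be pasted into a shell.
print(shlex.join(argv))
```

Passing the `argv` list to `subprocess.run(argv, check=True)` would sidestep shell quoting entirely, since each list element reaches the script as a single argument.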

### Checkpoint Conversion

Convert VeOmni DCP checkpoints into HuggingFace...

Excerpt shown — open the source for the full document.

Notability

notability 5.0/10

New repo with moderate stars