What does this repo signal mean?

Google (DeepMind / Gemini) published google-deepmind/alignet (Python). This repository signal exposes tooling, eval, infrastructure, or model-adjacent work before it may appear in a launch post. High-signal details: repo google-deepmind/alignet · language Python · Notable lab but low traction (77 stars). onlylabs links this event to 1 captured evidence page and 6 related repo signals.

Google (DeepMind / Gemini) Repo: google-deepmind/alignet

Captured source

GitHub: github.com/google-deepmind/alignet

google-deepmind/alignet repository metadata

published May 30, 2025 · seen 6d · captured 8h · http 200 · method plain

google-deepmind/alignet

Language: Python

License: Apache-2.0

Stars: 79

Forks: 13

Open issues: 3

Created: 2025-05-30T09:37:36Z

Pushed: 2025-12-01T16:22:04Z

Default branch: main

Fork: no

Archived: no

README:

AligNet Project Training Code, Data, and Model Checkpoints

This repository contains code and dataset information for "Aligning Machine and Human Visual Representations across Abstraction Levels." Specifically, it includes the code for finetuning a pretrained SigLIP model on the AligNet dataset, as well as links and documentation for the dataset and the aligned model checkpoints.

Quick links:

[Installation](#installation)
[AligNet dataset](#alignet-dataset)
[Run AligNet finetuning on SigLIP](#run-alignet-finetuning-on-siglip)
[Released AligNet models](#alignet-models)
[Citation](#citation)
[License](#license)

Motivation

Alignment with human mental representations is becoming central to representation learning: we want neural network models that perform well on downstream tasks and align with the hierarchical nature of human semantic cognition. We believe that aligning neural network representations with human conceptual knowledge will lead to models that generalize better, are more robust, safer, and practically more useful. To obtain such models, we generated a synthetic human-like similarity judgment dataset on a much larger scale than has previously been possible. We have released this dataset, example finetuning code for using it, and some finetuned versions of prior models.

Please see the AligNet paper for further details on the motivation and procedures.

Installation

Clone Repository

git clone https://github.com/google-deepmind/alignet.git

Install requirements

pip install -r alignet/requirments.txt

AligNet dataset

The AligNet dataset is a synthetically generated dataset of image triplets (sampled from ImageNet2012) and corresponding human-like triplet odd-one-out choices.

AligNet triplets

Download the data from https://storage.googleapis.com/alignet/data/release_1.1/index.html

AligNet is a dataset of triplets and corresponding odd-one-out choices. Each triplet contains 3 image filenames (the images are sampled from ImageNet) and the predicted similarity between those three images (obtained from a pre-trained neural network).
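As a rough illustration of how the three pairwise similarities of a triplet relate to an odd-one-out choice, here is a minimal sketch. It assumes cosine similarity between image embeddings, which the text above does not specify, and both `pairwise_similarities` and `odd_one_out` are hypothetical helper names rather than functions from the repository.

```python
import numpy as np


def pairwise_similarities(embeddings: np.ndarray) -> np.ndarray:
    """Return [s01, s02, s12] for a (3, d) array of image embeddings.

    Assumption: cosine similarity. The actual similarity function used
    to build AligNet is described in the paper, not here.
    """
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    gram = unit @ unit.T  # (3, 3) matrix of cosine similarities
    return np.array([gram[0, 1], gram[0, 2], gram[1, 2]])


def odd_one_out(similarities: np.ndarray) -> int:
    """Index (0, 1, or 2) of the image left out of the most similar pair.

    Pairs are ordered [s01, s02, s12], so the excluded image for pair
    index k is 2 - k.
    """
    return 2 - int(np.argmax(similarities))


# Toy embeddings: images 0 and 1 point in nearly the same direction.
toy_embeddings = np.array([[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]])
toy_similarities = pairwise_similarities(toy_embeddings)
toy_choice = odd_one_out(toy_similarities)
```

In this toy case the most similar pair is (0, 1), so the third image is the one left out.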

To increase the reproducibility of our research, we split AligNet into a training and a validation set. The train split alignet_train.npz contains 10M triplets and the validation split alignet_valid.npz contains 10k triplets. The files are stored in NumPy's compressed array format. Each file contains three arrays of *n* entries each, where n=10M for training and n=10k for validation. Row *i* describes the *i*th triplet. Note that within each triplet we sorted the images such that the last image is always the one that is most dissimilar to the other two (i.e., the "odd-one-out"), according to a prediction made by a model we trained (see the AligNet paper for details).

filenames: (n, 3) strings: Identifies the images used for this triplet.

Each row contains the names of image files from the ImageNet2012 dataset as [filename0, filename1, filename2], where filename2 is the image that is typically considered the "odd one out" of the triplet.

similarities: (n, 3) floats: the similarity values of the three pairs of images calculated using the pretrained model representations: [s01, s02, s12], where sij is the similarity between image i and image j. Note that the data was sorted such that `s12
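The following sketch shows one way to read a file in this layout and check the sorting property described above. It assumes the array keys are named `filenames` and `similarities`, as in the field list, and it runs on a small toy file written on the fly, not on the released data. `load_triplets` and `last_image_is_odd_one_out` are hypothetical helper names.

```python
import os
import tempfile

import numpy as np


def load_triplets(path: str):
    """Load the two arrays described above from a compressed .npz file.

    Assumption: the keys are "filenames" and "similarities".
    """
    with np.load(path) as data:
        return data["filenames"], data["similarities"]


def last_image_is_odd_one_out(similarities: np.ndarray) -> np.ndarray:
    """Per-row check that s01 is the largest of [s01, s02, s12].

    If images 0 and 1 form the most similar pair, image 2 is the
    odd one out, which matches the stated sort order.
    """
    return similarities[:, 0] >= similarities[:, 1:].max(axis=1)


# Toy data in the same layout; these file names are made up.
toy_filenames = np.array(
    [["a.JPEG", "b.JPEG", "c.JPEG"], ["d.JPEG", "e.JPEG", "f.JPEG"]]
)
toy_similarities = np.array(
    [[0.9, 0.3, 0.2], [0.7, 0.1, 0.4]], dtype=np.float32
)

with tempfile.TemporaryDirectory() as tmp_dir:
    toy_path = os.path.join(tmp_dir, "toy_triplets.npz")
    np.savez_compressed(
        toy_path, filenames=toy_filenames, similarities=toy_similarities
    )
    filenames, similarities = load_triplets(toy_path)

sorted_ok = last_image_is_odd_one_out(similarities)
```

For the real splits, the same two helpers could be pointed at alignet_train.npz or alignet_valid.npz once downloaded, provided the key names match.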

name: AligNet Dataset

url: https://github.com/google-deepmind/alignet

sameAs: https://github.com/google-deepmind/alignet

description: A dataset of synthetic Human Preference Triplets based on ImageNet.

provider name: DeepMind

provider sameAs: https://en.wikipedia.org/wiki/DeepMind

citation: Muttenthaler L, Greff K, Born F, Spitzer B, Kornblith S, Mozer MC, Müller KR, Unterthiner T, Lampinen AK (2025). Aligning machine and human visual representations across abstraction levels. Nature, 647, 349-355

License

The AligNet dataset is under the CC-BY License, and the accompanying code is provided under an Apache 2.0 License. Other parts of the datasets are under the original license of their sub-parts. The aligned model checkpoints are governed by their original licenses; license information is provided along with the checkpoints.

This is not an officially supported Google product.

Notability

notability 5.0/10

Notable lab but low traction (77 stars)