ai21labs/Jamba-tiny-reward-dev
Published Dec 5, 2024 | License: apache-2.0 | Downloads: 22k | Likes: 2
This is a tiny Jamba reward model used for development, debugging and experimentation over the Jamba architecture.
It has 319M parameters, compared with 52B in Jamba 1.5 Mini and Jamba v0.1 and 398B in Jamba 1.5 Large, and was trained on ~40B tokens.
This model was created for unit testing by turning the first three rows of Jamba-tiny-dev's LM head into a 3-attribute reward head. The bias was set to [1000, -1000, 0], so the three outputs will land near 1000, -1000 and 0 respectively. Because of the way it was created, this model does not aim to provide value as a reward model.
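The construction described above can be illustrated with a minimal, dependency-free sketch. This is not the script AI21 used: the helper names `make_reward_head` and `reward_scores`, the toy matrix sizes, and the weight values are all hypothetical, chosen only to show how slicing three rows from an LM head and adding a large fixed bias produces outputs dominated by that bias.

```python
# Hypothetical sketch: build a 3-attribute reward head from the first
# three rows of an LM head weight matrix (shape: vocab_size x hidden_size).

def make_reward_head(lm_head_weight, bias=(1000.0, -1000.0, 0.0)):
    """Keep the first len(bias) rows of the LM head and pair them with a bias."""
    n_attributes = len(bias)
    weight = [list(row) for row in lm_head_weight[:n_attributes]]
    return weight, list(bias)

def reward_scores(hidden, weight, bias):
    """Apply the linear head: one score per attribute, weight_row . hidden + bias."""
    return [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(weight, bias)
    ]

# Toy stand-in for an LM head (assumed sizes, not the real model's).
vocab_size, hidden_size = 8, 4
lm_head_weight = [
    [0.01 * (i - j) for j in range(hidden_size)] for i in range(vocab_size)
]

weight, bias = make_reward_head(lm_head_weight)
hidden = [1.0, -0.5, 0.25, 2.0]  # toy final hidden state
scores = reward_scores(hidden, weight, bias)
print(scores)  # three values, each close to its bias entry
```

Because the sliced weights are small relative to the bias, the scores carry almost no information about the input, which is consistent with the note that the model is meant for testing rather than for actual reward modeling.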