
The AI community has seen a notable advance in language models and reward models with the release of Starling-LM-7B-beta and Starling-RM-34B. Starling-LM-7B-beta is a 7B language model fine-tuned with Reinforcement Learning from Human Feedback (RLHF). Alongside it, Starling-RM-34B, a reward model based on the Yi-34B model and trained on the Nectar dataset, outperforms prior reward models across every evaluated category. This release coincides with the introduction of RewardBench, the first benchmark designed specifically for evaluating reward models. RewardBench has evaluated over 30 currently available reward models (including DPO-based models) across domains such as chat, safety, code, and math, offering valuable insight into their performance and capabilities. Starling-RM-34B leads this new benchmark, a significant achievement for the Starling team that underscores the role of reward models in building safe and beneficial AI systems.
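At its core, a benchmark like RewardBench scores a reward model on preference pairs: for each prompt with a human-chosen and a rejected completion, the model is counted as correct when it assigns the chosen response a higher scalar reward, and accuracy is reported per domain. The sketch below illustrates that accuracy computation only; `score_response` and the toy data are hypothetical placeholders, not the actual RewardBench code or the Starling model's interface.

```python
# Minimal sketch of a RewardBench-style accuracy computation (illustrative only).
# `score_response` stands in for whatever reward model is being evaluated,
# e.g. a model that maps a (prompt, response) pair to a scalar reward.

from typing import Callable, Dict, List


def rewardbench_accuracy(
    pairs: List[Dict[str, str]],
    score_response: Callable[[str, str], float],
) -> float:
    """Fraction of preference pairs where the reward model ranks the
    human-chosen response above the rejected one."""
    correct = 0
    for pair in pairs:
        chosen_score = score_response(pair["prompt"], pair["chosen"])
        rejected_score = score_response(pair["prompt"], pair["rejected"])
        if chosen_score > rejected_score:
            correct += 1
    return correct / len(pairs)


if __name__ == "__main__":
    # Toy example; a real evaluation would load a reward model such as
    # Starling-RM-34B and score each prompt/response pair with it.
    toy_pairs = [
        {"prompt": "What is 2 + 2?", "chosen": "4", "rejected": "5"},
    ]
    toy_scorer = lambda prompt, response: 1.0 if response == "4" else 0.0
    print(rewardbench_accuracy(toy_pairs, toy_scorer))  # -> 1.0
```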



Introducing RewardBench: the first benchmark & leaderboard for the reward models used in the RLHF process of LLMs. With RewardBench, we can learn about the RLHF process, which values are learned, and improve the scientific understanding of alignment. https://t.co/ogXSyfIxFK
Great work from @BanghuaZ and team. Top-ranking reward model on RewardBench from @allen_ai, and a new Starling beta for chat. https://t.co/otnorkxpvR