Starling AI introduces two new models, the Starling-LM-7B-beta chat model and the Starling-RM-34B reward model, both reported to surpass previous models on benchmarks. The community also lauds the new RewardBench benchmark, which evaluates 30+ reward models, with Starling-RM-34B leading the leaderboard.
Great work from @BanghuaZ and team. Top-ranking reward model on RewardBench from @allen_ai, and a new Starling beta for chat. https://t.co/otnorkxpvR
Reward models are the essence of success in RLHF, yet there has been little focus on evaluating them 😬 We introduce RewardBench💥 the first benchmark for reward models. We evaluated 30+ of the existing RMs (w/ DPO) and created new datasets. Discover lots of insightful analyses👇 https://t.co/q9XvVpPDwD
📢Exciting release of Starling-7B-beta chat model and Starling-34B-RM reward model powered by Nexusflow latest technology. I am continuously amazed by how fast and powerful the small striking team behind Starling is! https://t.co/luF4RUVqBe