Option  Votes
YES     265
NO      38
Option                                  Probability
Pretraining data composition            88
Doesn't use any scale.ai training data  63
Offline policy learning RLHF            49
Task vectors like Golden Gate Claude    37