VITA Group has released a new report titled 'On The Planning Abilities of OpenAI's o1 Models: Feasibility, Optimality, and Generalizability.' The study by Wang et al. evaluates OpenAI's o1 models on the feasibility, optimality, and generalizability of the plans they produce across complex planning and reasoning tasks such as Barman and Floortile. The report highlights the models' strengths in self-evaluation and constraint-following. However, it also notes that o1-preview shows qualitative trends similar to those of previous large language models (LLMs), particularly in its sensitivity to the probability of certain outcomes.
On The Planning Abilities of OpenAI's o1 Models: Feasibility, Optimality, and Generalizability Wang et al.: https://t.co/ATNE1kokEM #AIAgent #ArtificialIntelligence #DeepLearning https://t.co/BKsq5GOobi
Is training large models specifically for reasoning enough? This paper reports that large reasoning models like o1-preview, while improving on more difficult tasks, display similar qualitative trends as previous LLMs. "o1—like previous LLMs—is sensitive to the probability of… https://t.co/iYo55lMaEQ