
Recent research explores scaling inference compute for large language models (LLMs) by increasing the number of generated samples per input. This method, known as repeated sampling, lets a model make many attempts at a problem rather than relying on a single try. The approach has shown promising results: DeepSeek-Coder-V2-Instruct reaches 56% on SWE-bench Lite with 250 attempts, significantly outperforming single-attempt Claude 3.5 Sonnet while being 4.25 times cheaper, and a Llama 8B model can surpass 70B models when the comparison is controlled for FLOPs. The studies indicate that this new scaling dimension can improve the performance, cost-efficiency, and coverage of LLMs.
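
The core recipe is just a sampling loop with a checker on top: draw many independent completions for the same problem and keep the first one that an automatic verifier accepts (for SWE-bench-style tasks, the repository's unit tests). A minimal Python sketch, where `sample_fn` and `verify_fn` are hypothetical stand-ins for the model call and the verifier:

```python
from typing import Callable, Optional

def solve_by_repeated_sampling(
    problem: str,
    sample_fn: Callable[[str], str],        # hypothetical: returns one model completion per call
    verify_fn: Callable[[str, str], bool],  # hypothetical: e.g. run unit tests on a candidate patch
    num_samples: int = 250,
) -> Optional[str]:
    """Draw independent samples and return the first one that passes verification."""
    for _ in range(num_samples):
        candidate = sample_fn(problem)
        if verify_fn(problem, candidate):
            return candidate  # at least one attempt solved the problem
    return None  # no attempt passed within the sample budget
```

Note that this sketch assumes a cheap automatic verifier exists; without one, selecting the right sample from the pool becomes the harder part of the problem.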
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling. https://t.co/w7bLwVLZfa
Is inference compute a new dimension for scaling LLMs? In our latest paper, we explore scaling inference compute by increasing the number of samples per input. Across several models and tasks, we observe that coverage – the fraction of problems solved by at least one attempt –… https://t.co/EgpiKm5lmW
Do you like LLMs? Do you also like for loops? Then you’ll love our new paper! We scale inference compute through repeated sampling: we let models make hundreds or thousands of attempts when solving a problem, rather than just one. By simply sampling more, we can boost LLM… https://t.co/HbpzlbUR2S
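
Coverage, as defined in the tweets above, is the repeated-sampling analogue of pass@k. A hedged sketch of how it is typically estimated, using the standard unbiased pass@k estimator from Chen et al. (2021) and assuming n samples were drawn per problem of which c passed:

```python
from math import comb

def coverage_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimate of the probability that at least one of k samples
    is correct, given c correct samples observed among n total draws."""
    if n - c < k:
        return 1.0  # every size-k subset must contain a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 5 of 100 generated samples solve the problem; with a budget of
# k = 10 attempts the estimated coverage is about 0.42.
print(coverage_at_k(100, 5, 10))
```

Averaged over all problems in a benchmark, this gives a coverage estimate that grows as the sample budget k increases.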
