Sources
Casper Hansen: Official entropix evals from the authors. A 20% relative improvement on GPQA zero-shot is insane! Notes:
- Qwen 2.5 500m was used here
- You cannot evaluate on the logprobs because entropix is a sampler
- You need string matching for evals (incompatible with lm_eval)
https://t.co/DAG5COGPyl
Casper Hansen: This is a banger post that is spot on. Entropix is like the antidote to how academics work. They just ship and keep experimenting openly, not promising anything but showing exciting examples of what it can enable https://t.co/CbErZnnr8L
Teortaxes▶️: This and @_xjdr responses (https://t.co/0Z7BU9nf7s), I think, excellently address community skepticism about entropix (as a comprehensive research program. If you wanted a production grade model agnostic sampler to ship with your llama.cpp wrapper, well yeah, not today buddy) https://t.co/5pUkW6vQxu
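
The first post notes that a sampler has to be scored on its generated text rather than on logprobs. As a rough illustration only, string-matching evaluation for a multiple-choice benchmark might look like the sketch below. This is not the entropix authors' evaluation code; the helper names `extract_choice` and `string_match_accuracy`, the A-D answer format, and the regular expressions are all assumptions made for the example.

```python
import re
from typing import Optional, Sequence

# Assumed answer formats: "answer is (C)", "Answer: B", or a bare letter.
# The keyword is case-insensitive, but the letter itself must be uppercase
# so that the article "a" is not mistaken for choice A.
_ANSWER_RE = re.compile(r"(?i:answer)\s*(?:(?i:is)|:)?\s*\(?([A-D])\)?\b")
_BARE_RE = re.compile(r"\(?([A-D])\)?\.?")


def extract_choice(generated: str) -> Optional[str]:
    """Return the answer letter found in the generated text, or None."""
    text = generated.strip()
    match = _ANSWER_RE.search(text)
    if match:
        return match.group(1)
    # Fall back to outputs that consist of nothing but the letter.
    bare = _BARE_RE.fullmatch(text)
    return bare.group(1) if bare else None


def string_match_accuracy(generations: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of generations whose extracted letter equals the gold letter."""
    if not gold or len(generations) != len(gold):
        raise ValueError("generations and gold must be non-empty and equal in length")
    correct = sum(extract_choice(g) == a for g, a in zip(generations, gold))
    return correct / len(gold)
```

Unparseable generations count as wrong here, which is one possible design choice. A real harness would need to decide how to treat them, and how strict the answer pattern should be.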