Sources
fabian: the reason Anthropic is often at capacity is that they have the best universally useful model, specifically for fast back-and-forth codegen, which uses tons of tokens. t/s on glif for Claude also noticeably faster in CET mornings vs afternoons, when US users come online
Bindu Reddy: Anthropic is constantly rate-limited because its API usage has EXPLODED. All the code editors and several agentic systems use Sonnet as their primary model. o1 is a bit too slow even though it's a better model, and GPT-4o has deteriorated and fallen behind Sonnet. The new…
theseriousadult: Anthropic seems like the only big lab which is perpetually running out of inference compute. I don't hear the same complaints about rate limits and capacity crunches about anyone else. do they really have that much less? maybe sonnet is a much bigger model than 4o.


