Anthropic has upgraded its Claude Sonnet 4 large language model to support a 1 million token context window via the Anthropic API, a fivefold increase over its previous 200,000 token limit. The model can now process roughly 750,000 words or 75,000 lines of code in a single request, enough to handle an entire codebase or dozens of research papers at once. The update is currently in public beta on the Anthropic API and Amazon Bedrock, with a broader rollout planned, including Claude Code and the website interface. For requests up to 200,000 tokens, pricing is unchanged at $3 per million input tokens and $15 per million output tokens; beyond that threshold it rises to $6 per million input tokens and $22.50 per million output tokens. Industry observers note that this expansion could have a considerable impact on AI coding capabilities. Anthropic is expected to extend the 1 million token context feature to existing Claude Max users, potentially as a free upgrade. The upgrade positions Claude Sonnet 4 among the models with the longest context windows currently available.
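A quick back-of-envelope sketch of what the tiered pricing means in practice. This is my own hypothetical reading of the tiers, not official billing logic: I'm assuming that once a request's input exceeds 200,000 tokens, the whole request is billed at the higher rate.

```python
def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate USD cost of one Claude Sonnet 4 request under the
    tiered pricing described above (assumption: crossing the 200K
    input threshold moves the entire request to the higher tier)."""
    threshold = 200_000
    if input_tokens <= threshold:
        in_rate, out_rate = 3.00, 15.00   # USD per million tokens
    else:
        in_rate, out_rate = 6.00, 22.50
    return (input_tokens / 1e6) * in_rate + (output_tokens / 1e6) * out_rate

# A 750K-token prompt (roughly the new maximum's word count) with a
# 4K-token reply would cost about $4.59 under this reading:
print(round(estimate_cost(750_000, 4_000), 2))  # → 4.59
```

The jump is noticeable: the same prompt at the old 200K limit would have cost $0.60 in input tokens, while a near-full 1M-token prompt runs into several dollars per request.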
It's crazy to remember that as recently as January I was still struggling to get much useful work out of LLMs. Now I use them more or less constantly throughout the day. The transition was Claude 3.7 (in Feb) and Gemini 2.5 Pro (in March). So recent!
It's about time we build an easier UX for llama.cpp by @ggml_org 🤗 I've used llama.cpp for the better part of the last 2 years for playing with LLMs, and I use it in production too. Whilst it takes a bit to set up llama.cpp, once done, it *just* works! Come along with your ideas/ https://t.co/GOiWv3bnZB