
Anthropic has introduced a new feature called 'Prompt Caching' for its Claude AI models, now available in public beta. The feature lets users cache frequently used context between API requests, significantly reducing both cost and latency: it can cut API input costs by up to 90% and reduce latency by up to 80% for long prompts. Prompt caching enables the reuse of extensive contexts, such as a book-length document, across multiple API calls. The cache has a 5-minute lifetime that is refreshed each time the cached content is used. This is expected to make large language model applications more efficient by removing the need to resend the same context with every request, saving developers substantial resources.
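As a rough illustration of how this works in practice, the sketch below uses Anthropic's Python SDK to cache a long document placed in the system prompt. The model name, file name, and the exact beta header string are assumptions based on the feature's public-beta documentation; the central idea is the cache_control marker of type "ephemeral" on the large content block, which asks the API to cache that block for reuse on subsequent requests.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Hypothetical long, reusable context (e.g. a book-length document).
# Marking it with cache_control lets the API cache it, so only the short,
# changing user question is processed fresh on each call within the
# cache's 5-minute lifetime.
book_text = open("book.txt").read()

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    # Beta header assumed from the public-beta announcement.
    extra_headers={"anthropic-beta": "prompt-caching-2024-07-31"},
    system=[
        {"type": "text", "text": "You answer questions about the attached book."},
        {
            "type": "text",
            "text": book_text,
            "cache_control": {"type": "ephemeral"},  # cache this block
        },
    ],
    messages=[{"role": "user", "content": "Summarize chapter 3 in two sentences."}],
)
print(response.content[0].text)
```

Later calls within the cache lifetime that send the same cached block pay the discounted cache-read rate for those input tokens instead of the full price.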



