In celebration of National Cat Day, the team behind ThunderKittens has announced version 0.1 of their framework for writing AI kernels. The release focuses on speed and simplicity, so that developers can write kernels more easily. Reported improvements include a 10-40% faster attention backward pass, GEMMs that match cuBLAS speed, 8x faster state space models, and 14x faster linear attention. The announcement also cites an average of under 200, though the post is cut off before the unit. The release was led by @bfspector, @simran_s_arora, and @AaryanSinghal4 with contributors from Hazy Research, and it adds new kernels for attention, Mamba-2, FlashFFTConv, and linear attention alongside an arXiv paper.
ThunderKittens goes brrr... TK arXiv is now out, plus new kernels for attention, Mamba-2, FlashFFTConv, linear attention, and more! Led by @bfspector, @simran_s_arora, @AaryanSinghal4 - check it out! Paper: https://t.co/C77zhwrSeZ Blog: https://t.co/0brR6v0O3s (+Kitten tax) https://t.co/dlH9ND6KnQ https://t.co/xIrllQQjkA
Wish writing AI kernels was like writing PyTorch??? Enter ThunderKittens 0.1: for simpler, faster, more adorable AI kernels! We use TK to provide 10-40% faster attention backwards, CuBLAS-speed GEMMs, 8x faster state space models, 14x faster linear attentions – averaging <200… https://t.co/5WMEIfdPrb
Thrilled to be unveiling ThunderKittens 0.1: now with more simplicity, more speed, and more kernels! Had a blast working on this with @bfspector, @simran_s_arora, and folks at @HazyResearch 🚀 https://t.co/6FiUtVBaDT https://t.co/odHdcvAVeM