DeepSeek AI has introduced Native Sparse Attention (NSA), a mechanism designed to improve the efficiency of long-context artificial intelligence models. Hardware-aligned and natively trainable, NSA enables fast training and inference, reportedly achieving up to 11.6 times faster decoding, 9.0 times faster forward passes, and 6.0 times faster backward passes than traditional full attention. NSA is seen as a potential game-changer in the AI landscape, particularly because it is designed to make full use of modern computing hardware. Experts suggest that DeepSeek's advances signal a significant shift in China's innovation capabilities within the global AI sector. The startup's R1 model has reportedly rivaled OpenAI's offerings at a lower cost, briefly surpassing ChatGPT on the App Store, prompting a reassessment among U.S. tech giants. As competition in AI intensifies, DeepSeek's emergence underscores China's growing role in cutting-edge technology development.
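NSA's actual design combines several branches (compressed, selected, and sliding-window attention) in a custom hardware-aligned kernel; the details are in DeepSeek's paper. As a rough, hypothetical illustration of the general block-sparse idea only, not DeepSeek's implementation, the sketch below compresses key blocks to their means, scores blocks against the query, and attends only within the top-scoring blocks:

```python
import numpy as np

def block_sparse_attention(q, k, v, block_size=4, top_k=2):
    """Toy single-query block-sparse attention (illustrative only).

    Keys are grouped into contiguous blocks; each block is summarized
    by its mean vector, the query scores the summaries, and exact
    attention runs only over tokens in the top_k scoring blocks.
    """
    n, d = k.shape
    num_blocks = n // block_size

    # Compressed representation: mean vector per key block.
    k_blocks = k[: num_blocks * block_size].reshape(num_blocks, block_size, d)
    block_means = k_blocks.mean(axis=1)            # (num_blocks, d)

    # Cheap block-level scores select which blocks to attend to.
    block_scores = block_means @ q                  # (num_blocks,)
    selected = np.sort(np.argsort(block_scores)[-top_k:])

    # Gather the token indices belonging to the selected blocks.
    idx = np.concatenate(
        [np.arange(b * block_size, (b + 1) * block_size) for b in selected]
    )

    # Exact scaled-dot-product attention, restricted to selected tokens.
    scores = (k[idx] @ q) / np.sqrt(d)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ v[idx], idx
```

The compute saving is the point: block scoring costs O(num_blocks * d) instead of O(n * d) for full scoring, and the softmax runs over only `top_k * block_size` tokens rather than all `n`.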
Friends tell me exo (the #1 open-source lib for distributed AI) is viral on Chinese apps like RedNote. China is embracing open-source, DeepSeek accelerated that. Closed, zero-sum competitive culture of US AI labs is a risk to US AI supremacy and China's opportunity. https://t.co/ZPPwVG1Suu
Even as global attention turned to DeepSeek, which put China at the forefront of artificial intelligence (AI), China has yet to achieve its own “Sputnik moment” in the space sector. https://t.co/zKj6LbvS9j
H/t @tianle_cai for highlighting @SonglinYang4's insight into what makes NSA *better* than usual sparse attention attempts. DeepSeek keeps building for hardware-software systems that exist, not systems that some Mere Engineers may one day build around your mathcel revelations. https://t.co/EAH9tbHRs2 https://t.co/F0jbpNoK3W