Stability AI, in collaboration with Arm and researchers from UC San Diego, has launched Stable Audio Open Small, a compact text-to-audio generation model with 341 million parameters. This model is designed to run efficiently on Arm CPUs, which power approximately 99% of smartphones, enabling users to generate sound effects directly on their mobile devices without the need for GPUs. The model can produce 11 seconds of audio in under 8 seconds, offering fast and royalty-free sound generation. The release also features a novel adversarial relativistic-contrastive (ARC) post-training technique that enhances the speed, diversity, and efficiency of text-to-audio generation across various devices. Stable Audio Open Small is free to use and supports quick fine-tuning, marking a significant advancement in mobile audio AI capabilities.
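Wow, Stability AI and Arm have just released Stable Audio Open Small. It is an audio model with 341 million parameters that can generate audio on Arm CPUs in under 8 seconds 🤯 Now your phone can generate sound effects on its own 👇 https://t.co/pEYeh0jnkp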
"SpecWav-Attack: Leveraging Spectrogram Resizing and Wav2Vec 2.0 for Attacking Anonymized Speech," Yuqi Li, Yuanzhong Zheng, Zhongtian Guo, Yaoxuan Wang, Jianjun Yin, Haojun Fei, https://t.co/uworiKc9IE
"LAV: Audio-Driven Dynamic Visual Generation with Neural Compression and StyleGAN2," Jongmin Jung, Dasaem Jeong, https://t.co/TMwdRT7l4r