Meta Platforms Inc. has released eight new AI research artifacts for non-commercial use, including Meta Spirit LM, its first open-source multimodal language model capable of freely mixing text and speech inputs and outputs. Unveiled by Meta FAIR, the company's AI research team, as part of its push toward advanced machine intelligence (AMI), Spirit LM can detect and reflect emotional cues such as anger, surprise, and joy through its Expressive variant. Unlike existing AI voice systems that rely on separate speech recognition and text-to-speech components, Meta Spirit LM handles text and speech within a single model, enabling more natural and expressive interactions. Meta also released an updated Meta Segment Anything Model 2.1, with improved handling of visually similar objects, small objects, and occlusions. These releases underscore Meta's commitment to open science and to accelerating innovation in artificial intelligence.
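To make the architectural contrast concrete, the sketch below is a simplified, hypothetical illustration in Python of how text and speech can share a single token sequence for one language model, as opposed to chaining separate recognition and synthesis models. It is not Meta's released code or API; every class, function, and token value here is a placeholder.

```python
# Illustrative sketch only (not Meta's released code or API): text and speech
# sharing one token sequence for a single language model, versus a cascaded
# ASR -> LLM -> TTS pipeline. All names and token values are hypothetical.

from dataclasses import dataclass
from typing import List, Union


@dataclass
class TextToken:
    piece: str   # subword from an ordinary text tokenizer (stubbed below)


@dataclass
class SpeechToken:
    unit: int    # discrete acoustic unit from a speech tokenizer (stubbed below)


Token = Union[TextToken, SpeechToken]


def encode_text(text: str) -> List[Token]:
    # Stand-in for a real subword tokenizer: a whitespace split is enough
    # to show the idea of text tokens.
    return [TextToken(w) for w in text.split()]


def encode_speech(units: List[int]) -> List[Token]:
    # Stand-in for a speech tokenizer that turns audio into discrete units;
    # the unit IDs passed in below are made up.
    return [SpeechToken(u) for u in units]


def build_prompt(segments: List[List[Token]]) -> List[Token]:
    """Interleave text and speech segments into one flat sequence.

    A single decoder-only model trained on such sequences can read either
    modality and continue in either modality, which is the architectural
    difference from wiring separate recognition and synthesis models together.
    """
    sequence: List[Token] = []
    for segment in segments:
        sequence.extend(segment)
    return sequence


if __name__ == "__main__":
    prompt = build_prompt([
        encode_text("Tell me a short story."),
        encode_speech([812, 77, 4051, 9]),  # pretend units from a spoken clip
    ])
    print(prompt)
```

One practical consequence, which the Expressive variant's emotion handling points to, is that prosodic information can stay encoded in the speech tokens rather than being discarded at a transcription step.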
Open science is at the core of accelerating innovation, and it’s great to see @Meta taking steps toward Advanced Machine Intelligence (AMI) with the latest releases at META FAIR. Here’s an overview of the new models and research artifacts: Meta Spirit LM for speech-text… https://t.co/iBvtASL7mt
Meta debuts Spirit LM, its first open-source multimodal language model capable of integrating text and speech inputs and outputs, for non-commercial use only (@carlfranzen / VentureBeat) https://t.co/idixdTbsYQ
Meta introduces Spirit LM, an open-source model that combines text and speech inputs and outputs: Spirit LM Expressive incorporates emotional cues into its speech generation and can detect and reflect anger, surprise, or joy. https://t.co/gCVSZx1qIo #AI #Business