Nomic AI has announced the release of Nomic Embed Text V2, a general-purpose Mixture-of-Experts (MoE) text embedding model. The model achieves state-of-the-art performance for its size on the multilingual MIRACL benchmark and supports over 100 languages. It is fully open source, with training data, weights, and code released under the Apache 2.0 license. At inference time it uses 305 million active parameters and supports a maximum sequence length of 512 tokens. The release has been well received in the AI community, with toolmakers and practitioners highlighting its potential for agentic document workflows, though at least one commentator lamented the relatively short 512-token context window.
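For readers who want to try the model, here is a minimal sketch using the sentence-transformers library. The Hugging Face model identifier and the `trust_remote_code` flag are assumptions based on how Nomic has published previous embedding models, not details stated in the announcement itself.

```python
# Minimal sketch: embedding multilingual text with Nomic Embed Text V2.
# Assumptions (not confirmed by the announcement above): the model is
# published on Hugging Face under the identifier
# "nomic-ai/nomic-embed-text-v2-moe", and its custom MoE architecture
# requires trust_remote_code=True to load.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer(
    "nomic-ai/nomic-embed-text-v2-moe",  # assumed model identifier
    trust_remote_code=True,              # loads the custom MoE architecture
)

# Multilingual input; the model supports over 100 languages.
sentences = [
    "Mixture-of-Experts models activate only a subset of parameters per token.",
    "Los modelos de mezcla de expertos activan solo una parte de los parámetros.",
]

# Inputs longer than the 512-token maximum sequence length are truncated.
embeddings = model.encode(sentences)
print(embeddings.shape)  # (2, embedding_dim)
```

Because the model is an MoE, only a fraction of its total parameters (the 305M "active" set) participate in any single forward pass, which keeps inference cost closer to that of a smaller dense model.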
New release from our open source team @Haystack_AI with powerful upgrades for agentic pipelines! https://t.co/Utir5zVEa8
Hooray, Nomic finally released a new text embedding model, I can't wait to check it out! "Active Parameters During Inference: 305M" "Maximum Sequence Length: 512 tokens" 😭 https://t.co/lXn6uibHwN
A great embedding model is central to high-quality Agentic Document Workflows, so we're delighted to see this latest work from @nomic_ai! https://t.co/pezsylHNpH