
Jina AI has released Jina-ColBERT-v2, a multilingual late-interaction retriever covering 89 languages. Jina reports a 6.5% improvement in retrieval performance over the original ColBERT-v2 from Stanford NLP and a 5.4% improvement over its previous model, jina-colbert-v1-en. The model offers user-controlled output embedding dimensions and supports a token length of 8192, and it is intended for both embedding and reranking tasks.
Jina-ColBERT-v2 is here. https://t.co/CIG6iHYew1 Superior retrieval performance vs the original ColBERT-v2 from @stanfordnlp (+6.5%) & our previous jina-colbert-v1-en(+5.4%). Multilingual support for 89 languages and programming languages. User-controlled output embedding sizes…
@JinaAI_ just released Jina ColBERT v2, a Multilingual Late Interaction Retriever for Embedding and Reranking. The new model supports 89 languages with superior retrieval performance, user-controlled output dimensions, and 8192 token-length. You can start using… https://t.co/zJiXF3GrtQ
spent this summer with my colleague @Robro612 on jina-colbert-v2. at @JinaAI_ If you like ColBERT, and you're working on languages other than English, should give a try (cc @lateinteraction ): 1. It supports 89 major languages, we get some improvement over the previous cool…
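
For readers unfamiliar with the term, "late interaction" in ColBERT-style retrievers means that a query and a document are each encoded into one vector per token, and the relevance score is computed afterwards by matching every query token to its most similar document token (often called MaxSim). The sketch below illustrates only that scoring step on hand-written toy vectors. It does not load or call Jina-ColBERT-v2, and the function names (`normalize`, `maxsim_score`) and the optional `dim` truncation are hypothetical, chosen for illustration. In particular, truncating vectors is just one way to picture smaller output embeddings and is an assumption here, not a description of how the model produces them.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def maxsim_score(query_tokens, doc_tokens, dim=None):
    """Late-interaction (MaxSim) score between two lists of token vectors.

    For each query token, take the highest cosine similarity against any
    document token, then sum those maxima over all query tokens.
    `dim` (hypothetical, for illustration) keeps only the first `dim`
    components of every vector before scoring.
    """
    if dim is not None:
        query_tokens = [v[:dim] for v in query_tokens]
        doc_tokens = [v[:dim] for v in doc_tokens]
    q = [normalize(v) for v in query_tokens]
    d = [normalize(v) for v in doc_tokens]
    return sum(
        max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
        for qv in q
    )


# Toy token embeddings, made up for this example.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0]]  # has a close match for both query tokens
doc_b = [[1.0, 0.0]]              # has a close match for only one query token

scores = {"doc_a": maxsim_score(query, doc_a), "doc_b": maxsim_score(query, doc_b)}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Because the score is a sum over query tokens of per-token maxima, a document that covers more of the query's tokens ranks higher, which is why `doc_a` is placed ahead of `doc_b` in this toy case.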



