Apple has launched AIMv2, a new family of open-set vision encoders aimed at improving multimodal understanding and object recognition. The AIMv2 models draw inspiration from existing frameworks like CLIP and incorporate autoregressive techniques to improve performance. Apple also introduced CoreML models that reportedly match the zero-shot performance of OpenAI's ViT-B/16 while being 4.8 times faster and 2.8 times smaller; the new models ship in an iOS app, allowing users to run them directly on iPhones. Performance tests across quantization levels of models run with Apple MLX showed significant speed variation, with 3-bit models achieving 29.00 tokens per second and 8-bit models reaching 13.68 tokens per second. These advances reflect Apple's ongoing commitment to innovation in artificial intelligence and machine learning.
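For context on how tokens-per-second figures like those above are typically measured, here is a minimal sketch using the `mlx_lm` package to time generation for two quantized variants of a model. The repo names and prompt are placeholders, not the models referenced above; this is a sketch of the measurement approach, not a reproduction of the reported numbers.

```python
# Minimal sketch: comparing generation speed across MLX quantization levels.
# Assumes `mlx_lm` is installed and that quantized variants of a model exist
# on the Hugging Face hub (the repo names below are hypothetical placeholders).
import time
from mlx_lm import load, generate

PROMPT = "Explain what a vision encoder does in one paragraph."

for repo in (
    "mlx-community/SomeModel-3bit",   # hypothetical 3-bit quantization
    "mlx-community/SomeModel-8bit",   # hypothetical 8-bit quantization
):
    model, tokenizer = load(repo)
    start = time.perf_counter()
    text = generate(model, tokenizer, prompt=PROMPT, max_tokens=256)
    elapsed = time.perf_counter() - start
    n_tokens = len(tokenizer.encode(text))
    print(f"{repo}: {n_tokens / elapsed:.2f} tokens/sec")
```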
LLaVA-CoT Shows How to Achieve Structured, Autonomous Reasoning in Vision Language Models https://t.co/PQyW8vpbQg by Sergio De Simone
A preview of MMLU PRO scores in Computer Science for Qwen2.5-72B-Instruct using Apple MLX:
• 4bit: 78.78% (4h 40m) - 323/410 correct answers
• 3bit: 71.46% (3h 38m) - 293/410 correct answers
Testing time is shown in parentheses; yes, you read that right: 8 hours to test two quantizations 👀… https://t.co/YUlmJp9Js6
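For reference, the percentages follow directly from the correct-answer counts over the 410-question Computer Science subset; a quick check:

```python
# Sanity check of the reported accuracies (correct answers / 410 questions).
results = {"4bit": 323, "3bit": 293}
total = 410
for quant, correct in results.items():
    print(f"{quant}: {correct}/{total} = {100 * correct / total:.2f}%")
# 4bit: 323/410 = 78.78%
# 3bit: 293/410 = 71.46%
```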
LLaVA-o1 teaches machines to think step-by-step like humans when analyzing images. It introduces a novel approach to enhancing Vision Language Models (VLMs) by implementing structured, multi-stage reasoning. The paper tackles the challenge of systematic reasoning in visual… https://t.co/jrSutB9Dat
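The paper organizes a response into four reasoning stages (summary, caption, reasoning, conclusion). Below is a minimal sketch of splitting such a staged response into its parts; the tag names are assumptions based on that stage structure, so verify them against the released paper or model card before relying on them.

```python
# Minimal sketch: splitting a staged LLaVA-o1-style response into its parts.
# The tag names below follow the stage structure described in the paper
# (summary, caption, reasoning, conclusion) and are assumptions, not a
# confirmed output format.
import re

STAGES = ("SUMMARY", "CAPTION", "REASONING", "CONCLUSION")

def parse_stages(response: str) -> dict[str, str]:
    """Return a mapping of stage name -> text for each tagged stage found."""
    parsed = {}
    for stage in STAGES:
        match = re.search(rf"<{stage}>(.*?)</{stage}>", response, re.DOTALL)
        if match:
            parsed[stage] = match.group(1).strip()
    return parsed

example = (
    "<SUMMARY>Identify the object being asked about.</SUMMARY>"
    "<CAPTION>The image shows a red traffic light at an intersection.</CAPTION>"
    "<REASONING>A red light means vehicles must stop.</REASONING>"
    "<CONCLUSION>The cars should stop.</CONCLUSION>"
)
print(parse_stages(example)["CONCLUSION"])  # -> "The cars should stop."
```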