
Meta, Microsoft, and other companies have introduced new AI models like Llama 3 and Phi-3, offering improved performance and efficiency. Llama 3 is being fine-tuned for various applications, with faster inference speeds and support for longer context lengths. Groq Inc. highlights Llama 3's speed on its inference hardware, where it competes with other advanced models on the market. Developers are exploring the potential of these models for diverse use cases, including data analysis, code interpretation, and long-context tasks.
Quantized matmuls in the latest MLX are up to 40% faster thanks to @angeloskath and @DiganiJagrit.
QLoRA fine-tuning Llama 3 70B on a single M2 Ultra, stats:
- Batch size 4 with 16 LoRA layers
- 95 toks/sec
- Peak mem 41 GB
- Avg power 120 W
https://t.co/dLuMAGrpzd
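For readers who want to try a run like the one above, here is a minimal sketch using the mlx-lm package (`pip install mlx-lm`). The 4-bit model repo, data directory, and adapter path are assumptions, and flag and parameter names have shifted slightly between mlx-lm versions; treat this as an outline, not the exact command behind the quoted numbers.

```python
# Sketch: QLoRA-style fine-tuning with mlx-lm, mirroring the tweet's settings.
import subprocess

subprocess.run([
    "python", "-m", "mlx_lm.lora",
    "--model", "mlx-community/Meta-Llama-3-70B-Instruct-4bit",  # assumed 4-bit repo
    "--train",
    "--data", "data/",       # assumed directory holding train.jsonl / valid.jsonl
    "--batch-size", "4",     # matches the tweet
    "--lora-layers", "16",   # matches the tweet
], check=True)

# After training, load the base model plus the trained adapters and generate.
from mlx_lm import load, generate

model, tokenizer = load(
    "mlx-community/Meta-Llama-3-70B-Instruct-4bit",
    adapter_path="adapters",  # default output location; older versions use adapter_file
)
print(generate(model, tokenizer, prompt="Hello", max_tokens=64))
```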
We just released the first Llama-3 8B with a context length of over 160K onto Hugging Face! SOTA LLMs can learn to operate on long context with minimal training (< 200M tokens, powered by @CrusoeEnergy's compute) by appropriately adjusting RoPE theta. 🔗 https://t.co/Y9kErG8AK0 https://t.co/S0GRepQeAp
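The "adjusting RoPE theta" trick is easy to see numerically. A minimal sketch, assuming standard rotary position embeddings as used in Llama-style models: Llama 3 ships with rope_theta = 500000, and raising theta lowers every rotary frequency, stretching the wavelengths so distant positions still land in phase ranges the model has seen. The larger theta below is illustrative, not the value used in the released checkpoint.

```python
import numpy as np

def rope_inv_freq(head_dim: int, theta: float) -> np.ndarray:
    # One inverse frequency per pair of channels in an attention head.
    return 1.0 / (theta ** (np.arange(0, head_dim, 2) / head_dim))

head_dim = 128                                    # Llama-3 8B: 4096 hidden / 32 heads
base = rope_inv_freq(head_dim, 500_000.0)         # stock Llama 3 rope_theta
stretched = rope_inv_freq(head_dim, 8_000_000.0)  # illustrative long-context theta

# Larger theta -> lower frequencies -> longer wavelengths, so token positions
# far beyond the original training window map to familiar rotation angles.
# Compare the longest wavelength (in tokens) under each setting:
print(2 * np.pi / base[-1], 2 * np.pi / stretched[-1])
```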
Oh my god! It's raining Llama-3 today! Building on the amazing work of @winglian ❤️, I created the "Llama-3-8B-Instruct-64k" model! It's already being tested, quantized, and uploaded to @huggingface 🚀 Who wants it?! https://t.co/0P0PBYr1bK https://t.co/pWW8WRAXkr
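Trying one of these extended-context uploads is a few lines with transformers. A minimal sketch, assuming a hypothetical repo id (substitute the actual upload) and that the accelerate package is installed for device_map="auto":

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "MaziyarPanahi/Llama-3-8B-Instruct-64k"  # hypothetical repo id for illustration

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype="auto", device_map="auto")

# Long-context models shine when the prompt itself is long; a full document
# would go where the ellipsis is.
prompt = "Summarize the key points of the document below.\n..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```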