Sources
AK: Extending Llama-3's Context Ten-Fold Overnight. We extend the context length of Llama-3-8B-Instruct from 8K to 80K via QLoRA fine-tuning. The entire training cycle is super efficient, which takes 8 hours on one 8xA800 (80G) GPU machine. The resulted model exhibits superior https://t.co/SZtBAm1F6b
Kirk Borne: 💯🌟🚀 @abacusai announces 128K long-context support for Llama-3-70B, putting it on par with #GPT4 regarding context length support and real-world usage. See post below from @bindureddy -and- check out the model here: https://t.co/mGxMv8uPz6 ⬅️❣️ https://t.co/wtlsVTKvhP
elvis: Nice to have 128K long-context support for Llama 3 70B already. It will be interesting to see how far these open models can get when extended with a bigger context. This seems to be an early release but curious to hear if anyone is building on top of these long-context open… https://t.co/Iz8OW7dyEv