Sources
Darek Kłeczek: Very cool technical report on Qwen2!
> Focus on multilingual capabilities in 30 languages including pretraining data, tokenizer (151k vocab) and evaluations
> Dual Chunk Attention with YARN for long context
> Ablations showing no significant gain after 7 trillion pretraining…
Bill Yuchen Lin 🤖: Qwen 2’s tech report is out! However, their high-quality data remains private. Want to use Qwen 2’s data for post-training research and build other cool projects? 📢 Good news! Using our 🐦⬛ Magpie method, we have extracted a large collection of instruction data from Qwen 2 and…
Inferless: Elevate your text generation with Qwen-2 72B and deploy on our Inferless serverless platform 🚀
⚡ Experience superb efficiency:
🔹 17.83 tokens/sec average generation speed
🔹 24.79 sec latency for 512 tokens
🔹 35.59 seconds average cold start time
🔗 Link: https://t.co/YG1x4yYBtm https://t.co/4Mb2uJKTsK













