For more Triton than @lantiga and I fit into the ⛈️ Thunder Sessions ⛈️ yesterday, check out @hsu_byron's talk on CUDA-MODE today. For a primer or recap on how Triton kernels work, check out our recording from yesterday: https://t.co/rrUP2TCrRo https://t.co/T3NyetIViX
CUDA-MODE 28: Liger-Kernel. Today @hsu_byron will present LinkedIn's open-source collection of Triton kernels for efficient LLM training. Aug 31, 7 PM UTC. Event: https://t.co/BJXLsVce0X https://t.co/1e7bGSghYC
Final @thursdai_pod of the summer, we had tons of amazing guests + over 2.2K audience (pod links in 1st comment) ⚡ Fastest LLM inference in the world w/ @draecomino @bytegorilla from @CerebrasSystems 👀 Qwen-2 VL SOTA VLM w/ @JustinLin610 from @Alibaba_Qwen 📖… https://t.co/kszhkDV6ot




The launch of TRL v0.10.1 brings significant enhancements to large language model (LLM) training. Key features include an online DPO trainer, implementing Google DeepMind's method for improving LLM alignment, and the integration of LinkedIn's Liger kernel, a collection of Triton kernels designed specifically for LLM training that reportedly increases training speed by 20% and reduces memory usage by 60%. Upcoming events: vLLM Office Hours on September 5, where Tyler Smith from Neural Magic will discuss high-performance inference using NVIDIA CUTLASS; the Thunder Sessions livestream on August 30, which will showcase how Thunder, a source-to-source compiler, can enhance AI model performance by 40%; and a presentation on LinkedIn's Liger kernel on August 31 at 7 PM UTC.