It's All About Your Sketch: Democratising Sketch Control in Diffusion Models TLDR: This paper brings sketch control to diffusion models so that simple, amateur sketches are enough to generate realistic images, letting anyone produce accurate pictures just by drawing. ✨ Interactive paper: https://t.co/3L0TwKsyZT
Generating Illustrated Instructions TLDR: A new model called StackedDiffusion generates customized visual instructions from text input, outperforming baseline models and, in 30% of cases, even being preferred over human-generated articles. ✨ Interactive paper: https://t.co/KqQJKwCOaj
Beyond Text: Frozen Large Language Models in Visual Signal Comprehension TLDR: Researchers developed a V2L Tokenizer that maps images into the token vocabulary of a frozen large language model, letting the LLM comprehend visual signals much as it would a foreign language. ✨ Interactive paper: https://t.co/soN6ShxDTz
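The underlying idea can be illustrated with a toy example: encode the image into patch features, project them into the frozen LLM's embedding space, and snap each patch to its nearest vocabulary token. The sketch below is a heavily simplified assumption (random stand-in embeddings, plain nearest-neighbour lookup), not the paper's actual V2L Tokenizer.

```python
# Toy illustration of vision-to-language tokenization: map each image patch
# feature to the nearest embedding in a frozen LLM's vocabulary, so the image
# becomes a sequence of ordinary token ids the LLM can "read".
# Conceptual sketch only; not the V2L Tokenizer's real architecture.

import numpy as np

rng = np.random.default_rng(0)
vocab_size, embed_dim, num_patches = 1000, 64, 16

# Stand-ins for a frozen LLM's token-embedding table and encoded image patches.
llm_token_embeddings = rng.normal(size=(vocab_size, embed_dim))
patch_features = rng.normal(size=(num_patches, embed_dim))


def visual_to_language_tokens(patches: np.ndarray, token_table: np.ndarray) -> np.ndarray:
    """Return the id of the nearest LLM vocabulary embedding for each patch."""
    # Squared Euclidean distance between every patch and every token embedding.
    dists = ((patches[:, None, :] - token_table[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)  # one token id per image patch


token_ids = visual_to_language_tokens(patch_features, llm_token_embeddings)
print(token_ids)  # these ids could now be fed to the frozen LLM as normal tokens
```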

Researchers continue to make rapid progress in AI and machine learning, particularly in image and 3D object generation. Notable developments include Kolors, a model trained by Kwai on the SDXL architecture with the GLM-4 LLM as its text encoder, which shows promising results in photorealistic text-to-image synthesis. Another key innovation is the Self-correcting LLM-controlled Diffusion (SLD) framework, which improves the accuracy of AI-generated images by having an LLM detect and automatically correct mistakes against the prompt. In 3D, the XCube model quickly generates detailed objects and scenes using sparse voxel hierarchies, and 3DiffTection leverages pre-trained diffusion models to improve 3D object detection from single images. Together, these advances highlight how quickly AI's ability to understand and generate visual and geometric data is maturing.
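To make the self-correction idea concrete, here is a minimal Python sketch of the generate-critique-edit cycle described above. It is a conceptual illustration only: generate_image, llm_propose_edits, and apply_edit are hypothetical stubs, not the SLD authors' actual API, and a real system would back them with a diffusion pipeline, an object detector, and an LLM.

```python
# Minimal sketch of a self-correcting generation loop in the spirit of SLD:
# generate, let an LLM critique the result against the prompt, apply the
# suggested edits, and repeat. All helpers below are hypothetical stubs.

from dataclasses import dataclass
from typing import List


@dataclass
class Edit:
    """One correction proposed by the critic (e.g. add / remove / move an object)."""
    operation: str
    target: str


def generate_image(prompt: str) -> str:
    # Stub: a real implementation would call a text-to-image diffusion pipeline.
    return f"image<{prompt}>"


def llm_propose_edits(prompt: str, image: str, round_idx: int) -> List[Edit]:
    # Stub: a real critic would detect objects in the image, ask an LLM to
    # compare them with the prompt, and return concrete edit operations.
    return [] if round_idx > 0 else [Edit("add", "missing object")]


def apply_edit(image: str, edit: Edit) -> str:
    # Stub: a real implementation would perform a localized latent edit / inpainting.
    return image + f"+{edit.operation}:{edit.target}"


def self_correcting_generation(prompt: str, max_rounds: int = 3) -> str:
    image = generate_image(prompt)
    for round_idx in range(max_rounds):
        edits = llm_propose_edits(prompt, image, round_idx)
        if not edits:  # the critic is satisfied, so stop early
            break
        for edit in edits:
            image = apply_edit(image, edit)
    return image


if __name__ == "__main__":
    print(self_correcting_generation("a cat riding a bicycle"))
```

The key design point is that the loop stops as soon as the critic returns no edits, so generations that already match the prompt pay no extra cost.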