Recent studies have introduced several innovative models for molecular generation and optimization. A hierarchical molecular graph encoder embedded in a multimodal large language model (MLLM) integrates multi-level graph features to enrich molecular representations. MolGen-Transformer, an open-source self-supervised model, achieves 100% reconstruction accuracy by operating on SELFIES (SELF-referencing Embedded Strings), a string representation that guarantees the chemical validity of generated molecules by construction. GP-MOLFORMER addresses molecular optimization with a transformer-based generative chemical language model (CLM). GGFlow improves molecular graph generation by combining flow matching with optimal transport, stabilizing training and improving sampling efficiency. GAFL (Geometric Algebra Flow Matching) generates highly designable protein backbones using projective geometric algebra. Finally, ongoing work evaluates how well large language models align with protein-specific geometric deep models, aiming to improve protein representations.
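To make the SELFIES validity guarantee concrete, here is a minimal sketch using the open-source `selfies` Python package; the digest does not say which tooling MolGen-Transformer uses, so this is illustrative rather than the paper's implementation:

```python
# Minimal sketch of the SELFIES validity guarantee, using the open-source
# `selfies` package (pip install selfies). Illustrative only; this is not
# the MolGen-Transformer code, which the digest does not show.
import selfies as sf

# Lossless round trip: SMILES -> SELFIES -> SMILES. The decoded string
# represents the same molecule (possibly in an equivalent SMILES form),
# which is the property that enables 100% reconstruction accuracy.
smiles = "C1=CC=CC=C1"         # benzene
encoded = sf.encoder(smiles)   # e.g. "[C][=C][C][=C][C][=C][Ring1][=Branch1]"
decoded = sf.decoder(encoded)
print(encoded, "->", decoded)

# The key validity property: *any* string of SELFIES tokens decodes to a
# syntactically valid molecule, so a generative model that emits SELFIES
# tokens cannot produce an invalid structure.
arbitrary = "[C][=C][O][Ring1][N]"    # a hand-scrambled token sequence
print(sf.decoder(arbitrary))          # still decodes to a valid molecule
```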
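As background for the flow-matching entries (GGFlow and GAFL), the sketch below shows the standard conditional flow-matching objective with a straight-line probability path. It is generic textbook flow matching, not GGFlow's graph-specific or GAFL's geometric-algebra-specific formulation, neither of which the digest details:

```python
# Generic conditional flow matching with a linear interpolation path,
# sketched in NumPy. GGFlow additionally couples prior and data samples
# via optimal transport to straighten paths; independent pairing is used
# here for simplicity.
import numpy as np

rng = np.random.default_rng(0)

def cfm_loss(model, x0, x1):
    """One stochastic estimate of the conditional flow-matching loss.

    x0: prior/noise samples, shape (batch, dim)
    x1: data samples,        shape (batch, dim)
    model(x_t, t): predicted velocity field, shape (batch, dim)
    """
    t = rng.uniform(size=(x0.shape[0], 1))  # random times in [0, 1]
    x_t = (1.0 - t) * x0 + t * x1           # point on the straight-line path
    target = x1 - x0                        # constant velocity of that path
    return np.mean((model(x_t, t) - target) ** 2)

# Toy usage with a hypothetical zero-velocity "model" standing in for a
# trained network:
x0 = rng.normal(size=(8, 4))                # prior samples
x1 = rng.normal(loc=3.0, size=(8, 4))       # "data" samples
print(cfm_loss(lambda x, t: np.zeros_like(x), x0, x1))
```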
Distilling structural representations into protein sequence models https://t.co/CBxpR90kLb #biorxiv_bioinfo
Leveraging High-throughput Molecular Simulations and Machine Learning for Formulation Design #machinelearning #compchem https://t.co/Ruu1yAUGtB
Generating Highly Designable Proteins with Geometric Algebra Flow Matching. https://t.co/pQvSoSm0HX