Equivariant Autoregressive Models for Molecular Generation
Author(s)
Kim, Song Eun
Advisor
Smidt, Tess E.
Abstract
In-silico generation of diverse molecular structures has emerged as a promising method to navigate the complex chemical landscape, with direct applications to inverse material design and drug discovery. However, 3D molecular structure generation comes with several unique challenges: generated structures must be invariant under rotations and translations in 3D space, and must satisfy basic chemical bonding rules. Recently, E(3)-equivariant neural networks that utilize higher-order rotationally-equivariant features have shown improved performance on a wide range of atomistic tasks, including structure generation. We previously developed Symphony, an E(3)-equivariant autoregressive generative model for 3D structures of small molecules. At each sampling iteration, a single focus atom is selected, and the next atom’s position is then chosen within the focus atom’s neighborhood. Symphony built on previous autoregressive models by using message-passing with higher-order equivariant features, allowing a novel representation of probability distributions via spherical harmonic signals. Symphony’s performance approached that of state-of-the-art diffusion models while remaining relatively lightweight. However, it continued to face challenges in error accumulation and determining bond lengths, and it was only evaluated on small organic molecules. Here, we expand on Symphony’s capabilities and make it more compatible with larger atomic structures. We improve the embedders, split the radial and angular components when predicting atom positions, and increase the radial cutoff for atomic neighborhoods considered during prediction. We also increase Symphony’s training and inference speeds through a new implementation in PyTorch, making inference nearly 4x faster than before. In addition, we demonstrate its effectiveness across a variety of tasks, including small molecule and protein backbone generation.
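The sampling loop described in the abstract (pick a focus atom, then place the next atom using separate radial and angular predictions) can be illustrated with a minimal sketch. This is not Symphony's code: every function name, the toy scoring rules, the 2.5 Å cutoff, and the radial grid are assumptions made up for illustration, and the learned network heads are replaced by hand-written placeholders. It uses only the Python standard library, ignores atom species, and limits the angular signal to degree l ≤ 1, where the real l = 1 spherical harmonics are proportional to the components of the unit direction vector.

```python
import math
import random


def fibonacci_sphere(n):
    """Return n roughly uniform unit vectors, used as a sampling grid on the sphere."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        rho = math.sqrt(1.0 - z * z)
        points.append((rho * math.cos(golden_angle * i),
                       rho * math.sin(golden_angle * i), z))
    return points


def sample_categorical(logits, rng):
    """Sample an index with probability softmax(logits)."""
    top = max(logits)
    weights = [math.exp(value - top) for value in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def toy_focus_logits(positions, cutoff):
    """Hypothetical stand-in for a learned focus head: favour atoms with few neighbours."""
    logits = []
    for i, p in enumerate(positions):
        count = sum(1 for j, q in enumerate(positions)
                    if j != i and math.dist(p, q) < cutoff)
        logits.append(-float(count))
    return logits


def toy_radial_logits(radii):
    """Hypothetical stand-in for a radial head: peaked near 1.5 Angstrom."""
    return [-((r - 1.5) ** 2) / (2.0 * 0.1 ** 2) for r in radii]


def toy_angular_coeffs(focus_idx, positions, cutoff):
    """Hypothetical l = 1 coefficients: point away from neighbours of the focus atom."""
    f = positions[focus_idx]
    coeffs = [0.0, 0.0, 0.0]
    for j, q in enumerate(positions):
        d = math.dist(f, q)
        if j != focus_idx and 0.0 < d < cutoff:
            for k in range(3):
                coeffs[k] += 4.0 * (f[k] - q[k]) / d
    return coeffs


def generate(num_atoms, seed=0, cutoff=2.5):
    """Grow a toy structure one atom at a time: focus, then radius, then direction."""
    rng = random.Random(seed)
    grid = fibonacci_sphere(256)
    radii = [0.9 + 0.05 * i for i in range(25)]  # 0.9 to 2.1 Angstrom
    positions = [(0.0, 0.0, 0.0)]
    while len(positions) < num_atoms:
        focus_idx = sample_categorical(toy_focus_logits(positions, cutoff), rng)
        r = radii[sample_categorical(toy_radial_logits(radii), rng)]
        c = toy_angular_coeffs(focus_idx, positions, cutoff)
        # Log-density on the sphere is the l <= 1 signal c . u evaluated on the grid.
        angular_logits = [sum(ck * uk for ck, uk in zip(c, u)) for u in grid]
        u = grid[sample_categorical(angular_logits, rng)]
        f = positions[focus_idx]
        positions.append(tuple(fk + r * uk for fk, uk in zip(f, u)))
    return positions


atoms = generate(5, seed=0)
```

Sampling the radius and the direction from two separate distributions mirrors, in spirit only, the split between radial and angular components mentioned in the abstract; a learned model would condition both on equivariant features of the focus atom's neighborhood rather than on fixed rules.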
Date issued
2025-05
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology