Inspired by the effectiveness of genetic algorithms and the importance of synthesizability in molecular design, we present SynGA, a simple genetic algorithm that operates directly over synthesis routes. Our method features custom crossover and mutation operators that explicitly constrain it to synthesizable molecular space. By modifying the fitness function, we demonstrate the effectiveness of SynGA on a variety of design tasks, including synthesizable analog search and sample-efficient property optimization, for both 2D and 3D objectives. Furthermore, by coupling SynGA with a machine learning-based filter that focuses the building block set, we boost SynGA to state-of-the-art performance. For property optimization, this manifests as a model-based variant SynGBO, which employs SynGA and block filtering in the inner loop of Bayesian optimization. Since SynGA is lightweight and enforces synthesizability by construction, our hope is that SynGA can serve not only as a strong standalone baseline but also as a versatile module that can be incorporated into larger synthesis-aware workflows in the future.
Co-authors: Alston Lo, Connor W. Coley, Wojciech Matusik
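The idea of a genetic algorithm whose operators keep every candidate inside an allowed space can be illustrated with a toy sketch. This is not the SynGA implementation: routes are flattened here to fixed-length tuples of building-block IDs, reaction templates are ignored, and the names `evolve`, `blocks`, `route_len`, and the 0.3 mutation rate are all assumptions made for illustration. The only property it shares with the description above is that crossover and mutation draw exclusively from the given building-block set, so offspring stay in that set by construction.

```python
import random

def evolve(blocks, fitness, route_len=3, pop_size=20, generations=30, seed=0):
    """Toy GA over tuples of building-block IDs (hypothetical stand-in for routes)."""
    rng = random.Random(seed)
    pop = [tuple(rng.choice(blocks) for _ in range(route_len)) for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the fitter half of the population as parents (elitist selection).
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, route_len)
            child = list(a[:cut] + b[cut:])  # one-point crossover of two parents
            if rng.random() < 0.3:
                # Mutation replaces one position with a block from the allowed set only.
                child[rng.randrange(route_len)] = rng.choice(blocks)
            children.append(tuple(child))
        pop = parents + children
    return max(pop, key=fitness)

if __name__ == "__main__":
    # Toy fitness: prefer routes built from high-numbered blocks.
    print(evolve(list(range(10)), fitness=sum))
```

Swapping the `fitness` callable is the only change needed to target a different objective, which mirrors the abstract's point that tasks are selected by modifying the fitness function.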
Diffusion large language models (dLLMs) are emerging as an efficient alternative to autoregressive models due to their ability to decode multiple tokens in parallel. However, aligning dLLMs with human preferences or task-specific rewards via reinforcement learning (RL) is challenging because their intractable log-likelihood precludes the direct application of standard policy gradient methods. While prior work uses surrogates like the evidence lower bound (ELBO), these one-sided approximations can introduce significant policy gradient bias. To address this, we propose the Sandwiched Policy Gradient (SPG) that leverages both an upper and a lower bound of the true log-likelihood. Experiments show that SPG significantly outperforms baselines based on ELBO or one-step estimation. Specifically, SPG improves the accuracy over state-of-the-art RL methods for dLLMs by 3.6% on GSM8K, 2.6% on MATH500, 18.4% on Countdown, and 27.0% on Sudoku.
Co-authors: Chenyu Wang, Paria Rashidinejad, DiJia Su, Song Jiang, Sid Wang, Siyan Zhao, Cai Zhou, Shannon Zejiang Shen, Feiyu Chen, Tommi Jaakkola, Yuandong Tian, Bo Liu
The performance of flow matching and diffusion models can be greatly improved at inference time using reward adaptation algorithms, yet efficiency remains a major limitation. While several algorithms have been proposed, we demonstrate that a common bottleneck is the sampling method these algorithms rely on: many algorithms must sample Markov transitions via SDE sampling, which is significantly less efficient and often less performant than ODE sampling. To remove this bottleneck, we introduce GLASS Flows, a new sampling paradigm that simulates a "flow matching model within a flow matching model" to sample Markov transitions. As we show in this work, this "inner" flow matching model can be retrieved from any pre-trained model without any re-training, effectively combining the efficiency of ODEs with the stochastic evolution of SDEs. On large-scale text-to-image models, we show that GLASS Flows eliminate the trade-off between stochastic evolution and efficiency. GLASS Flows improve state-of-the-art performance in text-to-image generation, making them a simple, drop-in solution for inference-time scaling of flow and diffusion models.
Co-authors: Peter Holderrieth, Uriel Singer, Tommi Jaakkola, Ricky T. Q. Chen, Yaron Lipman, Brian Karrer
Abstract
Motivation
Protein structure generative models have seen a recent surge of interest, but meaningfully evaluating them computationally is an active area of research. While current metrics have driven useful progress, they do not capture how well models sample the design space represented by the training data. We argue for a protein Fréchet Inception Distance (FID) metric to supplement current evaluations with a measure of distributional similarity in a semantically meaningful latent space.
Results
Our FID behaves desirably under protein structure perturbations and correctly recapitulates similarities between protein samples: it correlates with optimal transport distances and recovers FoldSeek clusters and the CATH hierarchy. Evaluating current protein structure generative models with FID shows that they fall short of modeling the distribution of PDB proteins.
Availability
Code is available at: https://github.com/ffaltings/protfid
Co-authors: Felix Faltings, Hannes Stark, Tommi Jaakkola, Regina Barzilay
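For readers unfamiliar with FID-style metrics, the sketch below shows the standard Fréchet distance between two Gaussians fitted to sets of latent features, d^2 = ||mu_a - mu_b||^2 + Tr(S_a + S_b - 2 (S_a S_b)^(1/2)). It is a generic illustration, not the code from the repository above: the function name `frechet_distance` and the random stand-in features are assumptions, and the protein encoder that would produce real features is not shown.

```python
import numpy as np

def frechet_distance(feats_a, feats_b):
    """Fréchet distance (squared form, as in FID) between Gaussians fitted to two feature sets.

    feats_a, feats_b: arrays of shape (n_samples, n_features).
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr((cov_a cov_b)^{1/2}) equals the sum of square roots of the eigenvalues
    # of cov_a @ cov_b, which are real and non-negative for PSD covariances.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = rng.normal(size=(200, 4))   # stand-in for embeddings of reference structures
    generated = reference + 0.5             # stand-in for embeddings of generated structures
    print(frechet_distance(reference, generated))
```

Because the distance compares only the mean and covariance of the embedded samples, its usefulness depends on the latent space, which is why the abstract stresses a semantically meaningful one.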
Gastrulation, a critical developmental stage involving germ layer specification and axes formation, is a major point of failure in human development, contributing to pregnancy loss and congenital malformations. However, due to ethical constraints and anatomical differences in animal models, the failure modes underlying human gastrulation remain poorly understood. To elucidate these failure modes, we introduce FATE-MAP (Failure Analysis and Trajectory Evaluation via Mechanistic-AI Prediction), an integrated platform that combines high-throughput perturbations of human 2D gastruloids with quantitative phenotypic mapping, predictive deep learning, and mechanistic morphogen modeling. Analyzing over 2000 drug-treated human 2D gastruloids, we mapped a phenotypic morphospace that separates canonical patterning, in which primitive-streak fates are correctly specified and radially organized, from failure modes, defined as departures from this organization and marked by a loss of a required fate and/or radial symmetry. To predict and interpret patterning outcomes, FATE-MAP combines a transformer linking chemical structure to phenotype with PDE simulations of morphogen transport and cell fate specification, and projects both outputs onto the experimentally defined morphospace. Applying this framework, we flagged two clinical molecules as potential teratogens and identified two parameters, cell density and SOX2 stability, that form orthogonal morphospace axes along which canonically patterned gastruloids systematically vary. FATE-MAP thus provides a roadmap for decoding human developmental trajectories and accelerating safe therapeutic discovery.
Co-authors: Joseph Rufo, Chongxu Qiu, Dasol Han, Naomi Baxter, Gabrielle Daley, Jasmine Dhillon, Felix Wong, James J. Collins & Maxwell Z. Wilson
Diffusion models can be improved with additional guidance towards more effective representations of input. Indeed, prior empirical work has already shown that aligning internal representations of the diffusion model with those of pre-trained models improves generation quality. In this paper, we present a systematic framework for incorporating representation guidance into diffusion models. We provide alternative decompositions of denoising models along with their associated training criteria, where the decompositions determine when and how the auxiliary representations are incorporated. Guided by our theoretical insights, we introduce two new strategies for enhancing representation alignment in diffusion models. First, we pair examples with target representations either derived from themselves or arising from different synthetic modalities, and subsequently learn a joint model over the multimodal pairs. Second, we design an optimal training curriculum that balances representation learning and data generation. Our experiments across image, protein sequence, and molecule generation tasks demonstrate superior performance as well as accelerated training. In particular, on the class-conditional ImageNet 256 x 256 benchmark, our guidance results in 23.3 times faster training than the original SiT-XL as well as a fourfold speedup over the state-of-the-art method REPA.
Co-authors: Chenyu Wang, Cai Zhou, Sharut Gupta, Johnson Lin, Stefanie Jegelka, Stephen Bates, Tommi Jaakkola
In this paper we introduce Hierarchical Diffusion Language Models (HDLM) -- a novel family of discrete diffusion models for language modeling. HDLM builds on a hierarchical vocabulary where low-level tokens with detailed semantics are surjectively mapped to high-level tokens with coarse-grained meanings. In the forward process, each token is independently perturbed to its higher-level ancestor with more abstract semantics according to the scheduler, while in the reverse process the model progressively predicts the next, more detailed semantics. Taken together, HDLM provides a general time-varying next semantic scale prediction process for language modeling. We derive closed-form expressions for the diffusion Evidence Lower Bound (ELBO), and show that HDLM can be implemented in a flexible manner while including the existing MDLM as a special case. We also propose practical training techniques based on these insights. Extensive text generation experiments validate the effectiveness of HDLM, which demonstrates consistently lower validation and generative perplexity than baselines.
Co-authors: Cai Zhou, Chenyu Wang, Dinghuai Zhang, Shangyuan Tong, Yifei Wang, Stephen Bates, Tommi Jaakkola
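The forward process described above, in which each token is independently lifted to a coarser ancestor, can be sketched for a two-level hierarchy. Everything concrete here is an assumption for illustration: the `PARENT` vocabulary, the function name `forward_perturb`, and the linear schedule in which a token is lifted with probability t. The paper's actual scheduler and hierarchy are not specified in this abstract.

```python
import random

# Hypothetical two-level vocabulary: every fine token maps to one coarse ancestor,
# and every coarse token has at least one fine token beneath it (a surjective map).
PARENT = {"cat": "<ANIMAL>", "dog": "<ANIMAL>", "red": "<COLOR>", "blue": "<COLOR>"}

def forward_perturb(tokens, t, rng):
    """Independently lift each token to its ancestor with probability t.

    t in [0, 1] plays the role of diffusion time under an assumed linear schedule:
    t = 0 leaves the sequence untouched, t = 1 lifts every token.
    """
    return [PARENT[tok] if rng.random() < t else tok for tok in tokens]

if __name__ == "__main__":
    rng = random.Random(0)
    sentence = ["red", "cat", "blue", "dog"]
    for t in (0.0, 0.5, 1.0):
        print(t, forward_perturb(sentence, t, rng))
```

If every fine token shared a single ancestor, this perturbation would reduce to replacing tokens with one mask symbol, which is the sense in which a masked diffusion model appears as a special case.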
Progress and potential
Blending polymers is a cost-effective strategy to develop functional materials using existing components, yet the design space is vast, and traditional trial-and-error approaches are inefficient. In this work, we introduce an autonomous, data-driven workflow integrated with a robotic platform for discovering functional random heteropolymer blends. This system successfully identified blends that outperform their individual components in protein stabilization. While previous efforts have focused primarily on the monomer composition of random heteropolymers, our results highlight the potential to make discoveries from complex polymer blend systems. This methodology could be generalized to other material discovery campaigns, from optimizing electrolytes for batteries to improving drug excipient combinations. The dataset released with this study also provides a valuable resource for advancing polymer informatics in blend design.
Highlights
• A data-driven robotic platform was developed to discover functional polymer blends
• The platform enabled efficient optimization from high-dimensional blending spaces
• Blends of random heteropolymers can outperform individual components in function
• Segment-level features correlated with improved protein stabilization
Contributors: Guangqi Wu, Tianyi Jin, Alfredo Alexander-Katz, Connor Coley
Designing new enzymes typically begins with idealized arrangements of catalytic functional groups around a reaction transition state, then attempts to generate protein structures that precisely position these groups. Current AI-based methods can create active enzymes but require predefined residue positions and rely on reverse-building residue backbones from side-chain placements, which limits design flexibility. Here we show that a new deep generative model, RoseTTAFold diffusion 2 (RFdiffusion2), overcomes these constraints by designing enzymes directly from functional group geometries without specifying residue order or performing inverse rotamer generation. RFdiffusion2 successfully generates scaffolds for all 41 active sites in a diverse benchmark, compared to 16 using previous methods. We further design enzymes for three distinct catalytic mechanisms and identify active candidates after experimentally testing fewer than 96 sequences in each case. These results highlight the potential of atomic-level generative modeling to create de novo enzymes directly from reaction mechanisms.
Contributors: Woody Ahern, Jason Yim, Doug Tischer, Saman Salike, Seth M. Woodbury, Donghyo Kim, Indrek Kalvet, Yakov Kipnis, Brian Coventry, Han Raut Altae-Tran, Magnus S. Bauer, Regina Barzilay, Tommi S. Jaakkola, Rohith Krishna, David Baker
We introduce BoltzGen, an all-atom generative model for designing proteins and peptides across all modalities to bind a wide range of biomolecular targets. BoltzGen builds strong structural reasoning capabilities about target-binder interactions into its generative design process. This is achieved by unifying design and structure prediction, resulting in a single model that also reaches state-of-the-art folding performance. BoltzGen’s generation process can be controlled with a flexible design specification language over covalent bonds, structure constraints, binding sites, and more. We experimentally validate these capabilities in a total of eight diverse wetlab design campaigns with functional and affinity readouts across 26 targets. The experiments span binder modalities from nanobodies to disulfide-bonded peptides and include targets ranging from disordered proteins to small molecules. For instance, we test 15 nanobody and protein binder designs against each of nine novel targets with low similarity to any protein with a known bound structure. For both binder modalities, this yields nanomolar binders for 66% of targets. We release model weights, data, and both inference and training code at: https://github.com/HannesStark/boltzgen.
Co-authors: Hannes Stark, Felix Faltings, MinGyu Choi, Yuxin Xie, Eunsu Hur, Timothy O’Donnell, Anton Bushuiev, Talip Uçar, Saro Passaro, Weian Mao, Mateo Reveiz, Roman Bushuiev, Tomáš Pluska, Josef Sivic, Karsten Kreis, Arash Vahdat, Shamayeeta Ray, Jonathan T. Goldstein, Andrew Savinov, Jacob A. Hambalek, Anshika Gupta, Diego A. Taquiri-Diaz, Yaotian Zhang, A. Katherine Hatstat, Angelika Arada, Nam Hyeong Kim, Ethel Tackie-Yarboi, Dylan Boselli, Lee Schnaider, Chang C. Liu, Gene-Wei Li, Denes Hnisz, David M. Sabatini, William F. DeGrado, Jeremy Wohlwend, Gabriele Corso, Regina Barzilay, Tommi Jaakkola