source: arxiv machine learning: molecular lead optimization via agentic tool planning

level: research

drug discovery takes years and costs billions, and lead optimization is a key stage in which early hit compounds are refined into drug candidates. this step involves improving absorption, distribution, metabolism, excretion, and toxicity (admet) properties through small structural changes, while preserving the parts of the molecule that bind to the disease target. most ai methods for this task make only one change at a time, ignoring how a series of modifications combines to shape the final outcome.

researchers from multiple institutions introduced trace, a trajectory-aware agent that uses a large language model (llm) to plan sequences of molecular edits. instead of optimizing one step at a time, trace treats tool selection as a sequential decision problem. given a starting molecule and a target property profile, the agent reasons about which computational tools to apply and in what order, taking into account how each choice affects later steps. it can call tools for tasks like property prediction, molecular generation, and filtering.
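
the difference between one-step and trajectory-aware planning can be illustrated with a toy sketch. this is not the trace implementation: the tool names, their numeric effects, the integer property scale, and the scoring function are all made-up assumptions, and the llm planner is replaced by plain exhaustive search over short tool sequences. it only shows why picking the best immediate step can lead to a worse final result than planning the whole sequence.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

# toy state (assumption): (solubility, toxicity) on an arbitrary integer scale
State = Tuple[int, int]
Tool = Callable[[State], State]

# hypothetical tools with invented effects; real tools would call
# property predictors, generators, or filters on actual molecules
TOOLS: Dict[str, Tool] = {
    "add_charged_group": lambda s: (s[0] + 5, s[1] + 1),
    "add_hydroxyl": lambda s: (s[0] + 3, s[1]),
    "remove_toxicophore": lambda s: (s[0] - 1, s[1] - 2),
}


def score(state: State, target: State) -> int:
    """negative distance to the target profile (higher is better)."""
    return -(abs(state[0] - target[0]) + abs(state[1] - target[1]))


def apply_plan(state: State, plan: List[str]) -> State:
    """apply the named tools in order and return the final state."""
    for name in plan:
        state = TOOLS[name](state)
    return state


def plan_greedy(start: State, target: State, horizon: int) -> List[str]:
    """one-step baseline: always take the tool with the best immediate score."""
    state, plan = start, []
    for _ in range(horizon):
        best = max(TOOLS, key=lambda name: score(TOOLS[name](state), target))
        if score(TOOLS[best](state), target) <= score(state, target):
            break  # no tool improves the current state
        state = TOOLS[best](state)
        plan.append(best)
    return plan


def plan_lookahead(start: State, target: State, horizon: int) -> List[str]:
    """trajectory-aware stand-in: score whole sequences by their final state."""
    best_plan: List[str] = []
    best_score = score(start, target)
    for length in range(1, horizon + 1):
        for plan in product(TOOLS, repeat=length):
            s = score(apply_plan(start, list(plan)), target)
            if s > best_score:
                best_plan, best_score = list(plan), s
    return best_plan


start, target = (2, 6), (8, 2)
greedy = plan_greedy(start, target, horizon=2)
lookahead = plan_lookahead(start, target, horizon=2)
print(greedy, score(apply_plan(start, greedy), target))
print(lookahead, score(apply_plan(start, lookahead), target))
```

in this toy setup the greedy planner commits to the edit with the largest immediate gain and then has to compensate for its side effect, while the sequence-level search finds a plan that ends closer to the target. exhaustive search only works here because the tool set and horizon are tiny; an llm-guided planner is one way to avoid enumerating every sequence.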

in experiments on standard benchmarks, trace outperformed existing one-step and multi-step baselines in optimizing admet properties while maintaining structural similarity to the original lead. the agent successfully balanced multiple objectives, such as improving solubility and reducing toxicity, without destroying the core scaffold needed for target binding. the approach shows how llm-based reasoning can guide complex, multi-step scientific workflows.
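
the trade-off between property gains and staying close to the lead can be sketched as a similarity-constrained selection. the example below is an assumption-laden illustration, not the paper's evaluation protocol: fingerprints are represented as plain sets of made-up substructure labels, the property values are invented, and the objective (solubility minus toxicity) and the 0.5 threshold are arbitrary choices. only the tanimoto formula itself, intersection size over union size, is standard.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Candidate:
    name: str
    fingerprint: FrozenSet[str]  # toy stand-in for a molecular fingerprint
    solubility: int              # invented predicted values
    toxicity: int


def tanimoto(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """tanimoto similarity: |a & b| / |a | b| (1.0 if both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def select_candidate(
    lead: FrozenSet[str], candidates: List[Candidate], min_similarity: float
) -> Optional[Candidate]:
    """drop candidates that drift too far from the lead, then pick the
    one with the best toy objective (solubility minus toxicity)."""
    kept = [c for c in candidates if tanimoto(lead, c.fingerprint) >= min_similarity]
    if not kept:
        return None
    return max(kept, key=lambda c: c.solubility - c.toxicity)


lead = frozenset({"ring_a", "ring_b", "amide", "methyl"})
candidates = [
    Candidate("cand1", frozenset({"ring_a", "ring_b", "amide", "hydroxyl"}), 7, 3),
    Candidate("cand2", frozenset({"ring_a", "sulfonate", "amine"}), 9, 1),
    Candidate("cand3", frozenset({"ring_a", "ring_b", "amide", "methyl", "fluoro"}), 5, 3),
]
chosen = select_candidate(lead, candidates, min_similarity=0.5)
print(chosen.name if chosen else None)
```

here the candidate with the best raw properties is rejected because it shares too little with the lead, which mirrors the requirement that the core scaffold survive optimization.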

why it matters: this method could speed up drug development by automating the tedious, multi-step process of refining molecules, helping chemists focus on the most promising candidates.

