level: research
large language model agents often act reactively, choosing each step without simulating what might happen next. this limits them on long-horizon tasks that need many steps. humans naturally think ahead, running mental simulations before deciding. the paper proposes giving agents this ability by training a single model to generate both a predicted future state and a success score for a plan. the score plays a role similar to a q-value in reinforcement learning, which estimates the expected return of taking an action in a given state.
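a minimal sketch of how such a joint prediction could drive plan selection is shown below. it is an illustration, not the paper's implementation: the `Foresight` record, the `choose_plan` helper, and the `toy_foresee` stand-in for the trained model are all hypothetical names, and picking the plan with the highest score is an assumed selection rule.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class Foresight:
    """Hypothetical container for the two things the model is said to emit."""
    predicted_state: str   # the model's guess at the state after the plan runs
    success_score: float   # estimated chance the plan succeeds, assumed in [0, 1]


# Assumed interface: (current state, candidate plan) -> Foresight
ForeseeFn = Callable[[str, str], Foresight]


def choose_plan(state: str, plans: Sequence[str],
                foresee: ForeseeFn) -> Tuple[str, Foresight]:
    """Score every candidate plan and return the one with the highest score."""
    if not plans:
        raise ValueError("at least one candidate plan is required")
    scored = [(foresee(state, plan), plan) for plan in plans]
    best_foresight, best_plan = max(scored, key=lambda pair: pair[0].success_score)
    return best_plan, best_foresight


def toy_foresee(state: str, plan: str) -> Foresight:
    """Hand-written stand-in for the trained model, for illustration only."""
    if "check calendar" in plan:
        return Foresight("meeting booked in a free slot", 0.9)
    return Foresight("meeting may clash with an existing event", 0.3)


if __name__ == "__main__":
    plan, foresight = choose_plan(
        "user wants a meeting",
        ["book immediately", "check calendar, then book"],
        toy_foresee,
    )
    print(plan, foresight)
```

in this sketch the predicted state is carried along with the score so that a caller can inspect why a plan was preferred; a real agent would obtain both from one generation of the same model rather than from a hand-written function.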
simply fine-tuning on examples of foresight does not work well. agents learn to mimic the format of thinking ahead but lack real predictive accuracy. the authors call this the format-capability gap. to fix it, they design a three-stage training process. first, world model agentic mid-training injects latent prediction skills. second, format-eliciting supervised fine-tuning teaches the model to express these predictions in text. third, reinforcement learning refines the behavior using the internal success estimates.
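one way to make the format-capability gap concrete is to score format compliance and predictive accuracy separately. the sketch below is an assumption-laden illustration: the output template `predicted state: ...; success: ...`, the exact-match accuracy check, and the `format_and_accuracy` helper are hypothetical and not taken from the paper.

```python
import re
from typing import Sequence, Tuple

# Hypothetical output template; the paper's real format is not given here.
FORESIGHT_PATTERN = re.compile(
    r"^predicted state: (?P<state>.+); success: (?P<score>[01](?:\.\d+)?)$"
)


def format_and_accuracy(outputs: Sequence[str],
                        true_states: Sequence[str]) -> Tuple[float, float]:
    """Return (share of well-formed outputs, share with a correct predicted state)."""
    if not outputs or len(outputs) != len(true_states):
        raise ValueError("outputs and true_states must be non-empty and equal in length")
    well_formed = 0
    correct = 0
    for text, truth in zip(outputs, true_states):
        match = FORESIGHT_PATTERN.match(text.strip())
        if match is None:
            continue  # malformed output: counts toward neither total
        well_formed += 1
        if match.group("state").strip() == truth:
            correct += 1
    total = len(outputs)
    return well_formed / total, correct / total


if __name__ == "__main__":
    outputs = [
        "predicted state: door open; success: 0.8",
        "predicted state: door open; success: 0.9",
        "no idea",
    ]
    truths = ["door open", "door closed", "door open"]
    print(format_and_accuracy(outputs, truths))
```

a model that has only learned the surface pattern would score high on the first number and low on the second, which is the gap the three-stage process is meant to close.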
experiments show the trained agent outperforms standard methods on complex, multi-step tasks. the internal world model helps it avoid dead ends and choose better actions. the approach unifies planning and execution in one model, removing the need for separate components. this makes the agent more efficient and easier to deploy. the work suggests that giving language agents a genuine internal simulation ability is key to more human-like reasoning in sequential decisions.
why it matters: this method could make ai assistants more reliable in tasks like scheduling, coding, or robotics by letting them mentally test plans before acting.