source: google deepmind: introducing computer use in gemini 3.5 flash
level: technical
google deepmind has added computer use as a built-in tool in gemini 3.5 flash. the capability, previously offered as a standalone model, is now part of the main flash model. developers can use it to build agents that interact with graphical interfaces across platforms. the model already supports function calling and tools such as search and maps grounding. with computer use added, it can handle long-running tasks such as continuous software testing and knowledge work in professional apps.
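to make the idea of an agent that interacts with a graphical interface concrete, here is a minimal sketch of a generic observe, propose, execute loop. it is not the gemini api: `Action`, `run_agent`, the action names, and the scripted stand-ins are all hypothetical, and a real agent would replace `fake_propose` with a model call and `fake_execute` with browser or desktop automation.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    kind: str         # "click", "type", or "done" (illustrative action names)
    target: str = ""  # hypothetical label of a UI element
    text: str = ""    # text to enter for "type" actions


def run_agent(
    goal: str,
    observe: Callable[[], str],
    propose: Callable[[str, str, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 20,
) -> List[Action]:
    """Observe the screen, ask the model for one action, execute it, repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        screen = observe()
        action = propose(goal, screen, history)
        if action.kind == "done":
            break
        execute(action)
        history.append(action)
    return history


# Stand-ins so the sketch runs without a real model or GUI (assumptions).
SCRIPT = [
    Action("click", target="search box"),
    Action("type", target="search box", text="release notes"),
    Action("done"),
]
executed: List[str] = []


def fake_observe() -> str:
    # A real agent would return a screenshot here.
    return f"screen after {len(executed)} actions"


def fake_propose(goal: str, screen: str, history: List[Action]) -> Action:
    # A real agent would send the goal, screen, and history to the model.
    return SCRIPT[len(history)]


def fake_execute(action: Action) -> None:
    # A real agent would drive the UI; this only records the action.
    executed.append(f"{action.kind}:{action.target}")


steps = run_agent("find the release notes", fake_observe, fake_propose, fake_execute)
print(executed)
```

the `max_steps` cap is a design choice for long-running tasks: it bounds the loop even if the model never returns a "done" action.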
the integration aims to improve reliability for enterprise automation. gemini 3.5 flash can analyze its own app to categorize features and audit documentation for accessibility issues. to address safety, the model uses adversarial training against prompt injection. two optional enterprise safeguards are available: one requires user confirmation for sensitive actions, and another stops tasks if indirect prompt injection is detected. google recommends combining these with sandboxing, human verification, and access controls.
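the two safeguards described above can be pictured as a gate in front of every action. the sketch below is an illustration under stated assumptions, not google's implementation: `SENSITIVE_KINDS`, `guarded_execute`, `TaskHalted`, and the `injection_detected` flag are hypothetical names, and how the real service classifies sensitive actions or detects injection is not specified in the source.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical set of action kinds treated as sensitive (an assumption).
SENSITIVE_KINDS = {"purchase", "delete", "send_email"}


@dataclass
class Action:
    kind: str
    detail: str = ""


class TaskHalted(Exception):
    """Raised when the task must stop, e.g. after suspected prompt injection."""


def guarded_execute(
    action: Action,
    execute: Callable[[Action], None],
    confirm: Callable[[Action], bool],
    injection_detected: bool,
) -> bool:
    """Run one action behind both safeguards; return True if it was executed."""
    # Safeguard 2: stop the whole task if indirect prompt injection is flagged.
    if injection_detected:
        raise TaskHalted(f"suspected prompt injection before {action.kind!r}")
    # Safeguard 1: sensitive actions need explicit user confirmation.
    if action.kind in SENSITIVE_KINDS and not confirm(action):
        return False
    execute(action)
    return True


# Usage with stand-in callbacks: the user declines every confirmation request.
log = []
print(guarded_execute(Action("click", "open settings"), log.append, lambda a: False, False))
print(guarded_execute(Action("delete", "old report"), log.append, lambda a: False, False))
```

in this sketch a declined confirmation skips only that action, while a detected injection raises and ends the task, mirroring the difference between the two safeguards. sandboxing and access controls would sit outside this function, around whatever `execute` does.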
developers can access computer use through the gemini api and the gemini enterprise agent platform. browserbase hosts a demo environment for testing. early customers report value in using the feature for automation tasks. the release marks a step toward more capable agents that can operate across different software environments without custom integrations.
why it matters: building computer use into the main flash model simplifies creating ai agents that automate complex, multi-step tasks across different software, reducing the need for custom integrations and improving reliability for enterprise workflows.