agent builds 3d paris gallery by chaining two spaces

source: hugging face blog: how an agent built a 3d paris gallery by chaining two hugging face spaces

level: technical

a coding agent built a 3d gallery of paris monuments without manual image generation or 3d reconstruction. it used two hugging face spaces: one for creating images and another for turning them into 3d gaussian splats. the agent read each space's agents.md file to learn how to call them, then chained the outputs together. the result is a static space with a cinematic viewer showing rotating monuments.

the process illustrates the building-block economy, in which ai agents assemble small, documented components instead of building from scratch. hugging face spaces now expose an agents.md file that tells agents exactly how to call them through their api. this removes integration barriers like sdks and gpu setup. the agent generated six monument images on black backgrounds, reconstructed them as 3d splats, fixed their orientation, compressed the files, and built a three.js viewer with scroll and drag controls.
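
the sequence of steps above can be sketched as an ordered list of stages applied to each monument. this is a minimal illustration, not the agent's actual scripts: the stage names, the `Asset` type, and the placeholder monument labels are all assumptions, and each stage only records that it ran instead of calling a real space.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class Asset:
    """One monument moving through the pipeline (hypothetical type)."""
    monument: str
    history: List[str] = field(default_factory=list)


Stage = Callable[[Asset], Asset]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that only records its own name.

    A real stage would call a space's api as described in its agents.md.
    """
    def stage(asset: Asset) -> Asset:
        return Asset(asset.monument, asset.history + [name])
    return stage


# Stage names are assumptions that mirror the steps named in the text.
STAGES: List[Stage] = [
    make_stage(name)
    for name in ("generate_image", "reconstruct_splat",
                 "fix_orientation", "compress")
]


def run_pipeline(monuments: Iterable[str], stages: List[Stage]) -> List[Asset]:
    """Feed each monument through every stage in order, chaining outputs."""
    results = []
    for monument in monuments:
        asset = Asset(monument)
        for stage in stages:
            asset = stage(asset)  # output of one stage is input to the next
        results.append(asset)
    return results


if __name__ == "__main__":
    # Placeholder labels; the source does not list the six monuments here.
    for asset in run_pipeline([f"monument_{i}" for i in range(1, 7)], STAGES):
        print(asset.monument, "->", " -> ".join(asset.history))
```

keeping each stage as a plain callable means a placeholder can be swapped for a real api call without touching the loop that chains them.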

the only human input was high-level feedback on aesthetics, like zoom level and transition timing. the agent handled technical details, including flipping y-down outputs and compressing files for fast loading. this shows how models from different organizations become composable through documented, callable blocks. the entire pipeline is reproducible, with scripts available in the space repository.
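
the orientation fix mentioned above can be illustrated with a small sketch. the source does not say how the agent performed the flip, so this assumes one common approach: rotating the points 180 degrees about the x axis, which turns a y-down frame into a y-up one. the function name and the point format are hypothetical, and only positions are handled here.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float, float]


def y_down_to_y_up(points: Iterable[Point]) -> List[Point]:
    """Rotate points 180 degrees about the x axis.

    This negates y and z together, so up and down swap while the
    coordinate system keeps its handedness (negating y alone would
    mirror the scene instead of rotating it).
    """
    return [(x, -y, -z) for x, y, z in points]


if __name__ == "__main__":
    # A point sitting 2 units "down" in a y-down frame ends up 2 units up.
    print(y_down_to_y_up([(1.0, 2.0, 3.0)]))
```

a real gaussian splat also stores a rotation for every gaussian, and those would need the same transform applied; the sketch leaves that out.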

why it matters: this approach lets ai agents chain state-of-the-art models without custom integration, turning complex multimedia pipelines into simple, repeatable steps.
