run github ci on hugging face jobs

source: hugging face blog: migrating your github ci to hugging face jobs

level: technical

github actions workflows usually run on github-hosted runners such as ubuntu-latest, which are convenient but limited: they can be slow, the standard ones offer no gpu access, and they ship a generic image. for the trackio project, these limits became a problem, because the team needed reliable cpu ci as well as gpu tests on cuda hardware. they built a bridge that keeps github actions in control of the workflow but runs the jobs on hugging face jobs. this cut cpu ci time by about 30% and enabled a new gpu test suite.

the setup uses a dispatcher space that receives github webhooks and launches ephemeral self-hosted runners inside hugging face jobs. when a pull request triggers a workflow with a special runs-on label like hf-jobs-cpu-upgrade, the dispatcher starts a job on matching hardware. the job boots a runner, executes the ci steps, and exits. from github's view, it is a self-hosted runner. from hugging face's view, it is a container running workflow commands.
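the routing step described above can be sketched in a few lines of python. this is a minimal illustration, not the dispatcher's actual code: it assumes the dispatcher reacts to github's workflow_job webhook event with action "queued" and that the hardware name is whatever follows a hypothetical hf-jobs- prefix in the runs-on label. the function name pick_flavor and the prefix constant are made up for this example.

```python
from typing import Optional

# hypothetical prefix, inferred from the hf-jobs-cpu-upgrade label in the text
LABEL_PREFIX = "hf-jobs-"


def pick_flavor(payload: dict) -> Optional[str]:
    """return a hardware name for a queued workflow job, or None to ignore it."""
    # github sends a workflow_job event with action "queued" when a job waits for a runner
    if payload.get("action") != "queued":
        return None
    labels = payload.get("workflow_job", {}).get("labels", [])
    for label in labels:
        if label.startswith(LABEL_PREFIX):
            # assumption: the rest of the label names the hardware to request
            return label[len(LABEL_PREFIX):]
    # no matching label: the job belongs to some other runner
    return None


event = {"action": "queued", "workflow_job": {"labels": ["hf-jobs-cpu-upgrade"]}}
print(pick_flavor(event))  # cpu-upgrade
```

a real dispatcher would then start a job on the returned hardware and register the ephemeral runner with github; those steps depend on the dispatcher space's own code and are left out here.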

to adopt this, duplicate the dispatcher space, create a github app, set the required secrets, and change the runs-on value in your workflow files. for cpu jobs, using a tailored docker image such as microsoft's playwright image sped up tests. for gpu jobs, a t4-small job ran in 45 seconds at low cost. logs are accessible via the cli, and volumes can mount datasets. the approach is practical for ml projects that need custom images or accelerators.
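one of the secrets a webhook receiver typically needs is the webhook secret configured on the github app. the sketch below shows the standard github check, in which the X-Hub-Signature-256 header carries "sha256=" followed by an hmac-sha256 of the raw request body. whether the dispatcher space performs exactly this check is an assumption, and the function name signature_is_valid is hypothetical.

```python
import hashlib
import hmac


def signature_is_valid(secret: str, body: bytes, header: str) -> bool:
    """check a raw webhook body against its X-Hub-Signature-256 header value."""
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    # compare_digest runs in constant time, so it does not leak where the strings differ
    return hmac.compare_digest(expected.encode(), header.encode())
```

a receiver would call this with the unparsed request body before trusting any field in the payload, and reject the delivery when it returns False.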

why it matters: it gives ai and data science projects a simple way to run ci on gpu hardware without maintaining always-on runners, reducing test time and cost.
