source: kdnuggets: local agentic programming on the cheap: claude code + ollama + gemma4
level: technical
this article walks through setting up a local agentic coding environment with ollama, gemma 4, and claude code. the stack lets you run multi-agent workflows that read files, write patches, and run tests without sending code to third-party servers. gemma 4 26b moe activates only 3.8 billion parameters per forward pass and scores 77.1% on livecodebench v6 and 86.4% on τ2-bench for agentic tool use. the previous gemma 3 27b scored 6.6% on the same tool use benchmark, showing a major improvement in reliable tool calling.
the setup starts with installing ollama and pulling the gemma 4 model. a custom modelfile overrides ollama's default 4k context window to 64k tokens, sets a low temperature of 0.2 for stable tool calls, and adds a system prompt that reinforces coding agent behavior. claude code is then wired to the local endpoint through environment variables in a settings.json file. key variables include anthropic_base_url set to http://localhost:11434, anthropic_auth_token set to any non-empty string, and anthropic_model set to the custom modelfile variant. disabling experimental betas prevents header-related errors.
a verification script checks ollama health, model availability, basic api calls, and tool calling before using the stack on real code. the script confirms the model can produce valid tool_use blocks, which is critical for claude code to read files and execute commands. the article notes that without the context override, claude code loses track of file contents mid-edit and produces fragmented changes. the gemma 4 family is released under apache 2.0, removing legal ambiguity for commercial use that existed in previous versions.
why it matters: this setup lets developers run agentic coding workflows locally, avoiding api costs and keeping proprietary code private.
source: kdnuggets: local agentic programming on the cheap: claude code + ollama + gemma4