source: Hugging Face blog: Introducing North Mini Code: Cohere’s First Model for Developers
level: technical
Cohere released North Mini Code, a 30B-parameter mixture-of-experts model with 3B active parameters, designed for agentic coding. It is available on Hugging Face under the Apache 2.0 license. The model uses a decoder-only transformer with interleaved sliding-window and global attention, and a mixture-of-experts feed-forward block with 128 experts, of which 8 are activated per token. It was trained with two-stage supervised fine-tuning (SFT) followed by reinforcement learning with verifiable rewards, focusing on software engineering and terminal tasks.
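To make the "128 experts, 8 active per token" routing concrete, here is a minimal pure-Python sketch of top-k expert selection for a single token. It is illustrative only: the source does not describe the router, so the softmax renormalization over the selected experts, the toy scalar experts, and the names `route_token` and `moe_forward` are all assumptions.

```python
import math
import random

NUM_EXPERTS = 128  # expert count stated in the summary
TOP_K = 8          # experts activated per token, stated in the summary


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k experts of one token."""
    # Select the k experts with the highest router scores.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:top_k]
    # Assumption: weights are renormalized over the selected experts only.
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, router_logits, experts, top_k=TOP_K):
    """Weighted sum of the selected experts' outputs (scalar toy input)."""
    return sum(w * experts[i](x) for i, w in route_token(router_logits, top_k))


rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
# Hypothetical toy experts: each one scales its input by a different constant.
experts = [lambda x, s=i: x * (s + 1) for i in range(NUM_EXPERTS)]

selected = route_token(logits)
output = moe_forward(1.0, logits, experts)
```

Only the 8 selected experts are evaluated for the token, which is why the number of active parameters per token is much smaller than the total parameter count.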
The training data included programming, reasoning, and instruction-following examples, with code datasets making up 70% of tokens in the first stage and 61% in the second. The team used over 70,000 verifiable tasks across about 5,000 repositories, deduplicated against SWE-bench to avoid leakage. They employed a long-to-longer context approach, using 64K and 128K context lengths. The model was exposed to multiple agent harnesses during training to improve robustness across different tool-use modalities. It achieves 61.0% pass@1 on SWE-bench Verified with mini-swe-agent.
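The deduplication step can be sketched as a simple filter. The source does not say how the comparison against SWE-bench was done, so this sketch assumes two checks (repository-level and instance-level); the task schema (`repo`, `instance_id`), the function names, and the sample data are hypothetical.

```python
def normalize_repo(name):
    """Canonical 'owner/repo' form: lowercase, no trailing slash or .git suffix."""
    name = name.strip().lower().rstrip("/")
    if name.endswith(".git"):
        name = name[:-4]
    return name


def decontaminate(tasks, benchmark_repos, benchmark_instance_ids):
    """Drop training tasks that overlap with a held-out benchmark.

    Assumption: a task is dropped if its repository appears in the benchmark
    or if its instance id matches a benchmark instance exactly.
    """
    blocked_repos = {normalize_repo(r) for r in benchmark_repos}
    blocked_ids = set(benchmark_instance_ids)
    kept = []
    for task in tasks:
        if normalize_repo(task["repo"]) in blocked_repos:
            continue  # same repository as a benchmark task
        if task["instance_id"] in blocked_ids:
            continue  # exact benchmark instance
        kept.append(task)
    return kept


# Hypothetical sample data, for illustration only.
tasks = [
    {"repo": "Example-Org/alpha.git", "instance_id": "alpha-1"},
    {"repo": "example-org/beta", "instance_id": "beta-7"},
    {"repo": "example-org/gamma", "instance_id": "gamma-3"},
]
kept = decontaminate(
    tasks,
    benchmark_repos=["example-org/alpha"],
    benchmark_instance_ids=["gamma-3"],
)
```

Filtering at the repository level is the stricter of the two checks: it removes every task from a benchmark repository, not only the exact benchmark instances.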
Reinforcement learning with verifiable rewards improved pass@1 by 7.9% on Terminal-Bench v2 and 3.0% on SWE-bench Verified. Training used an asynchronous setup with a vLLM sidecar to decouple sampling from learning, and a windowed FIFO queue to handle variable rollout lengths. The model was trained jointly on terminal and software engineering tasks, which generalized better than training on each separately. In human evaluations, the final model had a 66.1% win rate over the SFT-only version, particularly on code-editing tasks.
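The source does not define the "windowed FIFO queue", so the following is one plausible interpretation rather than Cohere's implementation. It assumes each rollout is tagged with the policy version that sampled it, that the learner consumes rollouts in arrival order, and that rollouts older than a fixed staleness window are discarded. The class name `WindowedFIFOQueue` and the `max_staleness` parameter are hypothetical.

```python
from collections import deque


class WindowedFIFOQueue:
    """FIFO buffer between an asynchronous sampler and a learner.

    Assumption: a rollout is usable only if the policy version that produced
    it is at most `max_staleness` versions behind the learner's version.
    """

    def __init__(self, max_staleness):
        self.max_staleness = max_staleness
        self._items = deque()

    def put(self, rollout, policy_version):
        """Called by the sampler whenever a rollout finishes, in any order."""
        self._items.append((policy_version, rollout))

    def get_batch(self, batch_size, current_version):
        """Pop up to batch_size rollouts in arrival order, skipping stale ones."""
        batch = []
        while self._items and len(batch) < batch_size:
            version, rollout = self._items.popleft()
            if current_version - version > self.max_staleness:
                continue  # too far off-policy: discard
            batch.append(rollout)
        return batch


queue = WindowedFIFOQueue(max_staleness=2)
# Long rollouts can finish after shorter ones that were started later.
queue.put("short-a", policy_version=3)
queue.put("long-b", policy_version=1)
queue.put("short-c", policy_version=4)
batch = queue.get_batch(batch_size=2, current_version=4)
```

In this sketch the learner never blocks on the slowest rollout: it trains on whatever has arrived, and the window bounds how far off-policy that data can be.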
Why it matters: This model offers a strong open-source option for building coding agents that work reliably across different tools and environments, reducing the need for custom fine-tuning.