general compute bets on sambanova chips for faster ai inference

source: techcrunch ai: has the hunt for ai compute uncovered the next cerebras?

level: business

general compute, a new inference neocloud, raised a $15 million seed round led by fuse vc at a $60 million post-money valuation. the company rents out ai processing power for running models, not training them. it plans to use specialized chips from sambanova, an intel-backed chipmaker, instead of gpus. sambanova says its upcoming sn50 chips outperform gpus and other specialized chips, generating 600 to 700 tokens per second versus about 250 for gpus. general compute has $300 million of these chips on order and says it will be the first neocloud to deploy them.

the chips are air-cooled and use less power, so they can fit into existing data centers without new infrastructure. general compute is pursuing colocation deals with data center providers and crypto miners looking to repurpose their facilities. the company launched its cloud offering last week, claiming it is the fastest at running the minimax 2.7 open-source llm. investor joe hasselmann sees parallels to coreweave's relationship with nvidia and groq's chip-cloud pairing, noting that sambanova is betting on general compute as a growth channel.

inference clouds like general compute bet on a future with many models and agents, where speed and cost are key. faster inference can cut hour-long coding agent tasks to minutes and make audio agents for customer service more economical. ceo finn puklowski said agent-to-agent interactions demand faster output than a human reader needs. openrouter's $113 million series b this week reflects demand for multi-model access to optimize token spending, where speed affects both price and capability.

why it matters: faster, cheaper inference chips could enable more responsive ai agents and reduce operational costs for ai services.
