Summary

Cactus Compute published Needle, a 26M-parameter function-calling model distilled from Gemini 3.1 that is positioned for local fine-tuning and very small-device deployment. The project frames compact tool use as an alternative to relying only on larger cloud models for agent workflows.

What changed

Cactus Compute released Needle, an open 26M parameter function-calling model distilled from Gemini 3.1.

Why it matters

Tool calling is usually associated with larger hosted models, so Needle is a meaningful signal that agent-capable behavior can move further downmarket in cost, latency, and hardware requirements. That makes local and embedded agent workflows more plausible for developers who cannot assume frontier-model pricing or always-on cloud access.

Evidence excerpt

The Needle repository says the team distilled Gemini 3.1 into a 26M parameter Simple Attention Network that can be fine-tuned locally and used for single-shot function calling.

Sources