Gemini 3 Is Still Making Waves, but Google's Next AI Models Are Already on the Horizon
The company quietly launched FunctionGemma, a specialised 270-million-parameter AI model built for one big goal: making edge applications more reliable.

In simple terms, FunctionGemma is a small but powerful AI model designed to run closer to the user on devices, gateways, or local infrastructure, rather than solely in the cloud. Its primary mission is to solve one of the most painful problems in modern app development: getting consistent, dependable behaviour from AI at the edge.
Unlike general-purpose chatbots, FunctionGemma is engineered for a single, critical utility: translating natural-language user commands into structured function calls that apps and devices can actually execute, all without connecting to the cloud. That narrow focus is what lets developers get reliable, real-world execution with no cloud dependency.
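To make that contract concrete, here is a minimal sketch of the task's input and output shape. The `set_alarm` schema and its field names are hypothetical illustrations, not taken from Google's documentation:

```python
import json

# A tool the app exposes to the model, described as a simple schema.
# (Hypothetical example; not an official FunctionGemma schema.)
set_alarm_schema = {
    "name": "set_alarm",
    "description": "Set an alarm on the device.",
    "parameters": {
        "time": {"type": "string", "description": "24-hour time, e.g. '07:30'"},
        "label": {"type": "string", "description": "Optional alarm label"},
    },
}

# What the user actually says:
user_command = "Wake me up at half past seven for the gym"

# What a function-calling model is expected to emit: not prose, but a
# structured call the app can execute directly.
expected_output = {
    "name": "set_alarm",
    "arguments": {"time": "07:30", "label": "gym"},
}

print(json.dumps(expected_output))
```

The point is that the model's output is machine-readable and verifiable against the schema, which is what makes it dependable enough to wire into real application logic.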
This release marks a significant strategic pivot for Google DeepMind and the Google AI Developers team. While the industry continues to chase trillion-parameter scale in the cloud, FunctionGemma is a bet on “Small Language Models” (SLMs) running locally on phones, browsers, and IoT devices. That focus opens a path to fast, secure, on-device AI that keeps power in the hands of developers, enterprises, and end users.
For AI engineers and enterprise builders, this model introduces a powerful new architectural primitive: a privacy-first “router” that executes complex logic directly on-device with negligible latency.
A New Level of Performance
At its core, FunctionGemma is designed to close the “execution gap” in generative AI. Standard large language models (LLMs) are excellent at conversational tasks, but they often struggle to reliably trigger real software actions, especially on resource-constrained devices.
In Google’s internal “Mobile Actions” evaluation, a generic small model struggles with reliability, achieving only 58% baseline accuracy on function-calling tasks. However, once fine-tuned for this specific purpose, FunctionGemma’s accuracy jumps to 85%, creating a specialized model that delivers a success rate on par with models many times its size.

A chart visualizing FunctionGemma’s performance before and after fine‑tuning. Credit: Google.
Fine-tuning also lets the model handle far more than simple on/off switches: it can parse complex arguments, such as specific grid coordinates that drive game mechanics, and handle other detailed logic on its own.
The release includes more than just the model weights. Google is providing a full “recipe” for developers, which includes:
- The Model: A 270M-parameter transformer trained on 6 trillion tokens.
- Training Data: A “Mobile Actions” dataset to help developers train their own agents.
- Ecosystem Support: Compatibility with Hugging Face Transformers, Keras, Unsloth, and NVIDIA NeMo libraries.
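As a rough sketch of what that ecosystem support looks like in practice, the snippet below loads the model with Hugging Face Transformers and asks it to emit a structured call. Note that the checkpoint ID `google/functiongemma-270m` and the prompt template are assumptions here; check the official model card for the real ones.

```python
# A minimal sketch using Hugging Face Transformers. The checkpoint name
# "google/functiongemma-270m" and the prompt format are assumptions;
# consult the official model card for the exact ID and chat template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/functiongemma-270m"  # hypothetical checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

prompt = (
    "Available tools: set_alarm(time, label)\n"
    "User: Wake me up at half past seven for the gym\n"
    "Call: "
)
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=64, do_sample=False)

# Decode only the newly generated tokens, i.e. the structured call.
new_tokens = output[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```

Greedy decoding (`do_sample=False`) is the natural choice here: for function calling you want the single most likely structured output, not creative variation.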
Omar Sanseviero, Developer Experience Lead at Hugging Face, highlighted the versatility of the release on X (formerly Twitter), noting that the model is “designed to be specialized for your own tasks” and can run in “your phone, browser or other devices.”
This local-first approach offers three distinct advantages:
- Privacy: Personal data (like calendar entries or contacts) never leaves the device.
- Latency: Actions happen almost instantly, with no waiting on a server round-trip. The model’s small footprint keeps inference fast, especially when accelerators such as GPUs and NPUs are available.
- Cost: Developers don’t pay per-token API fees for simple interactions, making on-device AI both powerful and affordable.
For AI Builders: A New Blueprint for Production Workflows
For enterprise developers and system architects, FunctionGemma offers a new way to build AI systems. Instead of relying on one big, heavy model for everything, teams can use smaller, connected components that work together. Rather than sending every small user request to a massive, expensive cloud model like GPT-4 or Gemini 1.5 Pro, builders can run FunctionGemma at the edge as an intelligent “traffic controller.” It decides what can be handled quickly on the device and what should be sent to a larger model in the cloud only when needed.
Here is a simple way for AI builders to think about using FunctionGemma in real production workflows:
1- The “Traffic Controller” Architecture: In a real production setup, FunctionGemma acts as the first line of defense for your AI system. Running directly on the user’s device, it instantly handles standard, high-frequency commands like navigation, media control, and basic data entry without reaching out to the cloud. When a request truly requires deep reasoning or broad world knowledge, the model can detect that and route it to a larger cloud model. This hybrid approach cuts cloud inference costs, minimizes latency, and enables patterns such as routing each query to the most suitable sub-agent (a routing sketch follows this list).
2- Reliable Results, Not Random Creativity: Enterprises don’t need their banking or calendar apps to be “creative” – they need them to be correct. The jump to 85% accuracy shows that a focused, specialized model can beat a much larger generalist at this task. Fine-tuned on your own domain data, such as proprietary enterprise APIs, the small model becomes a stable, reliable tool that behaves predictably every time – precisely what serious production use demands (a sample training record follows this list).
3- Privacy-First Compliance: In fields like healthcare, finance, or sensitive enterprise operations, sending data to the cloud can create compliance and security risks. Because FunctionGemma is efficient enough to run directly on-device – on NVIDIA Jetson, mobile CPUs, or in the browser with Transformers.js – sensitive data, such as PII or proprietary commands, can remain within the local network. This privacy-first setup helps teams protect user data, meet strict compliance rules, and still take full advantage of modern AI.
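To show how the first pattern above might look in code, here is a minimal routing sketch. Everything in it is a hypothetical stand-in: `run_functiongemma` mocks on-device inference with a keyword check so the example runs on its own, and `call_cloud_model` stands in for whatever large model you escalate to.

```python
import json

# Tools the app can execute locally without any network round-trip.
LOCAL_TOOLS = {"set_alarm", "play_media", "navigate_to", "create_note"}

def run_functiongemma(user_text: str) -> dict | None:
    """Stand-in for on-device inference; returns a structured call or None.

    A real implementation would run the local model and parse its output;
    here a keyword check mocks that step so the sketch is self-contained.
    """
    if "alarm" in user_text.lower():
        return {"name": "set_alarm", "arguments": {"time": "07:30"}}
    return None  # the model could not map the request to a known tool

def call_cloud_model(user_text: str) -> str:
    """Stand-in for escalating the request to a large cloud model."""
    return f"[cloud model answers: {user_text!r}]"

def handle(user_text: str) -> str:
    call = run_functiongemma(user_text)
    if call and call["name"] in LOCAL_TOOLS:
        # High-frequency command: execute on-device, instantly and privately.
        return f"executing locally: {json.dumps(call)}"
    # Deep reasoning or broad world knowledge: route to the cloud.
    return call_cloud_model(user_text)

print(handle("Set an alarm for 7:30"))            # handled on-device
print(handle("Summarise the history of the EU"))  # escalated to the cloud
```

The design choice worth noting is that the local model fails closed: anything it cannot confidently map to a known tool is escalated, so the cloud path absorbs the long tail while the device handles the high-frequency head.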
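And for the second pattern, domain fine-tuning starts with data. The record below sketches what one training example mapping an utterance to a proprietary API call might look like; the field names and JSONL layout are illustrative assumptions, not the official “Mobile Actions” schema.

```python
import json

# One illustrative training example for a proprietary enterprise API.
# The field names and layout are assumptions, not Google's official
# "Mobile Actions" schema; adapt them to your training library's format.
record = {
    "tools": [{
        "name": "create_invoice",  # hypothetical internal API
        "parameters": {"customer_id": "string", "amount_cents": "integer"},
    }],
    "input": "Bill Acme Corp forty-two dollars",
    "output": {
        "name": "create_invoice",
        "arguments": {"customer_id": "acme_corp", "amount_cents": 4200},
    },
}

# Fine-tuning datasets are commonly stored as one JSON object per line.
with open("train.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(record) + "\n")
```

A few thousand records in this spirit, fed through any of the supported libraries (Transformers, Keras, Unsloth, NVIDIA NeMo), is the kind of “recipe” the release is designed around.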
Licensing: Mostly Open, With Built‑In Safeguards

FunctionGemma is released under Google’s custom Gemma Terms of Use, not under a standard open-source license like MIT or Apache 2.0. For enterprise and commercial teams, understanding this difference is essential.
Google describes Gemma as an “open model,” but it does not fully qualify as “open source” under the Open Source Initiative (OSI) definition.
Under the Gemma Terms, you can use the model commercially without restriction, and you can modify and redistribute it. However, the license also includes clear Usage Restrictions: developers are not allowed to use the model for certain activities, such as generating hate speech or malware, and Google reserves the right to update these terms over time.
For the vast majority of startups and developers, the license is still permissive enough to build and sell commercial products with FunctionGemma. However, teams working on dual-use technologies, or those who need strict copyleft freedom, should carefully read the clauses on “Harmful Use” and attribution before fully adopting the model in their stack.



