
Google DeepMind Releases Gemma 4 12B On-Device Multimodal Model

Google DeepMind released Gemma 4 12B on June 3, 2026. The encoder-free multimodal model runs on standard laptops with 16GB of RAM, delivers performance nearing that of the 26B MoE variant, and supports a context window of up to 256K tokens and more than 140 languages.

June 7, 2026

A person sits at a desk in a home office using a standard laptop to interact with an on-device multimodal AI model, with the laptop screen visible alongside a smartphone for cross-device input. — Illustration: AI Intel Report

Gemma 4 12B is a unified, encoder-free multimodal model in the Gemma 4 family developed by Google DeepMind for on-device deployment.

Google released the model on June 3, 2026. It uses a single decoder-only transformer that projects raw image patches and audio waveforms directly into the embedding space via lightweight linear layers.
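
The projection step described above can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the patch size, the embedding width, the random weights, and the helper names `patchify` and `linear_project` are hypothetical and are not taken from Google's implementation. The sketch only shows the general idea of cutting an image into patches and mapping each patch into the same embedding space as text tokens with a single linear layer, with no separate vision encoder in between.

```python
import random


def patchify(image, patch):
    """Split an H x W grayscale image (a list of rows) into flattened patches."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image sides must be multiples of the patch size")
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            patches.append([image[top + r][left + c]
                            for r in range(patch) for c in range(patch)])
    return patches


def linear_project(vectors, weight, bias):
    """Apply one linear layer: out[j] = sum_i v[i] * weight[i][j] + bias[j]."""
    return [[sum(v[i] * weight[i][j] for i in range(len(v))) + bias[j]
             for j in range(len(bias))]
            for v in vectors]


# Hypothetical toy sizes, chosen only to keep the example small.
PATCH, EMBED_DIM = 4, 8
rng = random.Random(0)
image = [[rng.random() for _ in range(8)] for _ in range(8)]
weight = [[rng.uniform(-0.1, 0.1) for _ in range(EMBED_DIM)]
          for _ in range(PATCH * PATCH)]
bias = [0.0] * EMBED_DIM

# Each 4x4 patch becomes one embedding vector ("image token").
image_tokens = linear_project(patchify(image, PATCH), weight, bias)

# Stand-in for three text-token embeddings of the same width.
text_tokens = [[0.0] * EMBED_DIM for _ in range(3)]

# Text and image tokens share one sequence that a decoder would consume.
sequence = text_tokens + image_tokens
print(len(sequence), len(sequence[0]))  # prints "7 8"
```

In this sketch an 8x8 image yields four patch tokens, which are concatenated with the three placeholder text tokens into a single sequence. Audio could be handled the same way in principle, by slicing a waveform into fixed-length frames instead of patches.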

Details are available from the Google blog post announcing the model and the official Gemma model card.

What benchmarks has Gemma 4 12B achieved?

According to Google, Gemma 4 12B scores 77.2% on MMLU Pro, 72.0% on LiveCodeBench v6, and 69.1% on MMMU Pro (vision).

How does the encoder-free design benefit on-device AI?

By eliminating dedicated encoders, the architecture reduces complexity and memory requirements, allowing the 12B model to run efficiently on standard consumer laptops equipped with 16GB of RAM or unified memory.
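
A rough back-of-the-envelope calculation shows why numeric precision matters for the 16GB target. The sketch below estimates the memory taken by the weights alone at several precisions. The precision levels are assumptions for illustration, since the article does not say which format the model ships in, and the estimate ignores activations, the KV cache, and runtime overhead.

```python
PARAMS = 12_000_000_000  # 12 billion parameters, as stated in the article
GIB = 1024 ** 3          # bytes per gibibyte


def weight_memory_gib(params, bits_per_param):
    """Memory for the weights only, in GiB, at a given precision."""
    return params * bits_per_param / 8 / GIB


# Assumed precision options; the source does not specify which one is used.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: {weight_memory_gib(PARAMS, bits):.1f} GiB")
# 16-bit: 22.4 GiB
# 8-bit: 11.2 GiB
# 4-bit: 5.6 GiB
```

Under these assumptions, 16-bit weights alone would exceed 16GB, so running on such a laptop would likely depend on a reduced-precision format, though the source does not confirm which one.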

"Gemma 4 12B delivers performance nearing our larger 26B MoE model on standard benchmarks, but at less than half the total memory footprint."
Olivier Lacombe and Gus Martins, Director of Product Management and Product Manager, Google DeepMind

What are the main specifications of the Gemma 4 12B model?

Gemma 4 12B Key Specifications
Specification	Details
Parameters	12 billion
Context Window	Up to 256K tokens
Languages Supported	Over 140
License	Apache 2.0
Availability	Hugging Face as google/gemma-4-12B-it

What input modalities does Gemma 4 12B support?

Text inputs and generation
Image understanding and analysis
Audio waveform processing
Video input handling

The model is part of a broader family including E2B, E4B, 26B A4B MoE, and 31B variants. It is optimized for Google AI Edge and LiteRT-LM.

Weights are released under Apache 2.0 and accessible via Hugging Face.

Frequently asked questions

When was the Gemma 4 12B model released by Google?

Google released Gemma 4 12B on June 3, 2026, as a new addition to its open model family.

On what hardware can Gemma 4 12B run?

The model is designed to run on standard consumer laptops with 16GB of RAM or unified memory.

Sources

Google — Gemma 4 12B delivers performance nearing our larger 26B MoE model on standard benchmarks, but at less than half the total memory footprint.
Google — Gemma 4 12B Unified achieves 77.2% on MMLU Pro, 72.0% on LiveCodeBench v6, and 69.1% on MMMU Pro (vision).