Google’s Gemma 3n brings powerful AI to edge devices

The Rundown: Google launched the full version of Gemma 3n, its new family of open AI models, designed to bring powerful multimodal capabilities to mobile and consumer edge devices. The models come in two sizes, E2B and E4B, with memory footprints comparable to 2B and 4B parameter models.

The details:

  • The new models natively understand images, audio, video, and text, while being efficient enough to run on hardware with as little as 2GB of RAM.
  • Built-in vision capabilities analyze video at up to 60 fps on Pixel phones, enabling real-time object recognition and scene understanding.
  • Gemma’s audio features translate across 35 languages and convert speech to text for accessibility applications and voice assistants.
  • Gemma’s larger E4B version becomes the first model under 10B parameters to surpass a 1300 score on the competitive LMArena benchmark.

Why it matters: The full Gemma 3n release is another impressive launch from Google, with models continuing to get more capable even as they shrink to fit consumer hardware. A small, open model that runs locally opens up a wide range of intelligent on-device use cases.

Source: The Rundown