Gemma: Google’s New Open-Source AI Model

March 5, 2024

Gemma is a family of open models built from the same research and technology as the Gemini models and designed for responsible AI development. Google is introducing this new generation of open models to help developers and researchers build AI responsibly.

Gemma Open Models

Google is releasing tools to support developer innovation, encourage collaboration, and ensure responsible use of Gemma models.

Here are some key details to know about Gemma:

  • Google is releasing two sizes of model weights: Gemma 2B and Gemma 7B, each with pre-trained and instruction-tuned variants.
  • The Responsible Generative AI Toolkit provides guidance and tools to help create safer AI applications using Gemma.
  • Google is providing toolchains for inference and supervised fine-tuning (SFT) across all major frameworks: JAX, PyTorch, and TensorFlow through native Keras 3.0.
  • Ready-to-use Colab and Kaggle notebooks, alongside integration with popular tools such as Hugging Face, MaxText, NVIDIA NeMo and TensorRT-LLM, make it easy to get started with Gemma.
  • Gemma models, both pre-trained and instruction-tuned, can run on your laptop, workstation, or Google Cloud, with easy deployment options on Vertex AI and Google Kubernetes Engine (GKE).
  • Gemma is optimized for top performance on various AI hardware platforms, including NVIDIA GPUs and Google Cloud TPUs.
  • The terms of use allow all organizations, regardless of size, to use and distribute Gemma responsibly for commercial purposes.
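To make "instruction-tuned variant" concrete: the instruction-tuned Gemma checkpoints expect conversations wrapped in dedicated turn markers (`<start_of_turn>` / `<end_of_turn>`) rather than raw text. A minimal sketch of that prompt format follows; the `format_gemma_prompt` helper is our own illustration, not part of any Gemma library.

```python
def format_gemma_prompt(user_message: str) -> str:
    """Wrap a user message in the turn markers the instruction-tuned
    Gemma variants expect, leaving the model turn open for generation."""
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

# The resulting string is what you would pass to the tokenizer.
print(format_gemma_prompt("What is the capital of France?"))
```

The pre-trained (non-instruction-tuned) variants take plain text and are better suited as a base for your own fine-tuning.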

Gemma’s Performance Capability

Gemma models share technical and infrastructure components with Gemini, Google’s largest and most capable AI model widely available today. This shared foundation allows Gemma to outperform other open models of similar size. Notably, Gemma surpasses significantly larger models on key benchmarks while meeting Google’s strict standards for safe and responsible outputs.

Gemma’s Responsible AI Design

Gemma is built in accordance with Google’s AI Principles to ensure safety and reliability. Google used automated techniques to filter personal and sensitive information out of the training data, and fine-tuned the models with human feedback to encourage responsible behavior. To further ensure safety, Gemma models underwent thorough testing, including manual and automated checks and evaluations designed to surface potential risks.

Google is also introducing a new Responsible Generative AI Toolkit to help developers and researchers build safe and responsible AI applications. The toolkit offers a methodology for building robust safety classifiers from a minimal number of examples, debugging tools to inspect and troubleshoot Gemma’s behavior, and best-practice guidance for building and deploying large language models, drawn from Google’s experience.

Compatibility and Optimization of Gemma

You can customize Gemma models to suit specific tasks using your own data. Gemma is compatible with various tools and systems:

  • Multi-framework tools: You can use your preferred framework with Gemma, which supports Keras 3.0, PyTorch, JAX, and Hugging Face Transformers.
  • Cross-device compatibility: Gemma models can run on a range of devices, including laptops, desktops, mobile devices, IoT devices, and the cloud, making AI accessible across platforms.
  • Advanced hardware platforms: Google teamed up with NVIDIA to make Gemma work efficiently on NVIDIA GPUs, from data centers to local RTX AI PCs, ensuring quality performance and integration with the latest technology.
  • Optimized for Google Cloud: Vertex AI provides a broad set of MLOps tools and easy deployment options, and you can customize Gemma on managed Vertex AI or on self-managed GKE, running on GPU, TPU, or CPU.
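The multi-framework support above rests on Keras 3.0’s pluggable backends: the same Keras model code can run on JAX, TensorFlow, or PyTorch, with the backend chosen via the `KERAS_BACKEND` environment variable before Keras is first imported. A minimal sketch of that mechanism (the `select_backend` helper is illustrative, not a Keras API):

```python
import os

# Keras 3 reads KERAS_BACKEND once, at import time, so the variable
# must be set before `import keras` runs anywhere in the process.
VALID_BACKENDS = {"jax", "tensorflow", "torch"}

def select_backend(name: str) -> str:
    """Illustrative helper: validate and export the backend choice."""
    if name not in VALID_BACKENDS:
        raise ValueError(f"unknown Keras backend: {name!r}")
    os.environ["KERAS_BACKEND"] = name
    return os.environ["KERAS_BACKEND"]

select_backend("jax")
# An `import keras` after this point would use the JAX backend; the
# same Gemma fine-tuning script could switch to "torch" or
# "tensorflow" by changing only this one line.
```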

Why We Care

Gemma provides customizable open models that can be tailored to specific tasks. With Gemma, businesses can develop more efficient and personalized AI-driven applications, leading to better user experiences and increased customer engagement. Moreover, Gemma’s emphasis on safety and reliability aligns with responsible AI practices, reducing the risk of deploying flawed or biased AI solutions.