Introducing Gemini: Google's Largest and Most Capable AI Model

Described as Google’s most capable and general AI model, Gemini can run on everything from data centers to mobile devices. It is the successor to LaMDA and PaLM 2, and it is expected to change the way developers and enterprise customers use AI:

Gemini is the result of large-scale collaborative efforts by teams across Google, including colleagues at Google Research. It was built from the ground up to be multimodal, which means it can generalize and seamlessly understand, operate across and combine different types of information including text, code, audio, image and video.

According to Google, Gemini 1.0, the first version, is optimized for three different sizes:

Gemini Ultra — our largest and most capable model for highly complex tasks.
Gemini Pro — our best model for scaling across a wide range of tasks.
Gemini Nano — our most efficient model for on-device tasks.

From natural image, audio and video understanding to mathematical reasoning, Gemini Ultra’s performance exceeds current state-of-the-art results on 30 of the 32 widely used academic benchmarks in large language model (LLM) research and development:

With a score of 90.0%, Gemini Ultra is the first model to outperform human experts on MMLU (massive multitask language understanding), which uses a combination of 57 subjects such as math, physics, history, law, medicine and ethics for testing both world knowledge and problem-solving abilities.

Google’s new approach to MMLU enables Gemini to think more carefully before answering difficult questions, rather than answering them instantly.

Because Gemini can process text, images and audio simultaneously, it can grasp nuanced information and answer complex questions, which makes it especially adept at reasoning in subjects such as math and physics.

Gemini comprehends and produces high-quality code in some of the world’s most popular programming languages, including Python, Java, C++ and Go. Its ability to work across languages and reason about complex information makes it one of the leading foundation models for coding in the world:

Using a specialized version of Gemini, we created a more advanced code generation system, AlphaCode 2, which excels at solving competitive programming problems that go beyond coding to involve complex math and theoretical computer science.

Gemini has been developed to be more reliable, scalable and efficient than earlier, smaller and less capable models, and it was trained on Google’s in-house designed Tensor Processing Units (TPUs) v4 and v5e. The Cloud TPU v5p, designed for training cutting-edge AI models, is Google’s most powerful, efficient and scalable TPU system to date.

Gemini’s responsibility and safety goals include applying Google’s AI Principles and safety policies across its products, testing for potential risks at each stage of development, and working to mitigate them:

Gemini has the most comprehensive safety evaluations of any Google AI model to date, including for bias and toxicity. We’ve conducted novel research into potential risk areas like cyber-offense, persuasion and autonomy, and have applied Google Research’s best-in-class adversarial testing techniques to help identify critical safety issues in advance of Gemini’s deployment.

Gemini is already featured, or soon will be, in several Google products:

Pixel 8 Pro is the first smartphone engineered to run Gemini Nano, which is powering new features like Summarize in the Recorder app and rolling out in Smart Reply in Gboard, starting with WhatsApp.
Gemini will be available in more of our products and services like Search, Ads, Chrome and Duet AI.
