Google’s AI Vision: Building smarter solutions with advanced multimodal models from the Gemini family

Discover how Google's Gemini model and integrated AI stack are setting new industry standards for smarter, faster, and easier AI development.

Monday, June 17, 2024

Seema Ramachandra, Head of Customer Engineering at Google, recently spoke about how developers can leverage the Google stack to develop smarter, faster, and easier AI solutions. She was speaking at DevSparks 2024, a technology-focused summit by YourStory that brought together technology leaders, visionaries, and experts on a common platform.

In her keynote address, Ramachandra highlighted Google's long-standing focus on data analytics and AI, citing examples such as Google Search's ML-led suggestions and the company's translation services in 130 different languages as early as 2006.

She then introduced Google's latest innovation, Gemini, a powerful multimodal model with a context window of one million tokens. “This means that developers can process one hour of video, 30,000 lines of code, and about 700,000 words all in a single stream when building prompts for Gemini,” Ramachandra said.
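As a rough illustration of the figures quoted above, one million tokens for about 700,000 words works out to roughly 0.7 words per token. The Python sketch below uses that ratio to estimate whether a piece of text would fit in the window. The function names are hypothetical and counting words by whitespace is an assumption. Real token counts depend on the model's tokenizer, so treat the result as a back-of-the-envelope estimate, not an exact count.

```python
# Figures quoted in the keynote: about 700,000 words in a 1,000,000-token window.
CONTEXT_WINDOW_TOKENS = 1_000_000
WORDS_PER_WINDOW = 700_000


def estimate_tokens(word_count: int) -> int:
    """Estimate tokens from a word count using the quoted ratio, rounded up."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-word_count * CONTEXT_WINDOW_TOKENS // WORDS_PER_WINDOW)


def fits_in_context(text: str) -> bool:
    """Return True if the text's estimated token count fits in the window."""
    word_count = len(text.split())  # assumption: words are whitespace-separated
    return estimate_tokens(word_count) <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    print(estimate_tokens(7_000))
    print(fits_in_context("a short prompt"))
```

An exact count would have to come from the model's own tokenizer; the ratio here only reflects the round numbers mentioned in the talk.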

Gemini is natively multimodal, cutting down on the need to process text, images, and video with different models. It also exceeds current state-of-the-art results on 30 out of 32 industry benchmarks, has sophisticated reasoning capabilities, and outperforms human experts on MMLU, which covers over 16,000 questions across 57 subjects, she added.

Design, build, and test with Gemini

Gemini can be used during the design phase to create a PRD (Product Requirements Document) for you. “It can generate text, summarise text, and bring it to you in a consumable way in the design phase. In the build phase, Gemini can build code for you, help complete code, document code, translate code and generate test cases.

“Once deployed, it can suggest the optimal production environment and help with post-deployment operations such as monitoring and processing logs that often run into terabytes,” Ramachandra said.

The importance of infrastructure and platform

Ramachandra emphasised that AI development needs infrastructure and a platform that cover the full lifecycle: building data pipelines, engineering features suited to the model, building the model, fine-tuning it to improve accuracy, deploying it, and, most importantly, monitoring it continuously and automatically so that retraining is triggered when drift occurs.
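The last step, automated monitoring that triggers retraining on drift, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how Vertex AI implements monitoring: the function names, the single-feature setup, and the threshold of 0.5 are all hypothetical. Drift is measured here simply as the shift in a feature's mean, expressed in baseline standard deviations.

```python
from statistics import mean, stdev


def drift_score(baseline: list[float], recent: list[float]) -> float:
    """Shift in the mean of recent values, in baseline standard deviations."""
    spread = stdev(baseline)  # requires at least two baseline values
    shift = abs(mean(recent) - mean(baseline))
    if spread == 0:
        # A constant baseline: any shift at all counts as maximal drift.
        return 0.0 if shift == 0 else float("inf")
    return shift / spread


def should_retrain(
    baseline: list[float], recent: list[float], threshold: float = 0.5
) -> bool:
    """Flag retraining when the drift score exceeds a hypothetical threshold."""
    return drift_score(baseline, recent) > threshold


if __name__ == "__main__":
    training_values = [1.0, 2.0, 3.0, 4.0, 5.0]
    production_values = [6.0, 7.0, 8.0, 9.0, 10.0]
    if should_retrain(training_values, production_values):
        print("Drift detected: schedule retraining")
```

In practice a monitoring job would run a check like this on a schedule for each feature and use a more robust statistic, but the control flow of comparing, thresholding, and triggering stays the same.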

“All of this is very important when dealing with models at scale, which is why Google offers a fully vertically integrated stack that includes infrastructure, models and a platform called Vertex AI for end-to-end ML ops,” Ramachandra said.

This stack includes Tensor Processing Units (TPUs), which are purpose-built to accelerate machine learning training and inference. “Your price:performance ratio gets a huge boost when working with our TPUs,” she added.

Additionally, Google offers access to a wide range of models through a single pane of glass, the Model Garden in Vertex AI. This includes Google's own models and more than 130 models from the open-source community and Google’s partner ecosystem. Google’s models span foundation models such as Gemini, PaLM, and Imagen; task-specific models for Speech-to-Text, Text-to-Speech, and Video Intelligence; and domain-specific models such as MedLM for healthcare and Sec-PaLM for security.

Security and responsibility

Ramachandra also highlighted the importance of security and responsibility in AI development.

“Google ensures that customer data is not used to train Google’s foundation models. All data and models reside within the customer's Google tenant, with explicit permission required for access. This provides complete auditability and transparency,” she added.

In conclusion, Ramachandra emphasised that great models built on great platforms are only beginning to become more powerful and scalable, making them indispensable tools for pushing the boundaries of what’s possible with technology. Google's focus on innovation and responsibility in AI development ensures that developers can build smarter, faster, and easier AI solutions while maintaining the highest levels of security.
