Google I/O 2024 kicked off with a keynote address focused on Gemini, the company’s artificial intelligence (AI) model, which is set to gain new capabilities and become the foundational model powering Google services such as Search, Photos, Workspace, Android, and more.
With Gemini, Google said, the goal is to make AI helpful for everyone. On that note, Google announced that it is expanding “AI overviews in Search” to everyone in the US this week and to more countries soon. While this was a long time coming, Google threw in a surprise with the Gemini-powered “Ask Photos” feature for Google Photos. It essentially lets you search your entire Google Photos library and follow up on the results with more complex prompts. More details on “Ask Photos” will be available later this year, when the feature is slated to roll out.
As for Gemini itself, Google said the model has been updated with new capabilities. The new and improved version, called Gemini 1.5 Pro, will be available to all developers globally. In addition, Google announced that Gemini 1.5 Pro with a one-million-token context window is now directly available to consumers in Gemini Advanced. It can be used across 35 languages.
Here is a roundup of everything Google announced at the I/O 2024 keynote:
Gemini in Workspace
Google said that it is rolling out the Gemini 1.5 Pro model to its paid-tier customers with a new side panel in Workspace apps such as Gmail, Drive, Docs, Sheets, and more. The side panel resembles Microsoft’s Copilot side panel on desktops and makes AI easier to reach from any Workspace app.
Another feature coming to Workspace is the new Gemini AI Teammate, which is essentially an AI-powered assistant for Workspace apps. The AI Teammate has its own Google Account and can be added to group conversations in Google Chat.
Google Project Astra
Google’s Project Astra is a multimodal AI agent with real-time spatial understanding. Google said that the AI agent is capable of understanding objects in a physical space and can process the data in real time. It can basically watch and remember what it sees through your device’s camera and can respond to prompts based on it. Google said that the AI agent will be powering the company’s Gemini product starting later this year.
AI in Search
One of the biggest takeaways from Google’s announcement is AI in Search. The search engine will soon get the ability to analyze and search based on video inputs, similar to how it does with images using Google Lens.
Google said that Search is backed by a custom Gemini AI model and gets improved contextual understanding. Search results get AI-powered overviews, which were previously part of the Search Generative Experience (SGE) and were available as an experimental feature. Leveraging Gemini, Google said, Search can also break longer queries into smaller parts for better understanding.
Circle to Search
Circle to Search for Android is set to get new features. Google said that the updated version of the feature will allow users to simply circle a mathematical problem, and Google’s AI will provide steps that should make the problem easier to solve.
Smarter Gemini Assistant for Android
Google said that the Gemini AI assistant for Android will soon be able to harness multimodality by understanding the video playing on the display and letting users ask questions based on the video. The assistant will also gain the ability to answer the user’s query based on a document such as a PDF file.
Gemini is also getting a new “Live” feature that will allow it to understand live video in real time and hold a more natural conversation with the user.
Gems: Custom Gemini chatbots
Google said that it will soon allow Gemini Advanced subscribers to create custom chatbots for carrying out a specific task. The feature is similar to custom GPTs on OpenAI’s ChatGPT.
Scam Call Detection on Android
Using the on-device Gemini Nano model, select Android-powered smartphones will soon be able to detect whether an incoming phone call is a scam. Google said that the feature will analyze conversation patterns during the call and notify the user if it thinks the ongoing call is a scam. According to Google, call data will be processed on the device for privacy and security.
Google Veo
Google is set to rival OpenAI’s Sora with its new generative AI model called Veo, which the company said will be able to generate videos in 1080p resolution. The model will generate videos based on text, image, and video-based prompts and will allow users to further edit the generated video with more prompts.
The article originally appeared on Business Standard.