What is Project Astra?



Google recently presented an early version of Project Astra at its annual developer conference.

About Project Astra: 

  • It is a new multimodal AI agent developed by Google.
  • It can answer questions in real time, whether they are posed through text, video, images, or speech, by pulling up the relevant information.
  • It can see the world through the phone’s camera and remember where one has left a thing.
  • It can even tell whether a piece of computer code is correct by looking at it through the phone’s camera.
  • Its voice is straightforward, without a wide range of emotional expression.
  • It is not limited to smartphones. Google also showed it being used with a pair of smart glasses.
  • Project Astra can learn about the world, which is meant to bring it as close as possible to the experience of having a human assistant.

What is a multimodal AI model?

  • A multimodal model is an ML (machine learning) model that is capable of processing information from different modalities, including images, videos, and text.
    • For example, Google's multimodal model, Gemini, can receive a photo of a plate of cookies and generate a written recipe as a response and vice versa.
  • Multimodal models expand on generative capabilities by working across several modalities rather than text alone. Multimodality can be thought of as giving AI the ability to process and understand different sensory modes.
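
As a rough illustration of the idea above, a multimodal prompt can be pictured as a list of parts, each tagged with its modality. The Python sketch below is purely illustrative: `Part`, `modalities_in`, and `is_multimodal` are hypothetical names invented here, not part of Gemini or any Google API, and no model is actually called.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Part:
    """One piece of a prompt (hypothetical structure, for illustration only)."""
    modality: str  # e.g. "text", "image", "video", or "audio"
    content: str   # the text itself, or a placeholder description of the media


def modalities_in(prompt: List[Part]) -> List[str]:
    """Return the distinct modalities used in a prompt, in first-seen order."""
    seen: List[str] = []
    for part in prompt:
        if part.modality not in seen:
            seen.append(part.modality)
    return seen


def is_multimodal(prompt: List[Part]) -> bool:
    """A prompt is multimodal if it mixes more than one modality."""
    return len(modalities_in(prompt)) > 1


# The cookie example from the text: a photo plus a written request.
prompt = [
    Part("image", "photo of a plate of cookies"),
    Part("text", "Write a recipe for what is in this picture."),
]

print(modalities_in(prompt))
print(is_multimodal(prompt))
```

A text-only prompt would contain a single modality, so `is_multimodal` would return `False` for it. A real multimodal model goes further than this bookkeeping: it interprets the parts jointly to produce its answer.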

Q1: What is Generative AI?

Generative AI, or generative artificial intelligence, is a form of artificial intelligence (AI) in which algorithms automatically produce content in the form of text, images, audio, and video.

Source: AI’s ‘Her’ moment: OpenAI’s GPT-4o and Google’s Project Astra make real-life strides