Google teases an AI camera feature ahead of I/O that looks better than Rabbit R1’s


[Image: Screenshot of Gemini AI. Google teases advancements in Gemini's multimodal AI capability. Credit: Google]

Ahead of its much-anticipated annual I/O event, Google released a teaser video on X showing off new multimodal AI functionality that is sure to have the makers of the Rabbit R1 quaking in their boots.

Also: What to expect from Google I/O 2024: Android 15, Gemini, Wear OS, and more

In the video, the user holds up their (Android) phone’s camera to the I/O stage and asks, “What do you think is happening here?” Google’s AI model Gemini responds, “It looks like people are setting up for a large event, perhaps a conference or a presentation.” Then, Gemini asks its own question: “Is there something in particular that catches your eye?”

When the user asks Gemini what the large letters on the stage mean, Gemini correctly identifies Google’s I/O developer conference. The question likely helped the AI gain contextual information, which in turn positioned it to provide more useful answers. The chatbot then follows up with another question: “Have you ever attended Google I/O?” The conversation appears natural and effortless, at least in the video.

During its R1 launch demo in April, Rabbit showed off similar multimodal AI technology that many lauded as an exciting feature. Google's teaser video proves the company has been hard at work developing similar functionality for Gemini that, from the looks of it, might even be better.

Google and Rabbit aren't alone. Also today, OpenAI showed off its own updates during its OpenAI Spring Update livestream, including GPT-4o, its newest AI model that now powers ChatGPT to "see, hear, and speak." During the demo, presenters showed the AI a host of different things via their smartphone's camera, including a handwritten math problem and the presenter's facial expressions, and the AI correctly identified them through a similar conversational back-and-forth.

Also: Why Meta’s Ray-Ban Smart Glasses are my favorite tech purchase this year

When Google updates Gemini on mobile with this feature, the company’s technology could jump to the front of the pack in the AI assistant race, particularly with Gemini’s exceedingly natural-sounding cadence and follow-up questions. Although the exact breadth of capabilities will be revealed at I/O, this development certainly puts Rabbit in a tricky position, making one of its standout features essentially redundant.

As with any demo that isn’t shown off live, you should take this one with a grain of salt. Still, the strategic release of this video just an hour before OpenAI’s livestream suggests Google will have a lot more to say about Gemini this week. 
