
Google is integrating AI features from its Bard chatbot into Google Assistant, enabling the virtual assistant to interpret images and access data from documents and emails.
In May, Google responded to OpenAI’s ChatGPT by launching its own generative AI features across products like its search engine and the Android operating system, and introducing its chatbot, Bard. However, Google Assistant, the company’s rival to Siri and Alexa, did not receive these AI updates.
At its Pixel hardware event in New York today, Google announced a long-awaited update to Google Assistant. Sissie Hsiao, Google’s VP and GM for Google Assistant, introduced a new version that combines the capabilities of Google Assistant and Bard. This “multimodal” assistant goes beyond voice commands by also interpreting images and managing a variety of tasks, such as trip planning, summarizing emails, and even crafting social media captions, Hsiao shared in an interview with WIRED.
The new generative AI experience is still in its early stages, and according to Hsiao, it doesn’t yet qualify as an “app.” When asked about how it might look on users’ phones, company representatives were vague about its final form. (Was the announcement timed to coincide with the hardware event? Quite possibly.)
The Bard-powered Google Assistant will use generative AI to handle text, voice, or image queries and will respond in text or voice. Initially, it will be available only on mobile devices, to select users who opt in. It may function as a full-screen app or overlay on Android, and on iOS, it could be integrated into one of Google’s apps. This update follows Amazon’s Alexa becoming more conversational and OpenAI’s ChatGPT going multimodal. One unique feature of the upgraded Google Assistant is its ability to discuss the webpage a user is currently viewing on their phone.
The integration of generative AI into Google Assistant raises questions about how quickly Google will apply large language models to more of its products, potentially transforming their functionality and monetization strategies.
Gain of Function
Google has spent years highlighting the capabilities of Google Assistant, which launched in 2016, and has more recently promoted Bard as a conversational, AI-powered collaborator. So what happens when the two are combined in the existing Assistant app?
According to Hsiao, this integration merges the Assistant’s personalized support with Bard’s generative and reasoning abilities. For example, since Bard is now integrated with Google’s productivity apps, it can find and summarize emails or answer questions about work documents. These functions can now be accessed through Google Assistant, allowing users to request information about documents or emails via voice and have the summaries read aloud.
The new connection with Bard also enhances the Assistant’s ability to interpret images. While Google Lens can identify images or provide shopping links, the upgraded Assistant will understand the context of the photo. For instance, if you see a picture of a hotel on Instagram, you could ask the Assistant to find more details and check availability on your chosen dates.
This integration could make Google Assistant a powerful shopping tool, connecting products in images with online stores. Although Google hasn’t integrated commercial listings into Bard yet, Hsiao indicated this could be a future possibility based on user demand.
Proceed With Caution
When Google Assistant first launched in 2016, AI’s language capabilities were limited. The rise of large language models, like those powering ChatGPT, has revolutionized the ability of voice assistants to understand and respond to complex queries, making natural conversations possible.
David Ferrucci, CEO of Elemental Cognition and former lead of IBM’s Watson project, explains that large language models have simplified the development of useful voice assistants. Previously, creating systems that could understand complex commands required extensive hand-coding, making them fragile and prone to errors. “Large language models give you a huge lift,” he says.
However, Ferrucci cautions that these models aren’t ideal for providing precise and reliable information, and building truly useful voice assistants still demands careful engineering.
More advanced and lifelike assistants could influence user behavior. The widespread adoption of ChatGPT has led to confusion about its underlying technology and limitations. Motahhare Eslami, an assistant professor at Carnegie Mellon University, notes that the confidence of chatbots like ChatGPT can lead people to trust them more than they should.
Eslami also warns that users might anthropomorphize a voice assistant with fluent speech, blurring the line between human-like interactions and the technology’s actual capabilities. Additionally, there is a risk of these systems subtly reinforcing harmful biases, particularly around race. “I’m a fan of the technology, but it comes with limitations and challenges,” she says.
Tom Gruber, cofounder of Siri, believes large language models will significantly enhance voice assistants’ capabilities in the coming years but may also introduce new challenges. He sees personalization based on personal data as both a major opportunity and a risk. An assistant with access to a user’s emails, messages, and browsing history could provide valuable insights and assist in more natural conversations. However, it also creates a potential repository of sensitive data.
“It’s inevitable that we’ll build a personal assistant that acts as your memory, tracking your experiences and augmenting your cognition,” says Gruber. “Apple and Google could do this, but they must make strong guarantees to protect user data.”
Hsiao’s team is considering ways to further enhance Assistant using Bard and generative AI, potentially leveraging personal information to provide more customized responses or automate tasks like making reservations. However, she emphasizes that development in this area is still in the early stages. It will take time before virtual assistants are ready to handle complex tasks or use a user’s credit card. “Maybe in a few years, this technology will be advanced and trusted enough for that, but we’ll need to proceed cautiously and learn along the way,” she says.