
MacRumors

macrumors bot
Original poster
Apr 12, 2001
63,481
30,716


Apple researchers have developed an artificial intelligence system named ReALM (Reference Resolution as Language Modeling) that aims to radically enhance how voice assistants understand and respond to commands.


In a research paper (via VentureBeat), Apple outlines a new system for how large language models tackle reference resolution, which involves deciphering ambiguous references to on-screen entities, as well as understanding conversational and background context. As a result, ReALM could lead to more intuitive and natural interactions with devices.

Reference resolution is an important part of natural language understanding, enabling users to use pronouns and other indirect references in conversation without confusion. For digital assistants, this capability has historically been a significant challenge, limited by the need to interpret a wide range of verbal cues and visual information. Apple's ReALM system seeks to address this by converting the complex process of reference resolution into a pure language modeling problem. In doing so, it can comprehend references to visual elements displayed on a screen and integrate this understanding into the conversational flow.

ReALM reconstructs the visual layout of a screen using textual representations. This involves parsing on-screen entities and their locations to generate a textual format that captures the screen's content and structure. Apple researchers found that this strategy, combined with specific fine-tuning of language models for reference resolution tasks, significantly outperforms traditional methods, including the capabilities of OpenAI's GPT-4.
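The paper itself doesn't publish code, but the core idea — flattening on-screen entities and their positions into a textual representation an LLM can reason over — can be sketched roughly like this (a hypothetical illustration, not Apple's implementation; the entity fields and prompt format are assumptions):

```python
# Illustrative sketch of ReALM-style screen encoding (NOT Apple's actual code).
# On-screen entities with positions are sorted into reading order and
# rendered as numbered text lines, turning reference resolution into a
# plain language-modeling task: the model outputs the index of the
# entity the user's utterance refers to.

def encode_screen(entities):
    """Render on-screen entities as text, top-to-bottom, left-to-right."""
    ordered = sorted(entities, key=lambda e: (e["top"], e["left"]))
    return "\n".join(
        f'[{i}] {e["type"]}: "{e["text"]}"' for i, e in enumerate(ordered)
    )

def build_prompt(screen_text, user_request):
    """Pair the screen dump with the user's utterance for the model."""
    return f"Screen:\n{screen_text}\n\nUser: {user_request}\nReferenced entity:"

entities = [
    {"type": "phone_number", "text": "555-0123", "top": 120, "left": 40},
    {"type": "button", "text": "Call Business", "top": 200, "left": 40},
]
print(build_prompt(encode_screen(entities), "call the one at the bottom"))
```

A fine-tuned model given this prompt would be trained to answer with an entity index (here, `[1]`), which is what lets it resolve phrases like "the one at the bottom" without any vision pipeline.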

ReALM could enable users to interact with digital assistants much more efficiently by referring to what is currently displayed on their screen, without the need for precise, detailed instructions. This has the potential to make voice assistants much more useful in a variety of settings, such as helping drivers navigate infotainment systems while driving or assisting users with disabilities by providing an easier and more accurate means of indirect interaction.

Apple has now published several AI research papers. Last month, the company revealed a new method for training large language models that seamlessly integrates both text and visual information. Apple is widely expected to unveil an array of AI features at WWDC in June.

Article Link: Apple Researchers Reveal New AI System That Can Beat GPT-4
 

aknabi

macrumors 6502a
Jul 4, 2011
518
839
I assume anything their current research is talking about won't impact their offerings for several years, and in the meantime they'll do what they did with Maps: rely on someone else until they got their own solution "ready" (and even then there were bumps before it became a competitive offering, which will likely be even more true with AI).
 

cicalinarrot

macrumors 6502a
Apr 28, 2015
504
1,607
Better than GPT-4 like Gemini claimed, or for real?
And will they keep improving it or just wait for obsolescence like they did with Siri’s capabilities?
 

b-dogg

macrumors regular
Jul 10, 2009
164
205
It wouldn't be hard to beat Siri's voice recognition! Siri has been absolutely pitiful given the resources - both financial and in terms of data - it has had to fix it over the years.
 

antiprotest

macrumors 68040
Apr 19, 2010
3,980
13,880
I am invested in Apple products and services, so I am rooting for Apple as a self-serving customer. However, I won't believe anything Apple claims about AI until it happens. I will believe OpenAI, Google, Meta, Microsoft, even McDonald's about AI before I believe Apple.
 

CarAnalogy

macrumors 601
Jun 9, 2021
4,196
7,722
Unless I'm missing something deeper, pronoun resolution doesn't seem like a huge revolution. I've never had an issue using references with ChatGPT. I can give it a single-word follow-up prompt to something either of us just said and it always knows exactly what I meant.

Of course if current Siri is the bar, that won't be hard to beat. Even with the new follow-up feature in iOS 17 it still has zero context and is still worse than useless.
 

CarAnalogy

macrumors 601
Jun 9, 2021
4,196
7,722
antiprotest said: "I am invested in Apple products and services, so I am rooting for Apple as a self-serving customer. However, I won't believe anything Apple claims about AI until it happens. I will believe OpenAI, Google, Meta, Microsoft, even McDonald's about AI before I believe Apple."

I would be perfectly happy if for now they just put ChatGPT in front of Siri until they get their own thing working. It cannot possibly be worse than it is now.

At this point all the resources are out there for them to just duplicate an open source one on their own. They just have to spend the money and spin up the hardware to support it all.
 