A new entry in Apple's Machine Learning Journal provides a closer look at how hardware, software, and internet services work together to power the hands-free "Hey Siri" feature on the latest iPhone and iPad Pro models.

Specifically, a very small speech recognizer built into the embedded motion coprocessor runs all the time and listens for "Hey Siri." When just those two words are detected, Siri parses any subsequent speech as a command or query.

The detector uses a Deep Neural Network to convert the acoustic pattern of a user's voice into a probability distribution. It then uses a temporal integration process to compute a confidence score that the phrase uttered was "Hey Siri."
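
As a very loose illustration of that temporal integration step (not Apple's code; the frame rate, window length, and scoring math here are assumptions), the per-frame DNN probabilities can be averaged over a sliding window to produce a single confidence value:

```swift
import Foundation

// Illustrative only: average per-frame DNN probabilities for the
// "Hey Siri" phonetic states over a sliding window and keep the
// best-scoring window as the overall confidence.
// `frameProbabilities` is assumed to hold one value per ~10 ms frame.
func heySiriConfidence(frameProbabilities: [Double], windowSize: Int = 100) -> Double {
    guard windowSize > 0, frameProbabilities.count >= windowSize else { return 0 }

    var bestLogScore = -Double.infinity
    for start in 0...(frameProbabilities.count - windowSize) {
        let window = frameProbabilities[start..<(start + windowSize)]
        // Work in log space so a single near-zero frame doesn't underflow the product.
        let avgLogProb = window.reduce(0.0) { $0 + log(max($1, 1e-10)) } / Double(windowSize)
        bestLogScore = max(bestLogScore, avgLogProb)
    }
    return exp(bestLogScore) // back to a 0...1 confidence value
}
```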

If the score is high enough, Siri wakes up and proceeds to complete the command or answer the query automatically.

If the score exceeds Apple's lower threshold but not the upper threshold, however, the device enters a more sensitive state for a few seconds, so that Siri is much more likely to be invoked if the user repeats the phrase, even without more effort.

"This second-chance mechanism improves the usability of the system significantly, without increasing the false alarm rate too much because it is only in this extra-sensitive state for a short time," said Apple.

To reduce false triggers from strangers, Apple invites users to complete a short enrollment session in which they say five phrases that each begin with "Hey Siri." The examples are saved on the device.
Apple explains: "We compare the distances to the reference patterns created during enrollment with another threshold to decide whether the sound that triggered the detector is likely to be 'Hey Siri' spoken by the enrolled user. This process not only reduces the probability that 'Hey Siri' spoken by another person will trigger the iPhone, but also reduces the rate at which other, similar-sounding phrases trigger Siri."
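
One way to picture that enrollment check, purely as an illustration (the speaker vectors, distance metric, and threshold here are assumed, not taken from Apple's pipeline), is a nearest-neighbor comparison against the saved examples:

```swift
import Foundation

// Purely illustrative: accept the trigger only if a speaker vector computed
// from the utterance is close enough to one of the reference patterns saved
// during enrollment.
func matchesEnrolledUser(candidate: [Double],
                         enrollmentReferences: [[Double]],
                         maxDistance: Double = 0.35) -> Bool {
    func distance(_ a: [Double], _ b: [Double]) -> Double {
        sqrt(zip(a, b).map { ($0 - $1) * ($0 - $1) }.reduce(0, +))
    }
    // Accept only if the closest enrollment example is near enough.
    let nearest = enrollmentReferences.map { distance(candidate, $0) }.min() ?? .infinity
    return nearest <= maxDistance
}
```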
Apple also says it made "Hey Siri" recordings at both close and far distances, in a variety of environments such as the kitchen, the car, the bedroom, and a restaurant, from native speakers of many languages around the world.

For many more technical details about how "Hey Siri" works, be sure to read Apple's full article on its Machine Learning Journal.

Article Link: Apple Says 'Hey Siri' Detection Briefly Becomes Extra Sensitive If Your First Try Doesn't Work
 
"Always listening" ha.... first thing in the morning? Nope, won't work until I open my phone. It seems to stop 'listening' for me after about 2 hours of inactivity on my phone. And yes I have the always listening turned on.
 
There's a lot going on behind the scenes with Siri. I don't think we give her enough credit.
My biggest problem with Siri is not any of the silly one-off bugs that get memed to death on the internet.

My biggest problem was described beautifully in a recent article I read somewhere: a voice assistant with 10 working commands is great, and one with unlimited working commands is great, but one with hundreds of working commands is terrible, because the user will never know what all those commands are. They will just use the core few that they know, and if they try a command that isn't one of those hundreds, it causes confusion and doubt.
 
There has to be something new going on with regard to its understanding. After initiating Hey Siri in the car, her comprehension through the Bluetooth system has been notably better for me during the past year. I was wondering if the 7 was doing some machine learning locally, or if the cloud was parsing my questions better with this device. My 5S still struggles with my voice, though it could also be the mics on that device.

My biggest problem with Siri is not any of the silly one-off bugs that get memed to death on the internet.

My biggest problem was described beautifully in a recent article I read somewhere: a voice assistant with 10 working commands is great, and one with unlimited working commands is great, but one with hundreds of working commands is terrible, because the user will never know what all those commands are. They will just use the core few that they know, and if they try a command that isn't one of those hundreds, it causes confusion and doubt.

Apple would help Siri’s reputation a lot by keeping an active wiki going for the service and the commands it will respond to. Many people have tried to use Siri once for a specific task, found it didn’t work, and have since given up. It is really hard for a general user to discover new tricks.

Places like iMore do a decent job of documenting it, but it would be AWESOME if the source had a good manual for it.
 
Was wondering why "Hey Siri" on my iPhone 8 with iOS 11 now activates when other people say "Hey Siri." It also seems to activate if I use the word "Siri" in the middle of a sentence. My iPhone 5S with iOS 9 and 10 never had these issues, so it makes sense that something changed when they introduced always-on "Hey Siri."

Edit: Apple says in the document that the goal for the feature was one false detection of "Hey Siri" per week, and also one per week for falsely detecting "Hey Siri" from the wrong person. In light of this, it is functioning correctly according to design:

"So in addition to the offline measurements described previously, we estimate false-alarm rates (when Siri turns on without the user saying “Hey Siri”) and imposter-accept rates (when Siri turns on when someone other than the user who trained the detector says “Hey Siri”) weekly by sampling from production data, on the latest iOS devices and Apple Watch."
 
My 5S also struggles, to the point that I never ask Siri anything. Amazon's engine is much better. Alexa always gets my query, though its answers are still limited for obscure questions. But for music, HVAC control, timers, and alarms it's spot on.
 
I recently started using Hey Siri, and it works pretty well in my experience.

There are still too many web searches, and sometimes I wish it was more context aware. When I say, "what time is the pats game?" on a Sunday, I am not talking about the junior hockey team called the Regina Pats and their game against the Brandon Wheat Kings. Maybe if I was in Canada, this would be an appropriate answer, but it knows from location services that I am firmly in the epicenter of Patriots territory.

Also, follow-up questions should be context aware. "What is the high temperature tomorrow?" "What will be the low?" It doesn't ever get the second question.
 
There's a lot going on behind the scenes with Siri. I don't think we give her enough credit.

Is Siri a "participation trophy" kind of gal? Here inner working are interesting but ultimately her timely result is all that matters. If I as a consumer get frustrated with her it's a fail no matter how hard her coprocessor is churning.

94% of the time she works just fine. But those other 6% times I want to toss a brick at my iPad or smash my phone.
 
Turned off Siri a few weeks ago and haven't regretted it once.

It was constantly triggering from my watch and phone when I didn't want it to, even without "Hey Siri" turned on, and I would then have to dismiss it to do what I actually wanted to do. Then, when I would actually try to use Siri, I was disappointed with the results ~80% of the time.

Not only does Siri have a long way to go on things like context awareness, but until Apple makes Siri try to complete the task on-device first, I'm not interested in the least.
 
I always thought that it shouldn't be "Hey Siri"... it should be "Hey Apple." Better brand recognition. Just like "OK Google." It just sounds better.
 
I dislike Siri purely because of the snark and attitude. Siri's use case is to be my virtual assistant and to help me manage basic simple tasks. I can't imagine having a human assistant with the same level of snark as Siri.

I also can't stand Apple's attempts to make Siri cute on April Fools' Day. On many occasions over the past year, I've asked Siri to "flip a coin", a simple command with a simple answer that helps me and my coworkers settle on a place for lunch (as an example). Well, April 1st rolled around and Siri couldn't give a real answer to that question all day long. She kept claiming "the coin landed on its side, what are the chances..." I quit using Siri after that. Apple ruined a tool that was useful and served a purpose for the sake of playing "me too" with Google and other April Fools' Day gags.
 
I wonder if we will be able to use more than one "Hey Siri" voice on the HomePod. It would be nice if it could tell the difference between my son, my wife, and me just based on that command.
 