Apple Details How HomePod Can Detect 'Hey Siri' From Across a Room, Even With Loud Music Playing

holysmokes

macrumors newbie
Sep 9, 2015
22
13
0
I'd like to see a scientific experiment using the most popular smart speakers. Perhaps the higher price for the HomePod could be more easily justified if it is superior.
 

Machead2012

macrumors regular
Nov 16, 2011
135
85
0
Even with heavy discounts ($250 retail during Black Friday), the HomePod still flopped.
You cannot polish a turd, even if Siri hears you with the music playing.
 

WilliamG

macrumors G3
Mar 29, 2008
8,962
2,387
0
Seattle
www.bighugenerd.com
So I actually figured out why Siri does this. If you ask Siri to control something in the same room as the HomePod, it won't say anything; it assumes you can see the action occurring, since you're in that room. When you ask Siri to do something in another room, it lets you know, because you aren't there to see it.

Yeah, I get that, and I get why Apple made the decision. I've found Apple Music has most of the music I request, though, so when people come over I just tell them to request a song. Always works.
To your first response: nope. I can ask the same thing over and over and get varying behavior, whether it's in the room or not.
 

mi7chy

macrumors 603
Oct 24, 2014
5,863
6,807
0
There's nothing magical about far-field voice capture. Amazon is very open about it for developers.

https://developer.amazon.com/alexa-voice-service/dev-kits/

Choosing the Right Audio Front End

Number of Mics

How you want your users to interact with your product determines the number of microphones you select. Voice-enabled devices designed for closer, hands-free interaction can use 1 or 2-mic solutions, whereas far-field products with listening ranges from across the room can benefit from a 4 or 7-mic array. Keep in mind that additional mics may take up more physical space and add incremental costs to your product.

Mic Arrangement

Your product’s form factor determines the arrangement of microphones. Square or circular arrays in a horizontal plane are better for 360-degree, omni-directional listening often utilized in tabletop products like the Amazon Echo or Echo Dot. Linear arrays are better suited for uni-directional listening or wall-mounted products such as connected light switches and thermostats with Alexa built-in like the ecobee4.

Audio Algorithms

Voice processing algorithms enable your device to leverage the full capabilities of the mic array. Noise reduction improves speech recognition in noisy environments, beam forming helps locate the direction of speech, and acoustic echo cancellation allows the user to barge-in even when your device is playing loud music. These algorithms, combined with wake word engines, allow voice-initiated interactions and send clear, processed audio to the cloud.
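The acoustic echo cancellation Amazon mentions is classically done with an adaptive filter that learns the speaker-to-microphone echo path and subtracts its estimate from the mic signal. A minimal normalized-LMS sketch in Python (the function name and parameters are illustrative, not from any Amazon SDK):

```python
import numpy as np

def nlms_echo_cancel(mic, ref, taps=64, mu=0.5, eps=1e-8):
    """Cancel a playback echo from a microphone signal.

    mic : microphone samples (near-end speech + echo of `ref`)
    ref : far-end/playback reference samples
    Returns the error signal, i.e. mic with the estimated echo removed.
    """
    w = np.zeros(taps)            # adaptive FIR filter modeling the echo path
    out = np.zeros(len(mic))
    for n in range(taps - 1, len(mic)):
        x = ref[n - taps + 1:n + 1][::-1]   # most-recent-first reference window
        e = mic[n] - w @ x                  # mic minus estimated echo
        w = w + mu * e * x / (x @ x + eps)  # normalized LMS weight update
        out[n] = e
    return out
```

With only an echo present (no near-end talker), the output converges toward zero; anything left over after convergence is the "barge-in" speech the wake-word engine gets to hear.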
 

ersan191

macrumors 65816
Oct 26, 2013
1,148
1,526
0
The physical technology behind the assistant is best in class. After training my iPhone, Hey Siri works every time. But voice transcription and capabilities are far behind, I'm afraid. Siri gets what I say wrong all the time, and even when she understands what I said, she often gets the interpretation wrong. I'm actually surprised when she gives me the answer I need beyond the simple questions they demo during keynotes.
Agreed, it’s safe to assume they are working on something better though.
 

gnasher729

macrumors P6
Nov 25, 2005
16,488
3,039
0
There's nothing magical about far-field voice capture. Amazon is very open about it for developers.
I'm so impressed by what Amazon can do, and by how much effort you put into posting it on this site. Bezos will be thanking you and including you in his nighttime prayers.
 
  • Like
Reactions: tromboneaholic

Sinfonist

macrumors member
Jan 24, 2007
52
26
0
I'm sure they sound great too.
The 3rd generation are decent. For real listening I'll use my hi-fi, but for casual listening they're fine. The stereo mode is surprisingly good. For a bit more money (I paid £50), the full Echo has a better speaker, but the difference is no longer as great as it used to be. Apple Music will also apparently be available on them soon.

Not surprisingly, Amazon seem to have sold a lot of these over the Black Friday week - they're out of stock until the end of January.
 

Bacillus

Suspended
Jun 25, 2009
2,685
2,166
0
I'm so impressed by what Amazon can do, and by how much effort you put into posting it on this site. Bezos will be thanking you and including you in his nighttime prayers.
Indeed, it has little meaning other than demonstrating what Apple overhyped.
 

tromboneaholic

Suspended
Jun 9, 2004
3,710
2,924
0
Clearwater, FL
There's nothing magical about far-field voice capture. Amazon is very open about it for developers.

https://developer.amazon.com/alexa-voice-service/dev-kits/
If you think that compares to Apple's work on Machine Learning, you obviously didn't look at Apple's blog that's linked to from this article.

One of the main differences is that Apple is doing the work on the device in low power, while Amazon and Google are sending the data home to the mothership so it can be analyzed on their servers.

In addition to obvious security and privacy benefits, Apple's approach has implications for future low power wearable devices that might not always be connected to the cloud.
 
  • Like
Reactions: citysnaps

ipponrg

macrumors 68000
Oct 15, 2008
1,616
1,236
0
If you think that compares to Apple's work on Machine Learning, you obviously didn't look at Apple's blog that's linked to from this article.

One of the main differences is that Apple is doing the work on the device in low power, while Amazon and Google are sending the data home to the mothership so it can be analyzed on their servers.

In addition to obvious security and privacy benefits, Apple's approach has implications for future low power wearable devices that might not always be connected to the cloud.
Apple's efforts are somewhat admirable, given that machine learning usually requires beefy hardware. I have little to no faith in their efforts, though, as most of their first-party software is pretty awful.

Nonetheless, this leaves Apple as the underdog, with an opportunity to surprise the industry.
 

citysnaps

macrumors 603
Oct 10, 2011
5,024
7,497
0
San Francisco
www.citysnaps.net
If you think that compares to Apple's work on Machine Learning, you obviously didn't look at Apple's blog that's linked to from this article.

One of the main differences is that Apple is doing the work on the device in low power, while Amazon and Google are sending the data home to the mothership so it can be analyzed on their servers.

In addition to obvious security and privacy benefits, Apple's approach has implications for future low power wearable devices that might not always be connected to the cloud.
Absolutely!

Apple's implementation consumes just 15% of a single A8 core running at 1.4 GHz. And it works so astonishingly well. It pays to have your own proprietary silicon. And brilliant scientists and engineers.

Can't wait to see what new capabilities/features/performance HomePod 2 delivers.

Compare the content of Apple's white paper with the above Amazon link. There's no comparison.
 
  • Like
Reactions: tromboneaholic

pelegri

macrumors member
Sep 8, 2007
31
7
0
I wonder if they can use the results of their research to improve hearing aids. The boomers are getting older...
There's a huge market out there... Include it in AirPods and we'll look cool while hearing better :)
 

gblandon

macrumors newbie
Jan 6, 2012
2
2
0



In a new entry in its Machine Learning Journal, Apple has detailed how Siri on the HomePod is designed to work in challenging usage scenarios, such as during loud music playback, when the user is far away from the HomePod, or when there are other active sound sources in a room, such as a TV or household appliances.


An overview of the task: To accomplish this, Apple says its audio software engineering and Siri speech teams developed a multichannel signal processing system for the HomePod that uses machine learning algorithms to remove echo and background noise and to separate simultaneous sound sources to eliminate interfering speech.

Apple says the system uses the HomePod's six microphones and is powered continuously by its Apple A8 chip, including when the HomePod is run in its lowest power state to save energy. The multichannel filtering constantly adapts to changing noise conditions and moving talkers, according to the journal entry.
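Apple's multichannel filtering is proprietary, but the basic idea of pointing a six-mic circular array at a talker can be illustrated with a simple delay-and-sum beamformer. Everything below is an assumption for illustration (the 0.05 m ring radius and all names are made up, not the HomePod's actual geometry or algorithm):

```python
import numpy as np

FS = 16_000          # sample rate (Hz)
C = 343.0            # speed of sound (m/s)
RADIUS = 0.05        # assumed mic-ring radius in metres (illustrative)

def mic_delays(azimuth, n_mics=6, radius=RADIUS, c=C):
    """Per-mic arrival delays (seconds) for a plane wave from `azimuth`."""
    angles = 2 * np.pi * np.arange(n_mics) / n_mics
    # projection of each mic position onto the look direction
    return -radius * np.cos(angles - azimuth) / c

def delay_and_sum(channels, azimuth, fs=FS):
    """Steer a circular array toward `azimuth` via frequency-domain shifts."""
    n = channels.shape[1]
    freqs = np.fft.rfftfreq(n, 1 / fs)
    out = np.zeros(n)
    for ch, tau in zip(channels, mic_delays(azimuth, len(channels))):
        # undo each mic's geometric delay so the copies add coherently
        spec = np.fft.rfft(ch) * np.exp(2j * np.pi * freqs * tau)
        out += np.fft.irfft(spec, n)
    return out / len(channels)
```

Steering at the talker's direction sums the six copies coherently, while sound from other directions adds incoherently and is attenuated; real systems adapt these weights continuously rather than using fixed geometric delays.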

Apple goes on to provide a very technical overview of how the HomePod mitigates echo, reverberation, and noise, which we've put into layman's terms:

Echo Cancellation: Since the speakers are close to the microphones on the HomePod, music playback can be significantly louder than a user's "Hey Siri" voice command at the microphone positions, especially when the user is far away from the HomePod. To combat the resulting echo, Siri on HomePod implements a multichannel echo cancellation algorithm.
Reverberation Removal: As the user saying "Hey Siri" moves further away from the HomePod, multiple reflections from the room create reverberation tails that decrease the quality and intelligibility of the voice command. To combat this, Siri on the HomePod continuously monitors the room characteristics and removes the late reverberation while preserving the direct and early reflection components in the microphone signals.
Noise Reduction: Far-field speech is typically contaminated by noise from home appliances, HVAC systems, outdoor sounds entering through windows, and so forth. To combat this, the HomePod uses state-of-the-art speech enhancement methods that create a fixed filter for every utterance.
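Apple doesn't publish its speech enhancement code, but the "fixed filter for every utterance" idea can be sketched with a toy spectral-gain denoiser: estimate the noise spectrum from frames assumed to be speech-free, build one gain curve, and apply it to the whole utterance. This is a deliberately simplified stand-in, not Apple's method, and all names and parameters are illustrative:

```python
import numpy as np

def fixed_filter_denoise(x, frame=256, noise_frames=10, floor=0.1):
    """Very simplified per-utterance noise reduction.

    Estimates the noise power spectrum from the first `noise_frames`
    frames (assumed speech-free), builds ONE spectral gain for the whole
    utterance, and applies it frame by frame (no overlap-add, for brevity).
    """
    n_frames = len(x) // frame
    frames = x[:n_frames * frame].reshape(n_frames, frame)
    spec = np.fft.rfft(frames, axis=1)
    noise_pow = np.mean(np.abs(spec[:noise_frames]) ** 2, axis=0)
    sig_pow = np.mean(np.abs(spec) ** 2, axis=0)
    # Wiener-style gain, fixed for the utterance, with a spectral floor
    gain = np.maximum(1.0 - noise_pow / np.maximum(sig_pow, 1e-12), floor)
    return np.fft.irfft(spec * gain, frame, axis=1).ravel()
```

Because the gain is computed once per utterance rather than adapted per frame, it is cheap enough to run continuously on low-power hardware, which is the trade-off the bullet above alludes to.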
Apple says it tested the HomePod's multichannel signal processing system in several acoustic conditions, including music and podcast playback at different levels, continuous background noise such as conversation and rain, and noises from household appliances such as a vacuum cleaner, hairdryer, and microwave.

During its testing, Apple varied the locations of the HomePod and its test subjects to cover different use cases. For example, in living room or kitchen environments, the HomePod was placed against the wall and in the middle of the room.

Apple's article concludes with a summary of Siri performance metrics on the HomePod, with graphs showing that Apple's multichannel signal processing system led to improved accuracy and fewer errors. Those interested in learning more can read the full entry on Apple's Machine Learning Journal.

Article Link: Apple Details How HomePod Can Detect 'Hey Siri' From Across a Room, Even With Loud Music Playing
That’s lovely. A cheap $350 device can accomplish, from far away, what my $1,450 iPhone XS Max cannot from inches away.
 
  • Like
Reactions: KPandian1

farewelwilliams

macrumors 68020
Jun 18, 2014
2,234
9,467
0
They may not have all the same fancy technology.
You're literally proving my point. They don't have the fancy technology which means they can't do these key things that the machine learning article talks about:
Mask-Based Echo Suppression
Reverberation Removal
Mask-Based Noise Reduction
Unsupervised Learning with Top-Down Knowledge to Mitigate Competing Speech
Competing Talker Separation
Deep Learning–Based Stream Selection
So, no, your little £25 Dots do not do these things. And Amazon Dots can't even play loud music at max volume (I have one), so it doesn't even do what part of the title suggests.
 

MrGimper

macrumors 603
Sep 22, 2012
6,093
6,312
0
Andover, UK
You're literally proving my point. They don't have the fancy technology which means they can't do these key things that the machine learning article talks about:


So, no, your little £25 Dots do not do these things. And Amazon Dots can't even play loud music at max volume (I have one), so it doesn't even do what part of the title suggests.
But I don’t care what fancy names Apple gives this stuff or how they fluff it up, I just care how it affects my usage. My Echos can hear me and process what I say from other rooms when I have them playing music, or my Sonos, or my TV.

I don’t care about the steering rack in my car, I just care about turning the steering wheel and the front wheels turn, like it does on other cars.
 

Labrat561

macrumors newbie
Apr 20, 2009
25
9
0
Jupiter



I've got a grandson who makes all kinds of crazy sounds. This will be a good test.
 

The Game 161

macrumors Core
Dec 15, 2010
19,107
9,304
0
UK
But I don’t care what fancy names Apple give this stuff or how they fluff it up, I just care how it affects my usage. My echos can hear me and process what I say from other rooms when I have them playing music, or my Sonos, or my TV.

I don’t care about the steering rack in my car, I just care about turning the steering wheel and the front wheels turn, like it does on other cars.
As they should, since the music they play isn't loud at all.
 

Tech198

macrumors G5
Mar 21, 2011
13,887
1,623
0
Australia, Perth
All this is cool, but in reality, how many would use it in this situation? Just because you can, or because you must now that it's here?

Shouting "Hey Siri" across the room just because you're too lazy is not my idea of fun at a party.