If you read the patent application, noise suppression is only part of the picture. In earlier versions of speech recognition the computer would isolate the spectral pattern of the voice and use that pattern for analysis. The problem was that when a person moved their head, their spectral pattern would change.

The approach in this patent uses a phoneme database at the front end of a speech recognition engine connected to the monitor (the Acoustic Model Selector, ACM). The ACM selects the sounds (after noise suppression is performed) which best match regular speech sounds (vowels and consonants). They are passed to a back-end process which matches the phonemes to a language and then works out the command.

The combination of multiple mics and software algorithms helps in 'beam formation' of the voice, i.e. selecting the best signal-to-noise ratio, which is then used to eliminate noise. The characteristics of the set-up (mic placement, type, axis etc.) are stored in software and can be used by the speech recognition software to account for spectral characteristics. There also seem to be a few combinations of best mic placement, as well as a 4-mic option, depending on the display characteristics.

Beam formation is different to phase inversion as a noise reduction technique. My understanding of how this works is that the best signals from both highly directional mics are matched to form a beam, while other, weaker signals are suppressed.
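
To make the idea concrete, here is a rough Python sketch of the simplest textbook beamformer, delay-and-sum. This is my own illustration and not what the patent describes: the two-mic layout, the 3-sample arrival delay, the noise level and the names (delay_and_sum, rms) are all assumptions. The voice reaches both mics, so after lining the two signals up it adds in step, while the noise at each mic is independent and partly averages out.

```python
import math
import random


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def delay_and_sum(mic_a, mic_b, delay_b):
    """Shift mic_b earlier by delay_b samples, then average it with mic_a."""
    n = len(mic_b) - delay_b
    return [(mic_a[i] + mic_b[i + delay_b]) / 2 for i in range(n)]


random.seed(0)
N, DELAY = 4000, 3  # assumed: the voice reaches mic B 3 samples after mic A

voice = [math.sin(2 * math.pi * 0.01 * i) for i in range(N)]
noise_a = [random.gauss(0, 0.5) for _ in range(N)]  # independent noise per mic
noise_b = [random.gauss(0, 0.5) for _ in range(N)]

mic_a = [voice[i] + noise_a[i] for i in range(N)]
mic_b = [(voice[i - DELAY] if i >= DELAY else 0.0) + noise_b[i] for i in range(N)]

beam = delay_and_sum(mic_a, mic_b, DELAY)
residual = [beam[i] - voice[i] for i in range(len(beam))]

single_mic_noise = rms(noise_a)  # noise level heard by one mic alone
beam_noise = rms(residual)       # noise left over after forming the beam
print(single_mic_noise, beam_noise)
```

The voice comes through at full level while the leftover noise is lower than what a single mic hears. Sound arriving from another direction would need a different delay, so it would not line up and would be weakened in the same way.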

A phase inversion technique requires an omnidirectional mic to monitor background noise and another directional mic for the voice (or a polar set-up). The background noise would be phase inverted and added to the voice signal to remove the noise. This is really only a good technique if the noise is ambient, i.e. all around you, e.g. reflected factory noise. I'm not sure this patent makes use of this technique.
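
Here is a small Python sketch of phase inversion, assuming an idealised set-up where the reference mic hears exactly the same noise as the voice mic and none of the voice. The signal shapes, levels and names (cancel, rms) are made up for illustration. The second case delays the noise by one sample at the reference mic, to show how quickly things fall apart once the two mics no longer hear identical noise.

```python
import math
import random


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def cancel(voice_mic, reference_mic):
    """Invert the reference signal and add it to the voice-mic signal."""
    return [v + (-r) for v, r in zip(voice_mic, reference_mic)]


random.seed(1)
N = 2000
voice = [0.5 * math.sin(2 * math.pi * 0.02 * i) for i in range(N)]
noise = [random.gauss(0, 0.3) for _ in range(N)]
voice_mic = [v + n for v, n in zip(voice, noise)]

# Ideal case (assumption): reference mic hears identical noise and no voice.
cleaned = cancel(voice_mic, noise)
residual_ideal = rms([c - v for c, v in zip(cleaned, voice)])

# Mismatch case (assumption): the noise reaches the reference mic one sample late.
late_reference = [0.0] + noise[:-1]
cleaned_late = cancel(voice_mic, late_reference)
residual_late = rms([c - v for c, v in zip(cleaned_late, voice)])

print(rms(noise), residual_ideal, residual_late)
```

In the ideal case the leftover noise is essentially zero. With the one-sample mismatch on this (deliberately harsh) random noise, the leftover noise is actually louder than before cancelling.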
 
This technology appears to be phase cancellation, as many have pointed out. Problem is, it rarely works, except in specialised environments. When you fly on a Boeing 747, engine noise is pumped into the cabin, phase-reversed, to lower noise. Same with a pilot's headphones. Here is the problem: if you are cancelling the ambient noise, you are also cancelling the voice itself. I don't understand how this would IMPROVE recognition. The voice would be just as skewed and cancelled as the noise. If they found a way to pick out only sounds between the noise floor and about -30dB, it might work, but that is very difficult.
I have tried to build a noise-cancelling system for my computer using this same idea, with surface-mount mics on the computer hardware and fan. It did help and cut down noise, but if I spoke, my voice would also be phasey.
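
A rough Python sketch of why the voice ends up phasey, assuming the noise-reference mics also pick up some of the voice, a little late. The 60% leak, the 4-sample delay, the two test tones and the name voice_gain are all hypothetical numbers chosen to make the effect obvious. The noise cancels perfectly here, but the leaked voice gets subtracted from the real voice, which cuts some frequencies and boosts others (a comb filter).

```python
import math
import random


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def voice_gain(freq, leak=0.6, delay=4, n=804):
    """Voice level after cancellation divided by voice level before.

    freq is in cycles per sample. leak and delay (assumed values) describe
    how much of the voice reaches the reference mic and how late.
    """
    random.seed(2)
    voice = [math.sin(2 * math.pi * freq * i) for i in range(n)]
    noise = [random.gauss(0, 0.3) for _ in range(n)]
    cleaned, original = [], []
    for i in range(delay, n):
        voice_mic = voice[i] + noise[i]
        reference = noise[i] + leak * voice[i - delay]
        cleaned.append(voice_mic - reference)  # add the inverted reference
        original.append(voice[i])
    return rms(cleaned) / rms(original)


gain_cut = voice_gain(0.25)     # leaked copy arrives in phase: voice is cut
gain_boost = voice_gain(0.125)  # leaked copy arrives inverted: voice is boosted
print(gain_cut, gain_boost)
```

With these numbers one tone drops to about 0.4 of its level and the other rises to about 1.6, even though the noise itself is removed completely.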
Unless Apple has somehow surpassed the work of great audio engineers over the last 100 years (this idea has been around since the discovery of electricity), I don't think it would work.
In a specially designed room, with $2000 minimum altered frequency mics, you will get about a 50% improvement in ambient noise. Not really worth it....
 
Originally posted by BaghdadBob
Wouldn't a bluetooth headset be easier?

I don't think Apple wants to require you to buy additional hardware to use this...

Anyway, Bluetooth really is limited as far as voice quality is concerned. Bluetooth headsets for cell phones are just fine, but if you want higher quality sound Bluetooth just isn't going to have the bandwidth for it.
 
I hope they take this technology and develop better speech recognition software. Something that actually works, and is practical.
 
Originally posted by e-coli
I've yet to see a computer voice recognition program that works well. Or at least works well enough to ditch the mouse.

That's why the guy above mentioned lip reading. Far more accurate than sound wave decomposition.
 