iSight and Microphone Patent

scan300 · Jun 29, 2003

If you read the patent application, noise suppression is only part of the picture. In earlier versions of speech recognition the computer would isolate the spectral pattern of the voice and use that pattern for analysis. The problem was that when a person moved their head, their spectral pattern would change.

The approach in this patent uses a phoneme database at the front end of a speech recognition engine connected to the monitor (Acoustic Model Selector, ACM). The ACM selects the sounds (after noise suppression is performed) which best match regular speech sounds (vowels and consonants). They are passed to a back-end process which matches the phonemes to a language and then works out the command.

The combination of multiple mics and software algorithms helps in 'beam formation' of the voice, i.e. selecting the best signal-to-noise ratio, which is then used to eliminate noise. The characteristics of the set-up (mic placement, type, axis, etc.) are stored in software and can be used by the speech recognition software to account for spectral characteristics. There also seem to be a few combinations of best mic placement, as well as a 4-mic option, depending on the display characteristics.

Beam formation is different to phase inversion as a noise reduction technique. My version of how this works is that the best signals from both highly directional mics are matched to form a beam, while other signals, which are weaker, are suppressed.
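
To make that a bit more concrete, here is a rough sketch of the simplest textbook form of beam formation (delay-and-sum). I am assuming that form for illustration only; the patent may well do something different. The sine-wave `voice`, the 3-sample `DELAY` and the noise levels are all made up. The idea is that once the two mic signals are lined up on the voice, averaging them keeps the voice at full strength while the unrelated noise on each mic partly averages out.

```python
import math
import random

def voice(t):
    """Hypothetical voice signal: a plain sine wave stands in for speech."""
    return math.sin(2 * math.pi * 0.01 * t)

def delay_and_sum(mic_a, mic_b, delay_b):
    """Shift mic_b earlier by delay_b samples, then average it with mic_a."""
    n = len(mic_a) - delay_b
    return [(mic_a[i] + mic_b[i + delay_b]) / 2 for i in range(n)]

def power(signal):
    """Mean squared value of a signal."""
    return sum(x * x for x in signal) / len(signal)

random.seed(0)
N = 4000
DELAY = 3  # assumed: the voice reaches mic B 3 samples later than mic A

# Each mic hears the voice plus its own independent noise.
mic_a = [voice(t) + random.gauss(0, 1) for t in range(N)]
mic_b = [voice(t - DELAY) + random.gauss(0, 1) for t in range(N)]

beam = delay_and_sum(mic_a, mic_b, DELAY)

# Compare the leftover noise with and without the second mic.
noise_before = power([mic_a[t] - voice(t) for t in range(N)])
noise_after = power([beam[t] - voice(t) for t in range(len(beam))])
print(noise_before, noise_after)
```

With two mics and unrelated noise on each, the leftover noise power should come out at roughly half of what a single mic picks up. A sound arriving from another direction would have a different delay, so it would not line up and would be weakened in the same way.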

A phase inversion technique requires an omnidirectional mic to monitor background noise and another directional mic for the voice (or a polar set-up). The background noise would be phase inverted and added to the voice signal to remove the noise. This is really only a good technique if the noise is ambient, i.e. all around you, e.g. reflected factory noise. I'm not sure this patent makes use of this technique.
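
As a minimal sketch of phase inversion in the ideal case: I am assuming the omni mic hears only the ambient noise and that the noise arrives identically at both mics, which is the best case and rarely true in a real room. The signals and the names `hum`, `voice_mic` and `reference_mic` are invented for the example.

```python
import math

def cancel(voice_mic, reference_mic):
    """Add the phase-inverted reference signal to the voice mic signal."""
    return [v + (-r) for v, r in zip(voice_mic, reference_mic)]

N = 1000
voice = [math.sin(2 * math.pi * 0.01 * t) for t in range(N)]      # stand-in for speech
hum = [0.5 * math.sin(2 * math.pi * 0.12 * t) for t in range(N)]  # ambient noise

voice_mic = [v + h for v, h in zip(voice, hum)]  # directional mic: voice plus noise
reference_mic = hum                              # omni mic: assumed to hear noise only

cleaned = cancel(voice_mic, reference_mic)

# Largest difference between the cleaned signal and the original voice.
worst_error = max(abs(c - v) for c, v in zip(cleaned, voice))
print(worst_error)
```

Under those assumptions the noise drops out almost exactly and the voice is left untouched. The result gets worse as soon as the two mics stop hearing the same noise.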

Steak · Jun 30, 2003

This technology appears to be phase cancellation, as many have pointed out. Problem is, it rarely works, except in specialised environments. When you fly on a Boeing 747, engine noise is pumped into the cabin, reversed phase, to lower noise. Same with a pilot's headphones. Here is the problem: if you are cancelling the ambient noise, you are also cancelling the voice itself. I don't understand how this would IMPROVE recognition. The voice would be just as skewed and cancelled as the noise. If they found a way to only pick out sounds between the noise floor and about -30dB, it might work, but that is very difficult.
I have tried to build a noise cancelling system for my computer, using this same idea, along with surface-mount mics on computer hardware and fan. It did help, and cut down noise, but if I spoke, my voice would also be phasey.
Unless Apple somehow passed up the study of great audio engineers for the last 100 years (this idea has been around since the discovery of electricity) I don't think it would work.
In a specially designed room, with $2000 minimum altered frequency mics, you will get about a 50% improvement in ambient noise. Not really worth it....
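
A quick sketch of the voice-cancelling problem, under simplifying assumptions: the noise-monitoring mic also picks up some fraction of the voice (`LEAK`, a made-up figure of 0.6), and everything arrives perfectly in phase. Real rooms add delays and reflections, which is where the phasey sound would come from; this only shows the loss in level.

```python
import math

def cancel(voice_mic, reference_mic):
    """Subtract the reference signal (same as adding its phase-inverted copy)."""
    return [v - r for v, r in zip(voice_mic, reference_mic)]

def rms(signal):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))

N = 1000
LEAK = 0.6  # hypothetical fraction of the voice reaching the noise mic

voice = [math.sin(2 * math.pi * 0.01 * t) for t in range(N)]      # stand-in for speech
hum = [0.5 * math.sin(2 * math.pi * 0.12 * t) for t in range(N)]  # ambient noise

voice_mic = [v + h for v, h in zip(voice, hum)]
reference_mic = [LEAK * v + h for v, h in zip(voice, hum)]  # noise mic hears voice too

cleaned = cancel(voice_mic, reference_mic)

# Fraction of the original voice level that survives the cancellation.
voice_kept = rms(cleaned) / rms(voice)
print(voice_kept)
```

Here the noise is removed, but only about 40% of the voice level survives, because whatever voice reaches the noise mic is subtracted along with the noise.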

iLilana · Jun 30, 2003

I was going to say something brilliant but someone else already did.

frinky23 · Jun 30, 2003

Originally posted by BaghdadBob
Wouldn't a bluetooth headset be easier?

I don't think Apple wants to require you to buy additional hardware to use this...

Anyway, Bluetooth really is limited as far as voice quality is concerned. Bluetooth headsets for cell phones are just fine, but if you want higher quality sound Bluetooth just isn't going to have the bandwidth for it.

ouketii · Jun 30, 2003

hehe Apple makes something, then patents the crap out of it... good stuff.

Pete_Hoover · Jun 30, 2003

I hope they take this technology and develop better speech recognition software. Something that actually works, and is practical.

ClimbingTheLog · Jul 3, 2003

Originally posted by e-coli
I've yet to see a computer voice recognition program that works well. Or at least works well enough to ditch the mouse.

That's why the guy above mentioned lip reading. Far more accurate than sound wave decomposition.
