Thousands of Amazon Employees Listen to Alexa Requests for Improvement Purposes [Updated]

How did you pretzel jump from companies listening to snippets to improve voice recognition software to a what if paranoid scenario that you wrote? Amazing if you, and others, really believe that’s the future.
Because many years ago I worked with direct mail (junk mail) marketing firms. The amount of information about a person that could be determined by cross-checking data was astonishing even back then. There isn't one database; there are thousands, probably millions by now. Some are public information and will have your name, address, and marital status freely available. Others are company-based and you have to buy access. From those you can frequently get income, job history, education, and where you have lived and when.

For me that was 30 years ago. My wife is still in that business. The amount of data collected and collated on Americans is staggering. And small details can lead to new discoveries by businesses. Maybe all it means is you get more targeted ads. Maybe it means you get put on a suckers list and are hit by questionable charities and causes. Those could and did happen 30 years ago. With people not caring about their own privacy it happens more now.
 
When you buy an Echo the terms are clear: anything you say after triggering the wake word is recorded, and how long that recording lasts depends on how long you engage with the device. When you log into the Alexa app it shows you everything that was recorded and sent to Amazon's servers; you can even play back each recording yourself.

Right or wrong it's up to the user to decide whether they are comfortable with that.

Is Amazon covertly recording everything else you say? I don't believe they are. But I do believe you could leave it active for longer than you think and it will record it all.

Do you believe that despite anything you are told they are recording everything and that even if you delete the recordings (which you can) that they keep them anyway? Well, you're not going to buy one, are you?

Those who use these devices want them to work more accurately. The companies achieve that by listening to the requests the assistant couldn't understand and adjusting the AI. That is the price you are going to have to pay for improvements to voice-operated systems.
 
At Siri headquarters, two besuited Apple agents sit opposite each other, headphones on, smoking Gauloises cigarettes. They are listening intently, scribbling on notepads while a reel-to-reel tape broadcasts the intimate details of my daily life.

‘Set a timer for five minutes’
‘Set a timer for ten minutes’
‘Set a timer for ten minutes’
‘Set a timer for three minutes’
‘Set a timer for eight minutes’
‘Set a timer for five minutes’
‘Set a timer for eleven minutes’
‘Set a timer for five minutes’
‘Set a timer for fifty minutes’
‘No! I said fifty minutes. Fifty. Not fifteen, Fifty! Five zero!’
‘Set a timer for five minutes’
‘Set a timer for nine minutes’
‘Set a timer for…

Hell in 2019?
 
You guys need to put the tin foil hats away. Or don’t. The comments are pure gold! lol
In this case we have to include tin foil underwear as well :D
These devices need to passively listen ALL the time in order to recognize a spoken keyword. I would assume that the processing of said audio input is first done on the device (to recognize the keyword, such as "Hey Siri," "Hey Google" and so on), and then if the keyword is recognized, the device goes into active listening and transmission mode, meaning that whatever you speak from that point on is transmitted over the Web for analysis to the speech recognition server. I wouldn't think that every single device constantly feeds audio data to the speech recognition server—this would require considerably more bandwidth and add significantly to the server load. That's why I'm a bit mystified as to how the human analysis team at the manufacturer gets all this audio data that is not prefixed with the "Hey [device]" recognition keyword. My understanding could be totally wrong, but the MacRumors article posted this time seems to imply that the human analysis team indeed has access to a live and active audio feed on these devices. Either that, or the device immediately starts sending data for speech recognition to the server whenever the ambient audio input exceeds a certain threshold... which seems kind of meaningless, since so much of the ambient audio that occurs in a room is non-verbal (moving chairs, closing doors, opening windows, listening to music, etc.).
... the device could simply not stop transmitting after the request is "fulfilled", or maybe after it has failed?
... the device could be activated by the "company", randomly or intentionally?
... the device could be activated, as you mentioned above, by a false-positive trigger word
 
Anyone really surprised by this? So gullible you people.
Not only do they listen, but what's scarier is that they store it, break it down with data analytics, and then store it again forever, tagged with your identity.
The processing power available to both AWS and MS Azure (and the technology is evolving rapidly) is almost beyond belief with regard to big data and analysis. Your entire lives are being logged, catalogued, and stored until needed by those who need to control you.
 
I’m seeing several comments saying that Apple users don’t have a setting to opt out of this stuff.

Does the “Share [device] analytics” count as Apple’s version of opt-in/out when it comes to their listening to Siri requests? Does every Apple device serve up that option when it’s bought new, and is that option still available in Settings on their devices?

Also, my understanding is that Siri doesn’t transmit data until a “Hey Siri” (or equivalent command) triggers it to pay attention and receives the user’s input. So a request isn’t sent until the user “consents” to its being sent by issuing the “hey Siri” command.

If all of this is true—and I’m not sure that it is?—it seems that Apple is providing opt-in/out to the whole “apple employees sitting around listening to anonymous recordings” scenario. They just don’t make it easy to understand.

(This doesn’t address the times that Siri is triggered accidentally, and it assumes that everything works as advertised. For all I know, a HomePod could be transmitting five minutes of audio from before and after the Siri command.)
 
Oh so this is a feature? Interesting.

Sorry, but an account number and serial number are enough to track down where the recording came from. No one is saying they are storing huge audio files in some underground server farm. The fact that the employees are able to listen to the recordings and match them with who they came from is the problem. I understand the need for them to listen to the recordings. Google and Apple randomize this data. Clearly Amazon does not, as seen below.

"According to Bloomberg, recordings sent to employees who work on Alexa don't include a user's full name or address, but an account number, first name, and the device's serial number are associated with the recording."

All you have done is make an assumption, not state a fact. You have assumed that an Amazon worker can personally identify you, out of everyone in the world, with your first name, a serial number and an account number. You are assuming they have access to Amazon's user database. You have no facts to prove that.
 
Incredibly unacceptable. What the hell is wrong with these large tech companies completely trampling on customer privacy?
What is with it? This is what they are. This is their primary focus, primary goal, primary purpose for existing. Everything else is ancillary.
 
Why isn't there a group of Amazon employees that use the devices for this testing purpose instead of listening in on real people? I understand going through the data is helpful, but I don't understand why they get a free pass to be lazy, opting for private data instead of creating the test data themselves.
 
All you have done is make an assumption, not state a fact. You have assumed that an Amazon worker can personally identify you, out of everyone in the world, with your first name, a serial number and an account number. You are assuming they have access to Amazon's user database. You have no facts to prove that.
"Two workers told Bloomberg that they've heard recordings that are upsetting or potentially criminal, and while Amazon claims to have procedures in place for such occurrences, some employees have been told it's not the company's job to interfere."

If they can't narrow it down to specific people how would they have the ability to interfere?
 
I have mixed feelings about it. Privacy is important, but on the other hand Siri works so badly because of Apple's privacy policy.

You have mixed feelings about it?

If they want to improve their voice recognition they can hire people to speak requests, or they can use movies or other recordings. Using real customers' recordings (without their knowledge or approval!) is not the way to go.
 
Mmm, yes... but they do need to passively listen for audio input constantly. Therein lies the gray zone. How much of that passive audio input is being transferred to the manufacturer's speech recognition server for processing, and later for human analysis? We have no way of knowing.
Actually you can get a pretty good idea about this. I use a FingBox (I bought mine 1-2 years ago on Kickstarter or Indiegogo but today they are available on Amazon). FingBox plugs into your router and monitors traffic to and from everything on your network (not the content but the amount of data moving between each device and your router). I used my FingBox to watch my Amazon Echos. When they are not active (they have not heard their activation word) the devices will pass a very small amount of data (maybe 10-15 Kb over the course of 10 minutes) to and from the router. This is not nearly enough to be an audio stream. However, as soon as I say the activation word "Alexa" and the blue ring lights up, I can watch a spike in data being transferred to and from the Echo unit and my router. As soon as my question has been answered the data stream drops back down to a couple Kb every few minutes.

I don't know what the content of those few Kb are but they are not enough for audio streaming. It may be some type of continual connection check but I can't say for sure what it is.
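
To put rough numbers on "not nearly enough to be an audio stream", here is a back-of-the-envelope sketch. The 8 kbit/s figure is only an assumption standing in for a low-bitrate speech stream; I have no idea what an Echo actually uses, and since "Kb" above could mean kilobits or kilobytes, the sketch checks both readings.

```python
# Back-of-the-envelope check. The speech bitrate is an ASSUMPTION for
# illustration, not a measurement of what any Echo actually sends.
IDLE_WINDOW_SECONDS = 10 * 60     # the 10-minute idle window from the post
IDLE_AMOUNT = 15                  # upper end of the observed "10-15 Kb"
ASSUMED_SPEECH_KBIT_PER_S = 8.0   # hypothetical low-bitrate speech stream


def idle_rate_kbit_per_s(amount: float, unit_is_bytes: bool) -> float:
    """Average idle rate in kbit/s, reading "Kb" as kilobytes or kilobits."""
    kilobits = amount * 8 if unit_is_bytes else amount
    return kilobits / IDLE_WINDOW_SECONDS


def shortfall_factor(amount: float, unit_is_bytes: bool) -> float:
    """How many times more data the assumed speech stream would need."""
    return ASSUMED_SPEECH_KBIT_PER_S / idle_rate_kbit_per_s(amount, unit_is_bytes)


if __name__ == "__main__":
    for unit_is_bytes in (False, True):
        unit = "kilobytes" if unit_is_bytes else "kilobits"
        rate = idle_rate_kbit_per_s(IDLE_AMOUNT, unit_is_bytes)
        factor = shortfall_factor(IDLE_AMOUNT, unit_is_bytes)
        print(f"{unit}: idle rate {rate:.3f} kbit/s, stream needs ~{factor:.0f}x more")
```

Under those assumptions the idle traffic falls short of a continuous speech stream by a factor of roughly 320 if "Kb" means kilobits, or about 40 if it means kilobytes. Either way it supports the point above, though it says nothing about short bursts or about what the idle packets contain.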
 
Seems hypocritical that the average Joes have no problem giving up their private identifying fingerprint and 3D face mapping but have an issue with people knowing their cooking recipe searches. Fear mongering is for the clueless to lap up.
Your fingerprints and Face ID data is not stored on your iOS device. I don't see why you are trying to conflate the two.
 
Really? A lot of people tend to believe the opposite of what you said. (Just reverse the names of the two companies)

Go on any Android forum and you’ll see how everyone prefers Amazon or Google over the HomePod. The HomePod has inferior sound, first off. Second, people want convenience, not sound quality. The HomePod is meant to be a music speaker you speak to to play music. Whaaaaaat? The sound quality on a phone, TV, laptop is more than enough. Who needs speakers???
 
I do like Amazon's response to this... They completely tiptoed around the issue. It's not whether the device is always on; we know that it's only the trigger word that 'wakes' it up, and many know that.

Instead, it's whether 'the company' stores and listens to this info. They never said anything on this, which proves it could be true. How is it a secret to just say "We don't listen"? That's not violating anything.
 
There are going to be some users who are willing to let the manufacturer listen to recordings to improve accuracy and some who aren't. The ethical thing to do is to make the choice plain as day during setup and, from time to time, to remind the user of their choice and give them the option to change it.

I like Apple’s implementation of this when it comes to transcriptions of voicemails. On a voicemail by voicemail basis you decide if you want to share the audio with Apple. I’d like to see them implement this with Siri. That way if you have a bad interaction with Siri and want to push that to Apple to improve accuracy you could use your iPhone or iPad to send just the latest voice interaction and provide a method to do the same with the HomePod. That’s the best of both worlds.

Will it be abused? If you say no I’ve got a bridge in New York City to sell you. My default assumption with any IoT device is that if there’s a microphone somebody will hear something I didn’t want them to. The same goes with the camera. My parents used to tell me the old adage ”if you don’t want someone to see it then don’t write it down”. It has never been more relevant than it is today in the IoT age. If you don’t want someone to see you, record you and/or listen in on you then go IoT free in your home and have a nice Faraday cage for your smart phone and iPad when you get home and aren’t using them. Now excuse me while I go put on my tin foil hat. LOL!!!
 
Why isn't there a group of Amazon employees that use the devices for this testing purpose instead of listening in on real people? I understand going through the data is helpful, but I don't understand why they get a free pass to be lazy, opting for private data instead of creating the test data themselves.
This was my question as well. Why use customers without their express permission?
 
A much better and more sensible write-up of this story...

http://www.pocket-lint.com/smart-home/news/amazon/147740-sometimes-people-at-amazon-listen-to-what-you-tell-alexa

All staff who listen to clips have to sign NDAs. You didn't read that in here, did you? Oh, and if a clip picks up sensitive voices etc., they have a box called 'critical data' to click so it's tagged as sensitive.

Seems in America it's a national outrage; elsewhere no one cares and the full story is disclosed. And no, it is not sitting there recording every single thing you say unless you use the wake word.
"Two workers told Bloomberg that they've heard recordings that are upsetting or potentially criminal, and while Amazon claims to have procedures in place for such occurrences, some employees have been told it's not the company's job to interfere."

If they can't narrow it down to specific people how would they have the ability to interfere?

See the better story I posted. People need to read a story like this in other sites, not an Apple fan site using selected paragraphs....
Actually you can get a pretty good idea about this. I use a FingBox (I bought mine 1-2 years ago on Kickstarter or Indiegogo but today they are available on Amazon). FingBox plugs into your router and monitors traffic to and from everything on your network (not the content but the amount of data moving between each device and your router). I used my FingBox to watch my Amazon Echos. When they are not active (they have not heard their activation word) the devices will pass a very small amount of data (maybe 10-15 Kb over the course of 10 minutes) to and from the router. This is not nearly enough to be an audio stream. However, as soon as I say the activation word "Alexa" and the blue ring lights up, I can watch a spike in data being transferred to and from the Echo unit and my router. As soon as my question has been answered the data stream drops back down to a couple Kb every few minutes.

I don't know what the content of those few Kb are but they are not enough for audio streaming. It may be some type of continual connection check but I can't say for sure what it is.

I can confirm this with my Google WiFi which, though not as advanced as your kit, still shows each connected device's activity in the app, in the form of data size going up and down. And yep, with my Echo it's pretty much nothing until you use it. As Amazon repeatedly states, despite people's attempts to claim otherwise...
 
Your fingerprints and Face ID data is not stored on your iOS device. I don't see why you are trying to conflate the two.

I hope this was posted in jest. Your Touch ID and Face ID data are stored in the Secure Enclave on your iOS device, and they never leave it unless someone gets physical access to the device and knows how to hack the enclave. That's why when you set up any iOS device you have to jump through the Touch ID or Face ID setup routine. If this data were stored in the cloud, Touch ID or Face ID would just work as soon as you signed into a device.

https://www.howtogeek.com/350676/how-secure-are-face-id-and-touch-id/
 
I know it's stored in the Secure Enclave. What I meant was that it's stored in encrypted form, so even if someone managed to hack into the Secure Enclave, they wouldn't have my fingerprint or face info. Just a bunch of 1s and 0s that are useless to them since they won't have the accompanying security key.

As opposed to some companies who actually stored them as jpg files unsecured on their smartphones. Can you imagine?
 