Given how choosy they can be about content on the App Stores it's kind of funny to imagine how much profanity must be on Apple's servers via Siri...
 
Why is Apple always in the news for privacy when other companies (Google) have shadier dealings and rarely get called out? :-/
 
We all knew this processing wasn't being done "on board" the phone. So of course it is being uploaded to Apple and processed there. Does anyone truly believe Apple is sorting through hundreds of millions of voice clips to determine where any one of us is going for lunch tomorrow? I can't imagine anyone who is working with anything critically classified is going to be using Siri to research or document it anyway. It's a useful tool, but it isn't being used to digitize the formula for Coke or KFC's secret recipe.
 
... voice clips would be kept for a "period of time" even if a user deactivated Siri on his or her device.

Well if the clips are truly disassociated, how would they know what files to delete?
 
My only question is why, especially since it's anonymized - what benefit is there to Apple?

They can have human ears listen to what you said and then see what Siri returned.

So when I say play song "Back in Black" and Siri says "Now playing all songs", they can figure out why that happened and try to improve the speech recognition.

My success rate with Siri and playing the right song is about 50%. :(
 
I don't see much harm if they are in fact doing what they said. It is also extremely likely Samsung and Google are doing something similar with their respective services. The value in having these files, even disassociated, is that you can start analyzing trends in usage as well as in the commands/requests made. It can guide a company on where to improve its product, add new features, or highlight features people don't seem to be using.

I'm definitely someone who values privacy, and I have also worked in software for 20 years and know exactly how customer data is usually handled on the backend. I really don't have a problem with this. And, honestly, I have no problem with Google's data collection per se. What I dislike is advertising, which is their business.

Google does exactly the same thing. As does any similar system. They have to do it, because speech recognition is based upon detection of statistically significant patterns within the signal. The more data you have, the better accuracy you can get out of the system. My source: had a lengthy discussion about it with a Google Voice Search team member.


I could be wrong - but I think I read that Android/Samsung devices do the translation on the phone (i.e. you don't need a data connection). So your voice may or may NOT be sent to a data center. The data request/results could be going between the phone and the data center, but not necessarily your recorded voice. Someone can correct me if I'm wrong.
 
Given how choosy they can be about content on the App Stores it's kind of funny to imagine how much profanity must be on Apple's servers via Siri...

When I asked Siri where I could get a good old (f-word), she opened Maps, but the result was not really what I expected. A gas station???

Let me add that my wife was present when I made the enquiry; we were testing the Siri functionality on her iPad mini.
 
Why is Apple always in the news for privacy when other companies (Google) have shadier dealings and rarely get called out? :-/

Headlines containing the string "Apple" draw more clicks which equals more advertising dollars. There was just that story about how some local cable company was injecting their own advertising into the web sites their clients visited and that somehow turned into the headline "How a banner ad for H&R Block appeared on apple.com—without Apple’s OK". The story had nothing to do with Apple at all.
 
When I asked Siri where I could get a good old (f-word), she opened Maps, but the result was not really what I expected. A gas station???

Let me add that my wife was present when I made the enquiry; we were testing the Siri functionality on her iPad mini.

Was it a truck stop gas station? Siri might have been right on ... :p
 
So Apple has random clips of a voice saying "tell me a joke" and in exchange Siri gets better and better... sounds good to me.
 
So what?

Is there any value, privacy-wise, to "honey, I will be late"?

My only question is why, especially since it's anonymized - what benefit is there to Apple?

If you wrote speech recognition engines, wouldn't you want a huge library of samples to improve your product over time? Think of the immense value of looking up transcriptions that were clearly wrong (nonsensical words), being able to listen to the original audio, and tweaking your engine to recognize it correctly.
 
I could be wrong - but I think I read that Android/Samsung devices do the translation on the phone (i.e. you don't need a data connection). So your voice may or may NOT be sent to a data center. The data request/results could be going between the phone and the data center, but not necessarily your recorded voice. Someone can correct me if I'm wrong.

Android has offline voice typing functionality, which is based on the patterns mined by their servers. I have no idea how its accuracy compares to the online service. And I am sure that the search system does not use this offline recognizer.
 
I could be wrong - but I think I read that Android/Samsung devices do the translation on the phone (i.e. you don't need a data connection). So your voice may or may NOT be sent to a data center. The data request/results could be going between the phone and the data center, but not necessarily your recorded voice. Someone can correct me if I'm wrong.

Google's voice recognition is done in the cloud most of the time. There is the ability to download voice recognition for offline use, but it's not as accurate (though surprisingly good considering how small it is: 22 megabytes for English).

Samsung S Voice always requires a net connection.



Michael
 
There will be two kinds of responses on this thread.

The first will be "big deal". Those are the people who love Apple's walled garden, who don't mind the restrictions and the DRM. They may feel that if you have done nothing wrong you have nothing to fear. They probably use a lot of social media without much thought to privacy.

The second group will be those who avoid Siri and think outside of the walled garden. To them Apple's '1984' ad has now come true in reverse. They either do not use social media or use it with caution. They may also avoid storing or transferring data in the cloud.

By page four the two sides will be arguing.
 
To be honest I'm quite surprised that Siri requires the voice to be sent for analysis, that Apple even stores the voices at all (no doubt in addition to the text translations), that they store them for so long and that they link multiple requests with a unique identifier.

This all goes to show that whatever we think technology companies are keeping on us they're probably keeping much more. Sure they say that the unique identifier is not your AppleID or email address but did they say that it can't be linked to that if required? I'd be very surprised if it can't, since that unique number is probably produced using an algorithm based on our AppleID and/or other user identifiable information.

It's not like it would make me stop using Siri, but at the same time it's surprising that they store information on us for so long (at least six months with a unique identifier).

It also goes against the notion that older phones like the iPhone 4 aren't powerful enough for Siri, if all it's doing is recording and transmitting voice. More likely Apple stretched the truth and used it as an excuse to encourage people to upgrade.
 
Would there be value to the head of IBM saying "Make appointment first thing tomorrow to sell all of my stock in IBM"?

Nope. First the data is anonymous. So it could be anyone saying it. Second, if someone somehow filtered through the petabytes of audio data, was able to determine a specific user was the CEO of IBM, and then found this request in real time...then it could be of value.

But that would require breaking into Apple's Siri servers and analyzing all the data. Even with a supercomputer it would take weeks just to analyze all of it... and that wouldn't even include actually finding something, as you would have to know what you are looking for.

The real value would be the fact that his data is now saved in his Google Calendar, which is tied to his account and would be much easier to gain access to.
 
There will be two kinds of responses on this thread.

The first will be "big deal". Those are the people who love Apple's walled garden, who don't mind the restrictions and the DRM. They may feel that if you have done nothing wrong you have nothing to fear. They probably use a lot of social media without much thought to privacy.


And about half of this group will be people who "will never use Android or Google products because of the invasion of privacy".
 