You had me going until

Absolutely. It's the same with translation.

Old-school algorithms tried to substitute words and did some basic grammar correction, but the results weren't great and it's awfully difficult to do it well.

The new way to do this is with Big Data - for example, Google Translate now works by searching through enormous datasets of documents and recognised translations to statistically determine the best translation. These results end up being a lot better, and can easily adapt and improve with more and better data.

Voice recognition works in the same way. You're trying to match a variable sample (the user's spoken words) to a specific action. It needs a bank of samples and correct actions to try and match against (and the larger the better).

I'm less concerned with Apple doing this than Google doing this - Apple's business case for doing this stops at making the product better. They don't try and extract any additional revenue from that information itself. I'm not saying Google necessarily do that; but they would have much more of a business case.
(emphasis mine)

I was with you until that last statement. You have no idea why Apple is doing this; in fact, none of us knows. Your attribution of altruism to Apple is naive at best. You do yourself a disservice by mixing articulate ideas in the same post with what amounts to fanboy hopefulness. Apple could be operating exactly as you speculate, or they could, to varying degrees, be doing something nefarious. We just don't know.

There is one thing I would like answered. I don't have an iPhone but my wife does and we share an iPad. From the Wired article:

"Once the voice recording is six months old, Apple “disassociates” your user number from the clip, deleting the number from the voice file. But it keeps these disassociated files for up to 18 more months for testing and product improvement purposes.

“Apple may keep anonymized Siri data for up to two years,” Muller says “If a user turns Siri off, both identifiers are deleted immediately along with any associated data.”


So if I turn off Siri in month 8, how does Apple know which data to delete?

That's not an accusation of malfeasance. I'm genuinely curious.
 
I find it quite hypocritical that many of you are asking "so what's the big deal if they don't store user info along with the recording", while lots of discussions here condemn Google when they do the same.

Having said that, finally found some use for Siri and have some fun with it! :)

There's nothing hypocritical about a group of people holding diverse, or even divergent, opinions on a subject.

Hypocrisy is when a single person holds divergent opinions on a subject. Even that can be tricky to judge if there are crucial details that diverge between the two instances.
 
For those wondering what Google does.
Personalized Voice Recognition and your privacy
Google takes your privacy seriously. Information that you provide to Google often helps improve product performance. However, some people value greater privacy over performance. Therefore, personalized speech recognition is an opt-in service that you can choose not to use if you prefer.

To opt in, go to Settings > Language & input > Voice Search, then check the box next to Personalized Recognition.

What is the benefit of personalized voice recognition?

Knowing what you said in the past allows Google to build specialized models that match your voice and your words. Over time, this allows Google to improve the speech recognition accuracy for you. To do that, Google keeps samples of what you said in the past.

Why does Google link you to your voice recordings?

Normally, all saved speech samples remain anonymous. In other words, Google stores millions of voice recordings with no way of telling who was speaking. When you sign up for personalized speech recognition, Google creates an electronic key that links your speech samples with your Google Account. Google uses this key to access voice samples and improve recognition of your speaking voice.

How is your data protected?

The electronic keys that link your samples to your account are designed to be accessed by machines, not people. Very few people within Google can access these keys, and only after strict vetting. Your customized voice recognition models are binary files also designed for use by machines.

What if you change your mind?

To opt out of personalized voice recognition on your Android device, go to Settings > Language & input > Voice Search, then uncheck the box next to Personalized Recognition.

Source
 
Most of what you are worried about is already available in the cloud (they already host some of your email and your entire device iCloud backup). Besides, storing a raw voice clip with no identifier to you is not an efficient way to snoop on somebody. If they reversed or broke the unique identifier to ID you, they would be liable for breaking the agreement on anonymized data. After 6 months there isn't even a key to link clips to the same user. If they didn't care about the agreement, they would just grab your SMS, emails and calendar directly.
There needs to be a balance of allowing companies to collect research data, and protecting users. Given the nature of voice recognition services, this seems to be a fair arrangement.

It's being used as test data. Any connection that could identify the source person from the voice clip is removed.

Every software company retains real data for future development / enhancing the products.

Nothing to see here.

Yes, as long as by "track what was said back to a user" you mean Apple can track data point H1x00Xf11. It's an anonymized data point, nothing more. Just like Apple's ad tracker service, which actually increases user privacy by giving just an anonymized identifier instead of something like a device ID number which can be tracked. They don't and can't track it down to John H. Somebody.

And yes, it's anonymous. They create anonymous profiles per device that aren't associated with your actual physical identity or even your iTunes account. Didn't you read the article? It's not like Apple can say "John Jacob Smith IV from Louisville KY living at 333 Livington Rd asked Siri for directions to Florida." Lol, that's not how it works.

Ok, so it's completely anonymous and can never be tracked back to you or your iPhone? If that's the case, how does the quote below work if they can't link your phone/your ID with your previously recorded voice input data?

From the article:
“If you turn off Siri, Apple will delete your User Data, as well as your recent voice input data,” Apple’s privacy statement reads. “Older voice input data that has been disassociated from you may be retained for a period of time to generally improve Siri and other Apple products and services.”

Just to clarify, I don't care about this or the data Google collects about me. Like written here above there are so many collecting so much data about you that you need to live in a cave if you get paranoid about stuff like that.

I still find it a bit funny that the tin foils who bash on Google and their anonymous data collection are completely fine with Apple's ditto. And how about all of your iCloud e-mails, iCloud calendar bookings and your iMessages? Or how about music preferences when you get Genius lists? You don't think Apple scans through them?

No need to answer the last ones; I guess you think it's good, because Apple can then tailor info and app suggestions etc. to better suit you. It's worse when Google does the same because they customize the ads so you get ads that might interest you and trick you into buying stuff. Google is the devil and Apple is all good...
 
Google takes your privacy seriously

That's why they had code on google.com bypassing Safari's security measures for their benefit. They are a dirty company, and I wouldn't trust anything they say or do.
 
Of course if this was Google I would imagine that it would be a big deal. :confused:

Absolutely

Apple can do no wrong.

Siri works so poorly with its voice recognition I'm not concerned.

Probably 1/2 of what's stored is Siri garble

Google voice on the other hand

Amazing :bow:
 
You guys are confused. If the data does not have a link back to a person, then it is not personal information and it does not fall under any "privacy" policies.

Privacy only applies to information which can be used to identify a person directly.

If you are doing something illegal, then don't search or dictate anything about it with Siri. Apple is taking all the measures legally required to disassociate the voice clips from your account, but they are not responsible for the contents of those clips. You are responsible for them.

If you are not comfortable with this, don't use the service.
 
Of course if this was Google I would imagine that it would be a big deal. :confused:

Here's the problem: All these companies, no matter whether they do something that is objectively OK or not, have to put what they do into non-threatening words so that the population stays calm. Whether they are up to no good or not doesn't make a difference. But by doing so, they don't give people the information they need.

"Anonymized" can mean so many things. When you post here as AppleScruff1, I have no idea who you are, where you live, what you look like and so on. So we could say you are anonymous. On the other hand, there are hundreds of posts under the same name, and I assume they are by the same person. So I can look at your posts in the context of other posts, and you are not anonymous. However, if you post on some other site under a different name (say if you are a member of the South Californian Rabbit Breeders, or a member of the Italian Olympic Rowing team, and frequently post on their websites) I wouldn't know, so that again is anonymous.

If what Apple stores is a voice clip with no information associated with it, so if there is no way to know that clip 1 and clip 2 come from the same person, that would be fine with me. From what I hear about Google, they do connect pieces about the same person. If they know ten thousand pieces of information that all belong to the same person, you, and the only thing they don't have is your name and address, that's not anonymous in a practical sense.
 
This possibility has always worried me, and is why I don't use Siri that often. But two years? What could Apple need with our voice clips for two years? Maybe to tune the voice recognition algorithms, but it's still a violation of our privacy.
 
If what Apple stores is a voice clip with no information associated with it, so if there is no way to know that clip 1 and clip 2 come from the same person, that would be fine with me. From what I hear about Google, they do connect pieces about the same person. If they know ten thousand pieces of information that all belong to the same person, you, and the only thing they don't have is your name and address, that's not anonymous in a practical sense.

Ok, if it's a fact that Apple can't connect any of your clips with you/your phone, and none of your individual clips with each other, how come they can do this?

From the Wired article:
“If you turn off Siri, Apple will delete your User Data, as well as your recent voice input data,” Apple’s privacy statement reads. “Older voice input data that has been disassociated from you may be retained for a period of time to generally improve Siri and other Apple products and services.”

If they don't know which clips you uploaded, how can they delete them if you disable Siri?

And about Google being open about keeping track of you and your clips: they are also very specific about what you need to do if you want to disable that function and become completely anonymous. I don't think that's the case with Apple; options for the user are not their strong point.
 
<snip>From what I hear about Google, they do connect pieces about the same person. If they know ten thousand pieces of information that all belong to the same person, you, and the only thing they don't have is your name and address, that's not anonymous in a practical sense.


Firstly, I do agree with your post, but just wanted to query the bit I quoted above.

Taking on board the fact that Google is able to link together pieces of information about you, creating a 'digital fingerprint', there are four vital points:

1) It's still secure - it isn't given, sold etc. to anyone.
2) It'd be against Google's business interests to allow anyone to get hold of said information.
3) Google employees don't have free access to data - on the contrary, nobody at Google gets access to the 'profile' of a person - EVER.
4) It's optional. You can opt out of having Google store things such as search history, just as you can disable Siri to remove your data.

People seem to think that ALL of Google's services have an ulterior motive to 'steal' your information, when we know that a vast majority of them exist purely for convenience and to better their offerings. They are painted as some evil empire by most people on MacRumors, and it's pretty crazy that a collective group can reach some really stupid and wild conclusions.
 
Is Siri used for voice recognition? If it is, that means all of the texts and emails you've written by speaking are saved. That's a lot of sensitive information.
 
It's a waste of resources to be arguing about anonymized data. I understand that it might not all be anonymized because the details (names, places, times) are being recorded, but as long as it's not being shared or sold, who the hell cares? Besides, Apple has made it very clear that their intention is to improve Siri functionality while maintaining privacy. I doubt Apple gives three flying f***s about the weather in your area. And all this is just some stupid people who want any opportunity to sue Apple for breach of privacy. If you use an Apple product and agree to submitting diagnostic and usage reports, you're still sharing your info even if you don't use Siri.
 
Well if the clips are truly disassociated, how would they know what files to delete?
The clips are disassociated from any identifier pointing to a "who". Each clip still retains its date stamp.

I want royalties for keeping mechanical sync copies of my voice

where's the RIAA ?! :mad:
Sorry, the RIAA only sues consumers; it doesn't sue on their behalf.
You're right. With all phone calls and emails being monitored by our employers and government; and the telecoms, Google, and Facebook tracking our every move, why would there be specific concern about Siri?
I see Siri as a safety. It can be bad enough sometimes that it wouldn't be admissible in court!

"Find Mace Street".
> Found Mesa Avenue in *town miles away*

"Find Mace Street!"
> Locating NAACP

X(
Google will sell your soul to the devil!

Not /s
After all, everyone knows Google will merely ADVERTISE your soul to the devil. :)
(emphasis mine)

I was with you until that last statement. You have no idea why Apple is doing this; in fact, none of us knows. Your attribution of altruism to Apple is naive at best. You do yourself a disservice by mixing articulate ideas in the same post with what amounts to fanboy hopefulness. Apple could be operating exactly as you speculate, or they could, to varying degrees, be doing something nefarious. We just don't know.

There is one thing I would like answered. I don't have an iPhone but my wife does and we share an iPad. From the Wired article:

"Once the voice recording is six months old, Apple “disassociates” your user number from the clip, deleting the number from the voice file. But it keeps these disassociated files for up to 18 more months for testing and product improvement purposes.

“Apple may keep anonymized Siri data for up to two years,” Muller says “If a user turns Siri off, both identifiers are deleted immediately along with any associated data.”


So if I turn off Siri in month 8, how does Apple know which data to delete?

That's not an accusation of malfeasance. I'm genuinely curious.

After six months, the data isn't associated anymore, therefore not deleted. They only delete associated data, according to that statement.
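To make that reading concrete, here is a minimal sketch of the retention rules as quoted from the Wired article. Everything in it is an assumption for illustration: the `Clip` record, the function names, and the day counts standing in for "six months" and "two years" are hypothetical, and nothing here is known to match Apple's actual implementation.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Assumed day counts for the periods quoted in the article.
DISASSOCIATE_AFTER = timedelta(days=183)  # roughly six months
DELETE_AFTER = timedelta(days=730)        # roughly two years


@dataclass
class Clip:
    recorded: date
    user_number: Optional[str]  # None once the clip has been disassociated


def age_out(clips, today):
    """Strip the user number from clips past six months; drop clips past two years."""
    kept = []
    for clip in clips:
        age = today - clip.recorded
        if age >= DELETE_AFTER:
            continue  # too old, deleted outright
        if age >= DISASSOCIATE_AFTER:
            clip.user_number = None  # the link back to the user is gone
        kept.append(clip)
    return kept


def turn_off_siri(clips, user_number):
    """Delete only the clips that still carry this user number."""
    return [clip for clip in clips if clip.user_number != user_number]
```

Under these assumptions, turning Siri off in month 8 removes the clips from the last six months. The older ones no longer carry the number, so there is nothing to match them against, and they stay until they age out.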
 
That's because with google, it wouldn't be anonymous,

Wrong

and it'd be sold to advertisers with all your online history,

Wrong

and presented to the gov for their legal scrutiny with a friggin bow on it.

And wrong.

impressive, three claims and three wrongs

First of all, Google wouldn't anonymize it. Secondly, they'd be selling that information to anyone willing to pay, for profit.

And that: two claims and two wrongs. Close second place

And now, back to the thread, no big deal.
 
Ok, if it's a fact that Apple can't connect any of your clips with you/your phone, and none of your individual clips with each other, how come they can do this?

From the Wired article:
“If you turn off Siri, Apple will delete your User Data, as well as your recent voice input data,” Apple’s privacy statement reads. “Older voice input data that has been disassociated from you may be retained for a period of time to generally improve Siri and other Apple products and services.”

If they don't know which clips you uploaded, how can they delete them if you disable Siri?

And about Google being open about keeping track of you and your clips: they are also very specific about what you need to do if you want to disable that function and become completely anonymous. I don't think that's the case with Apple; options for the user are not their strong point.
Ok, let me try to explain it in simple terms that you can understand.

When you turn on and use Siri for the first time, you send a voice clip to Siri's servers with a "0" or NULL ID number. That triggers the server at Apple to generate a new unique random ID, which is returned to the phone with the results of your question or request. That number is then stored on your phone and sent with every subsequent request, so each new clip is stored under the same number.

Is that clear to you now?

If you disable Siri, a request with that number is sent to the servers stating that you are disabling Siri which they can use to delete any data that is still associated with that number. Your phone will also forget that number so that if you turn it back on again, you will get a new number.
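The flow described in this post can be sketched in a few lines. This is a toy model of the poster's explanation, not Apple's code: the class name `SiriServerSketch`, its methods, and the use of `secrets.token_hex` for the random ID are all assumptions made for illustration.

```python
import secrets


class SiriServerSketch:
    """Toy stand-in for the server side described above (hypothetical)."""

    def __init__(self):
        self.clips_by_id = {}  # random ID -> list of stored clips

    def handle_request(self, clip, device_id=None):
        # A missing or unknown ID models the "0 or NULL" first request:
        # the server mints a fresh random ID that says nothing about the user.
        if device_id is None or device_id not in self.clips_by_id:
            device_id = secrets.token_hex(8)
            self.clips_by_id[device_id] = []
        self.clips_by_id[device_id].append(clip)
        return device_id  # sent back to the phone along with the answer

    def disable(self, device_id):
        # Delete everything still stored under that ID and report how many
        # clips were removed. The phone would forget the ID at this point.
        return len(self.clips_by_id.pop(device_id, []))
```

A phone would call `handle_request(clip)` once without an ID, keep the returned value, and pass it on every later request. Calling `disable` with that value removes the clips stored under it, and a later request starts over with a new ID.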
 
I agree with those that are saying this isn't an issue, but

I don't understand how it can be totally anonymous

Say I am a first-time user of Siri.

I ask it a question and the Siri back end assigns an anonymous ID to my request.
How does it know where to send the response? It must know the request came from my device.

If it is truly anonymous and is there to help improve the Siri experience for me (better recognise my voice, like mentioned earlier), what happens when I get a new device? Has all this been wasted, and do I need to train it again to recognise me?
 
Google does exactly the same thing. As does any similar system. They have to do it, because speech recognition is based upon detection of statistically significant patterns within the signal. The more data you have, the better accuracy you can get out of the system. My source: had a lengthy discussion about it with a Google Voice Search team member.

such discussion is true for any scientific analysis & software development... this doesn't surprise me, nor scare me about what Siri does with my voice.
 
I can't believe people willingly purchase a product that includes an optional service that collects and transmits your personal data. Your data is your life. Take steps to protect yourself! Don't let Google do this! They're clearly, objectively, irredeemably evil.
:p

You mean like email? Every email you send is up there in the cloud forever. If you don't save it someone else might.
 
If they don't know which clips you uploaded, how can they delete them if you disable Siri?

I obviously don't know the specific implementation, but this is not a smoking gun. There are many ways that this can be done. For example, it would be simple for your phone to generate (or at least be aware of) the unique token. That way, when your phone sends a deactivation request, it just needs to include the token to be deleted. Apple doesn't need to maintain any link between the token and you.
I would assume that this is the same process that is used to keep a consistent token between sessions.
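A rough sketch of that idea, where the phone rather than the server creates the token. The `Phone` class, the message layout, and the `apply` helper are hypothetical names made up for this example; the point is only that the server-side store is keyed by a token and never needs an account.

```python
import uuid


class Phone:
    """Hypothetical client that owns its own random token."""

    def __init__(self):
        self.token = None

    def enable_siri(self):
        self.token = uuid.uuid4().hex  # random, not derived from any account

    def request(self, clip):
        return {"token": self.token, "clip": clip}

    def disable_siri(self):
        message = {"token": self.token, "action": "delete"}
        self.token = None  # re-enabling later would create a new token
        return message


def apply(store, message):
    """Server side: the store maps token -> clips and holds no user identity."""
    if message.get("action") == "delete":
        store.pop(message["token"], None)
    else:
        store.setdefault(message["token"], []).append(message["clip"])
```

Because the same token goes out with every request, the clips stay grouped between sessions, and the deactivation message carries everything the server needs to find them.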
 