It's good to see this spreading and being discussed by the media.

Plus it seems most are not comfortable with this move either.

Maybe screeching majority rather than minority?
Edit: Upon review, the commenter did not state that the comment came from Apple.

I appreciate the first two statements. However, the third is misleading. I recommend finding the original article and reading it a bit more closely. The memo came from the National Center for Missing and Exploited Children, from Marita Rodriguez, who does not work for Apple. She sent it as a memo. It does not reflect Apple’s views and was likely leaked by an Apple employee who doesn’t agree with it.
 
I don’t know if your comment is accurate. The quote you are referencing, “screeching voice,” was not made by Apple. You should go back and re-read the original article, which was also written poorly, in a way that makes it seem as if it came from Apple. But that’s how the media works.

The screeching quote was made by an employee of the National Center for Missing and Exploited Children, Marita Rodriguez. She wrote it as a memo to Apple employees. It's wrong to assume this is Apple's voice or thoughts. The fact that an Apple employee who received it leaked it leads me to believe that it is not their belief.

I could be wrong or misinformed, so please update me, as I too would like to read up on this, but when has Apple said they scanned iCloud Photos in the past? Also, device-side hash comparison is not the same as photo scanning. It's not wrong to make the comparison, but it's also not the same thing. It's a way for Apple to check for illegal hashes without actually looking at the images. A thing called coupons is involved, and once the number of flagged hashes meets the threshold, a human review then verifies the images are actually illegal. A ****** job to have, in my opinion. Also, before the human review part, the function that compares hashes has a 1 in 1 trillion chance of error in one year. That's an incredibly low chance of error. Like, wrap your brain around it. Your odds of winning the lottery are higher, and even with those odds Apple still wanted a human to check first before anything ever went off to any official organization.
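Purely as an illustration of the threshold idea (the hash function, database contents, and threshold value below are made-up placeholders, not Apple's actual NeuralHash, blinded database, or safety-voucher cryptography):

```python
import hashlib

# Made-up stand-ins: Apple uses NeuralHash and encrypted "safety vouchers",
# not a plain SHA-256 lookup against a cleartext list.
KNOWN_BAD_HASHES = {"placeholder_hash_1", "placeholder_hash_2"}
THRESHOLD = 3  # example value only; the real threshold is Apple's to set

def photo_hash(photo_bytes: bytes) -> str:
    """Stand-in for a perceptual hash of a photo."""
    return hashlib.sha256(photo_bytes).hexdigest()

def flagged_count(photos: list[bytes]) -> int:
    """Count photos whose hash matches the known-bad list."""
    return sum(1 for p in photos if photo_hash(p) in KNOWN_BAD_HASHES)

def needs_human_review(photos: list[bytes]) -> bool:
    """No human looks at anything until the match count crosses the threshold."""
    return flagged_count(photos) >= THRESHOLD
```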

Your concerns are warranted and I appreciate your perspective. Please contribute to the discussion. Thank you.
Ok so I apologise, it was not by Apple; however, it seemingly received Apple's approval, as a VP sent it on to their teams at Apple.

I guess it is the people at the top who are deluded, and those actually writing the code etc. are possibly against it.

Apple previously stated they were scanning iCloud. I don't think it was using the same process, but there isn't much difference at the end of the day: if there is a dodgy photo, it is detected. That isn't my problem; it's the on-device part I don't like. If this new Apple system detects a photo, an employee of Apple can then view the photo, a lower-quality version, but still creepy. From my understanding, even before the photo is uploaded! A line is crossed right there.

This article from early 2020 has info about Apple scanning iCloud. The media does not seem to have picked up on this much and is not asking why the change to on-device scanning is needed.

 
I appreciate the first two statements. However, the third is misleading. I recommend finding the original article and reading it a bit more closely. The memo came from the National Center for Missing and Exploited Children, from Marita Rodriguez, who does not work for Apple. She sent it as a memo. It does not reflect Apple’s views and was likely leaked by an Apple employee who doesn’t agree with it.
Why misleading? I did not insinuate that the statement stems from Apple.

Just be glad people notice, and that people express their concerns. I don't think it is going to change Apple's path, but at least it's something.
 
Apple needs to make one of its fancy dancy videos that solidly explains how this is all going to work. Their current roll-out of this is a PR disaster. It sounds creepy & even after reading about how it works, I'm still not enthralled with it.
Just get Billie Eilish to mumble something about it into a mic and the community will accept everything Apple says.
 
This IS a backdoor for censorship of any kind, period. It must not be allowed to be implemented, since the pattern it searches for is easily changed, and then in some countries gays will be reported, or certain religions, or even worse, the opposition...

This must stop before it starts, or I'll have bought my last Apple device for sure! Would be a shame!
Right there with you, sir. Already begun shopping around for new stuff.
 
If you don't like this change, just sign out of iCloud completely. It's not that inconvenient. There are other methods for syncing contacts, calendars and notes. iMessage and FaceTime have separate logins. The App Store as well.

The only inconvenience for me is that I can't use Apple Pay and Find My anymore. Would be nice if you could use those without being signed into the syncing aspect of iCloud.

I was never comfortable syncing everything with iCloud. It's so opaque, and there is no way of seeing what is really stored by the different apps.
Done and done. I use my phone without an Apple ID, and add notes and reminders manually. I only text or use the phone when I have to, and I try to keep it brief. It’s a shame I have to do this for peace of mind, but hey, gotta ride out the phone payment plan somehow…
 
"Super smart trained AI" - I work with state of the art machine learning models, and even the best of them make the occasional dumb mistakes, because ultimately it is a dumb method still far away from human thinking.

The system is looking at the content. The NeuralHash component (your step 2) works on "features of the image instead of the precise values of pixels," ensuring that "perceptually and semantically similar images" get similar fingerprints. Semantically similar, that is, content matching. NeuralHash analyses the image content. If it were only about matching slight modifications, perceptual similarity would be sufficient. NeuralHash does more. Thus the fingerprint is, among other things, a content summary. A lot depends on the detail here, which in turn depends on the undocumented features Apple is looking for and the undocumented weights and thresholds of the system. "Two pink shapes" is more generic than "two nude humans", which is more generic than "two people having sex", which is more generic than "a man having sex with a boy", which is more generic than "a grey-haired man..." and so on. The more detailed this gets, the closer we get to pixel-perfect image comparison. We know Apple does not want that, so some level of genericness is preserved. Step 3 is comparing these image content summaries with the image content summaries from NCMEC.
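NeuralHash itself is not published in detail, so take this with a grain of salt, but a much simpler perceptual hash (the classic average hash) illustrates the general idea of fingerprinting features rather than exact pixel values; small edits barely move the fingerprint:

```python
from PIL import Image
import numpy as np

def average_hash(path: str, size: int = 8) -> int:
    """Toy perceptual hash: shrink to 8x8, grayscale, threshold against the mean.
    Minor crops, re-compression or colour tweaks change only a few bits."""
    img = Image.open(path).convert("L").resize((size, size), Image.LANCZOS)
    pixels = np.asarray(img, dtype=np.float64)
    bits = (pixels > pixels.mean()).flatten()
    return int("".join("1" if b else "0" for b in bits), 2)

def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits; a small distance means perceptually similar."""
    return bin(a ^ b).count("1")

# Hypothetical usage with two versions of the same picture:
# d = hamming_distance(average_hash("original.jpg"), average_hash("cropped.jpg"))
# print("similar" if d <= 5 else "different")
```

Where NeuralHash goes beyond something like this, and how far into "semantic" territory it goes, is exactly the open question.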

True. The unspecified threshold is interesting, though. We know more than one matching picture is needed (so Apple won't do anything if they have one match, even if it is a perfect match, which is peculiar in its own right), but we do not know how many. Ten? Two?

This is disingenuous. I also work with state-of-the-art machine learning models, and agree that dumb mistakes happen, but you are describing NeuralHash contrary to how the document you linked to describes it, as if you have some insider information. Surely, if they are using content matching in terms of matching a certain number of generic features (a la object detection), then what you're saying might apply, but their description of the technology does not describe this in any way whatsoever. It seems the primary purpose of NeuralHash is to be able to detect alterations from the "ground truth" CSAM hash. The language they use does not imply content matching:
semantically similar images have close descriptors in the sense of angular distance or cosine similarity

This sounds more like an NLP technique, in which their embedding network forms something akin to word embeddings (e.g., something like word2vec), where semantic meaning is not in any way tied specifically to the contents of the image (e.g., people having sex), but to "image descriptors" (which seem likely to be vectors of integers) representing the overall look of the image (which would be entirely human-unreadable). From these embedding descriptors, hashes are computed, which can be used for semantic and perceptual similarity. word2vec's famous example can be helpful here to see what they might be doing: using the vectors for each word, the following equation holds true (where equality here is defined as closest match): king-man+woman = queen. It seems what Apple is doing is finding matches in a similar way: if someone takes some CSAM photo and crops or distorts it in some way, it can still be understood as a derivative of the original (e.g., king-man+woman and queen are not identical, but their word vectors are extremely close).
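To make the word2vec analogy concrete, here is the kind of vector arithmetic being described, with tiny hand-picked 3-d vectors standing in for real learned embeddings (which would have hundreds of dimensions):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy "embeddings" chosen by hand so the famous analogy works out.
king  = np.array([0.9, 0.8, 0.1])
man   = np.array([0.5, 0.1, 0.1])
woman = np.array([0.5, 0.1, 0.9])
queen = np.array([0.9, 0.8, 0.9])

derived = king - man + woman
print(cosine_similarity(derived, queen))  # ~1.0: the nearest vector is "queen"
```

The point being that what gets compared is closeness in an abstract descriptor space, not anything a human would read as "content."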

In this way, they can hash extremely close embeddings to the same value using their LSH process; the likelihood of false positives, L, at this level is already incredibly low. Thresholding by n decreases the likelihood exponentially to L^n.

In other words, although a human might see these as semantically and perceptually similar based on content (e.g., a blurred male profile in the background looking to the left, with a male in a white suit in focus with arms raised diagonally), the algorithm can see that these are very different.
[Attached example photo]


First of all: Apple trusts its users so little that it suspects all of them of CSA, and it installs a black box into their personal property to check on them. To Apple, users are potential adversaries, who need to be checked and controlled. Information from Apple to its users must be read with this premise in mind. No claim from Apple should be taken at face value.
Apple's trust of users does not seem to change in any way with this tool, if it's true that they have already been scanning iCloud Photo libraries for CSAM. This method just makes it so that Apple et al won't be looking at original photos if CSAM does exist, but at derivative (extremely low-res) versions of them.

Your description of 6a assumes that all of this is perfectly implemented, without bugs or undocumented backdoors, and that the calculation is honest. There is no reason to make these assumptions. The trillion is hyperbole even under the most generous readings, as user accounts can differ by many orders of magnitude. External experts matter little - Apple picked them, and Apple has posited itself as our adversary. There is no basis of trust to fall back on, not any more. Apple needs to open-source this tool chain, so that we all can see what is going on in there.
I agree there is no reason to make the assumption that there are no bugs; however, assuming there are no undocumented backdoors really is a matter of how much one trusted Apple prior to them implementing this, not on gut-reaction to how one understands how this works or its apparent limitations in regards to privacy.

There are several reasons to assume the calculation is honest and not hyperbole: most obvious, there has been a ton of negative publicity with this announcement, and Apple has an insanely large incentive to get this right. Get the number wrong, and that negative publicity takes a turn for the worse. Second, let's do some quick reasonable calculations: let's say that NeuralHash really sucks and has a false-positive rate of 1 in 1 thousand (it's likely much better than this). As these would be independent events (so you multiply probabilities), with thresholding on just 2 matches you have a 1 in 1 million chance of an alert being triggered. It only takes a threshold of 4 with this ridiculously poor NeuralHash to get to 1 in 1 trillion. That doesn't seem unreasonable at all. With a NeuralHash false-positive rate of 1 in 10 thousand, it only takes 3 threshold events to get to an overall rate of 1 in 1 trillion.
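The arithmetic there is easy to check; the per-image false-positive rates below are assumptions for illustration, not published figures:

```python
# Assumed per-image false-positive rates; Apple has not published NeuralHash's.
for per_image_fp, threshold in [(1e-3, 2), (1e-3, 4), (1e-4, 3)]:
    # Treating matches as independent events, multiply the probabilities.
    account_fp = per_image_fp ** threshold
    print(f"fp={per_image_fp:g}, threshold={threshold}: about 1 in {1/account_fp:,.0f}")
```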

The matching is described as taking content into account.
It really doesn't seem like it does. See above.

Also, you left out option three - the low-res photo looks like, well, the reviewer is not sure. Is it CSA or not? Are all those people adults? Consenting adults? Might be hard to tell with the blur. Is this a picture of a barely dressed kid or a young adult? If the former, is that legal? The reviewers will have to make decisions that are not nearly as clear cut as you describe. If they decide that they cannot rule out CSA and they would rather have the experts take a look, then we get to...

Step 8 - NCMEC Review
Here all bets are off, as we do not know how this works. If the questionable pics are not variants of those in their database, then they should drop the case. The only damage is several strangers having looked at private pictures. If it is a match, off to the police. What if it is not a match, but the NCMEC reviewer thinks this might be a hitherto unknown case of CSA? Can they ask the police to investigate?

This is the real concern for those who actually hit the threshold. I can't claim to know how this would be handled -- do those at Apple reviewing the photos have access to the ground truth CSAM image? I also can't claim to know how blurry those photos are, and how much Apple truly desires to protect user privacy. This again seems to at least border on a trust issue, which should again not be based on a gut-reaction to our (possibly wrong) understanding of the technique and process, but based on how one felt about Apple prior to the announcement.
 
Ok so I apologise, it was not by Apple; however, it seemingly received Apple's approval, as a VP sent it on to their teams at Apple.

I guess it is the people at the top who are deluded, and those actually writing the code etc. are possibly against it.

Apple previously stated they were scanning iCloud. I don't think it was using the same process, but there isn't much difference at the end of the day: if there is a dodgy photo, it is detected. That isn't my problem; it's the on-device part I don't like. If this new Apple system detects a photo, an employee of Apple can then view the photo, a lower-quality version, but still creepy. From my understanding, even before the photo is uploaded! A line is crossed right there.

This article from early 2020 has info about Apple scanning iCloud. The media does not seem to have picked up on this much and is not asking why the change to on-device scanning is needed.

I apologize if I’m coming off as attacking you. That is not my intent. I just like to work with known facts and like to discuss concerns. Thank you for the article by the way. Oddly they didn’t know what Apple was using at the time but did a better job of describing hashes in fewer words.

About the VP: maybe, but I'm not going to assume anything about how their memos work or how they are filtered. This could have been an email to that organization she works with internally and may not get or need a VP review. Still too speculative to assume a VP approved and sent it. I think it's important to know for sure before stating it like it did happen that way. That's why I work with what is known or public knowledge.

I don't want to convince you that one way or the other is better. Here's how I see it, though. I personally would rather the hash comparisons happen locally on my phone instead of on the server side. Since they are just hashes being compared to other hashes, it's easy for your device to check whether they are identical to the ones on the CSAM list. That device is within your possession, whereas a server is kind of one point of entry. I don't think their servers are going to get compromised anytime soon, but if they did, you could corrupt a lot more data more easily, as opposed to a one-device-at-a-time kind of attack.

Also, there is a threshold that has to be met. It could be just 1, which means they can look at it then. I don't think you or I believe it's that low, but I guess it's possible. Also, the needle could move on the threshold either way depending on how effective it is. However, it is only reviewed when enough coupons (flagged hashes) have been uploaded to iCloud (only on upload). Apple states the system is effective to 1 in 1 trillion per year, and that is per hash. Chances of it making an error are possible but not likely. So it has to make that 1-in-1-trillion error many times (this is an assumption, because I'm assuming the threshold is higher than 1). So no Apple employee even looks at the image until that threshold is met. So even if the threshold was 2, it would have to error twice before someone would even view an image that wasn't illegal. From my point of view, which is just an opinion, some poor group of employees at Apple are having to look at only illegal photos, since the chances of a hash getting through that system that isn't illegal are pretty dang low.

All of this is based on what Apple is telling us and whether it's accurate. My guess is if it's not accurate they will tweak it to be accurate or update their metrics. All this because they don't actually want to see your photos. They just want to catch people who are committing the illegal act.

Also, in that article you provided, it stated that the senator told them that if they don't do it, they will make them do it.
 
Why misleading? I did not insinuate that the statement stems from Apple.

Just be glad people notice, and that people express their concerns. I don't think it is going to change Apple's path, but at least it's something.
My apologies. Thank you for correcting me. I realize you did not insinuate it stems from Apple. Again, thank you. Your post didn't spell out what you really meant there, but my assumption still wasn't fair to you.
 
Just a little update for those naive people suggesting it was only pictures in iCloud. It's more sinister than that: it usurps a user's own hardware, taking up processing power etc., to check photos PRIOR to their being sent to iCloud.

So it's YOUR HARDWARE that is under surveillance, and not necessarily photos in iCloud, as it's a pre-iCloud function.

NFC
 
Apple knows they won't walk back, nor bother to have a discussion. Either Apple is evil or someone is forcing Apple.
I wondered this myself. Secret court decision? Or they’re just evil crooks. Either is honestly easy for me to believe at this point. Neither will earn any respect back from me.
 
This is disingenuous. I also work with state-of-the-art machine learning models, and agree that dumb mistakes happen, but you are describing NeuralHash contrary to how the document you linked to describes it, as if you have some insider information. Surely, if they are using content matching in terms of matching a certain number of generic features (a la object detection), then what you're saying might apply, but their description of the technology does not describe this in any way whatsoever. It seems the primary purpose of NeuralHash is to be able to detect alterations from the "ground truth" CSAM hash. The language they use does not imply content matching:


This sounds more like an NLP technique, in which their embedding network forms something akin to word embeddings (e.g., something like word2vec), where semantic meaning is not in any way tied specifically to the contents of the image (e.g., people having sex), but to "image descriptors" (which seem likely to be vectors of integers) representing the overall look of the image (which would be entirely human-unreadable). From these embedding descriptors, hashes are computed, which can be used for semantic and perceptual similarity. word2vec's famous example can be helpful here to see what they might be doing: using the vectors for each word, the following equation holds true (where equality here is defined as closest match): king-man+woman = queen. It seems what Apple is doing is finding matches in a similar way: if someone takes some CSAM photo and crops or distorts it in some way, it can still be understood as a derivative of the original (e.g., king-man+woman and queen are not identical, but their word vectors are extremely close).

In this way, they can hash extremely close embeddings to the same value using their LSH process; the likelihood of false positives, L, at this level is already incredibly low. Thresholding by n decreases the likelihood exponentially to L^n.

In other words, although a human might see these as semantically and perceptually similar based on content (e.g., a blurred male profile in the background looking to the left, with a male in a white suit in focus with arms raised diagonally), the algorithm can see that these are very different.
[Attached example photo]


Apple's trust of users does not seem to change in any way with this tool, if it's true that they have already been scanning iCloud Photo libraries for CSAM. This method just makes it so that Apple et al won't be looking at original photos if CSAM does exist, but at derivative (extremely low-res) versions of them.


I agree there is no reason to make the assumption that there are no bugs; however, assuming there are no undocumented backdoors really is a matter of how much one trusted Apple prior to them implementing this, not on gut-reaction to how one understands how this works or its apparent limitations in regards to privacy.

There are several reasons to assume the calculation is honest and not hyperbole: most obvious, there has been a ton of negative publicity with this announcement, and Apple has an insanely large incentive to get this right. Get the number wrong, and that negative publicity takes a turn for the worse. Second, let's do some quick reasonable calculations: let's say that NeuralHash really sucks and has a false-positive rate of 1 in 1 thousand (it's likely much better than this). As these would be independent events (so you multiply probabilities), with thresholding on just 2 matches you have a 1 in 1 million chance of an alert being triggered. It only takes a threshold of 4 with this ridiculously poor NeuralHash to get to 1 in 1 trillion. That doesn't seem unreasonable at all. With a NeuralHash false-positive rate of 1 in 10 thousand, it only takes 3 threshold events to get to an overall rate of 1 in 1 trillion.


It really doesn't seem like it does. See above.



This is the real concern for those who actually hit the threshold. I can't claim to know how this would be handled -- do those at Apple reviewing the photos have access to the ground truth CSAM image? I also can't claim to know how blurry those photos are, and how much Apple truly desires to protect user privacy. This again seems to at least border on a trust issue, which should again not be based on a gut-reaction to our (possibly wrong) understanding of the technique and process, but based on how one felt about Apple prior to the announcement.
OK - you seem knowledgeable about this. Will the false positives have features similar to child porn (e.g., exposed skin, certain poses, etc.), which means the false positives are likely to be sensitive in nature? Your example was about correct rejections, not false positives. I cannot imagine a false positive not having features in common with the target CSAM material, which means the pictures will be sensitive.

Also, why do you assume the false positives will be statistically independent? Suppose, as people do, somebody takes a series of pictures. Isn't it true that if one is flagged the others are likely to be flagged, given they are similar? Also, I note that the false-positive likelihood goes up with the number of pictures in one's library, so the 1 in a trillion cannot be accurate because it depends on the number of pictures: with an arbitrarily large library, a false positive becomes all but certain.

These are the reasons I do not trust Apple's estimate of a 1-in-a-trillion chance of a false positive. They seem to be based on abstract, very decontextualised, back-of-the-napkin calculations rather than taking into account the actual statistical properties of photos and the correlations among photos that people take. The truth is that unless Apple has already used this on the iCloud server side with real photos, they simply can't know.
 
OK - you seem knowledgeable about this. Will the false positives have features similar to child porn (e.g., exposed skin, certain poses, etc.), which means the false positives are likely to be sensitive in nature? Your example was about correct rejections, not false positives. I cannot imagine a false positive not having features in common with the target CSAM material, which means the pictures will be sensitive.

There's really no way to know what false positives would actually look like. While it seems Apple is doing something clever here, we could at least guess at what false positives might look like: if it was just perception that was being matched (i.e., overall look is the same), then the photo I attached above would probably be a false positive; however, if just semantic meaning was matched, that same photo might not be matched (if the age of the male in the photo could be reasonably understood).

Using a method similar to what I described above, it would have to be near-exact* semantically and perceptually -- meaning exposed skin and certain poses wouldn't be enough; it would need to be exposed skin of a child who looks nearly identical to the one in the CSAM photo and who is posed in a nearly identical way (and even this is simplifying it quite a bit).

*Alternatively, you could look toward adversarial techniques used to thwart algorithms made for object detection, etc., to gain some understanding: images generated for the express purpose of fooling an object detector into thinking it's seeing something that's not really there (or the converse, perturbing an image so that the object detector is unable to see something that actually is there). The most effective of these typically look like static (nowhere even close to the target). In that case, false positives wouldn't be sensitive in nature, but this is all really conjecture.

Also, why do you assume the false positives will be statistically independent? Suppose, as people do, somebody takes a series of pictures. Isn't it true that if one is flagged the others are likely to be flagged, given they are similar? Also, I note that the false-positive likelihood goes up with the number of pictures in one's library, so the 1 in a trillion cannot be accurate because it depends on the number of pictures: with an arbitrarily large library, a false positive becomes all but certain.

This is a good point that I think several people have made, and I don't think I've seen anyone really say anything meaningful against this notion, so I'll give it a shot here. If you look at the protocol described here, paying special attention to PSI-CA, you'll see that this is not an issue.

The cardinality of matches is the size of the intersection of the hashes of all of your photos with the unique hashes of the set of CSAM photos.

Suppose you have N different photos (unique id's) that all have a matching hash of a single CSAM photo x out of all the CSAM photos X:

So we have:
N = {(H(x), id_1, ad_1), (H(x), id_2, ad_2), ..., (H(x), id_N, ad_N)}, where each n in N is a triple of the hash, id, and associated data of a photo. Note there are N elements in the set.
The intersection operation they describe results in:
hashes(N) ∩ hashes(X) = {H(x)}, which matches all hashes in N against the unique hashes in X, providing a cardinality of only a single match against the threshold.

This is why I stated these have statistical independence, as well as why multiple copies of similar pictures (or multiple actual copies of the same CSAM photo) will only count as 1 toward the threshold. You'll need at least the threshold number of false positives against unique CSAM photos for an alert to be triggered.
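Here is a stripped-down toy version of that counting argument (the actual PSI-CA protocol wraps this in cryptographic blinding, none of which is shown; the hashes and ids below are made up):

```python
# Unique fuzzy hashes in the (hypothetical) CSAM database.
csam_hashes = {"H1", "H2", "H3", "H4"}

# Device photos as (photo id, fuzzy hash): three near-duplicates collide on H1.
device_photos = [
    ("photo_001", "H1"),
    ("photo_002", "H1"),  # near-duplicate of photo_001
    ("photo_003", "H1"),  # another near-duplicate
    ("photo_004", "ZZ"),  # no match
]

matching_photos = [pid for pid, h in device_photos if h in csam_hashes]
unique_matches = {h for _, h in device_photos} & csam_hashes

print(len(matching_photos))  # 3 photos produced a match...
print(len(unique_matches))   # ...but only 1 unique hash counts toward the threshold
```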

These are the reasons I do not trust Apple's estimate of a 1-in-a-trillion chance of a false positive. They seem to be based on abstract, very decontextualised, back-of-the-napkin calculations rather than taking into account the actual statistical properties of photos and the correlations among photos that people take. The truth is that unless Apple has already used this on the iCloud server side with real photos, they simply can't know.

Hopefully the above will shed some light on how this seems entirely reasonable. Not all questions are answered, but the 1 in 1 trillion chance of a false positive doesn't seem far-fetched at all.

Edit: I just saw that they threshold to something like 30 matches. This is an insane, unnecessary number of matches even with a crappy hashing algorithm --- I have no reason at all to think anyone will ever be falsely flagged.
 
There's really no way to know what false positives would actually look like. While it seems Apple is doing something clever here, we could at least guess at what false positives might look like: if it was just perception that was being matched (i.e., overall look is the same), then the photo I attached above would probably be a false positive; however, if just semantic meaning was matched, that same photo might not be matched (if the age of the male in the photo could be reasonably understood).

Using a method similar to what I described above, it would have to be near-exact* semantically and perceptually -- meaning exposed skin and certain poses wouldn't be enough; it would need to be exposed skin of a child who looks nearly identical to the one in the CSAM photo and who is posed in a nearly identical way (and even this is simplifying it quite a bit).

*Alternatively, you could look toward adversarial techniques used to thwart algorithms made for object detection, etc., to gain some understanding: images generated for the express purpose of fooling an object detector into thinking it's seeing something that's not really there (or the converse, perturbing an image so that the object detector is unable to see something that actually is there). The most effective of these typically look like static (nowhere even close to the target). In that case, false positives wouldn't be sensitive in nature, but this is all really conjecture.



This is a good point that I think several people have made, and I don't think I've seen anyone really say anything meaningful against this notion, so I'll give it a shot here. If you look at the protocol described here, paying special attention to PSI-CA, you'll see that this is not an issue.

The cardinality of matches is the size of the intersection of the hashes of all of your photos with the unique hashes of the set of CSAM photos.

Suppose you have N different photos (unique id's) that all have a matching hash of a single CSAM photo x out of all the CSAM photos X:

So we have: N = {(H(x), id_1, ad_1), (H(x), id_2, ad_2), ..., (H(x), id_N, ad_N)}, where each n in N is a triple of the hash, id, and associated data of a photo. Note there are N elements in the set.
The intersection operation they describe results in: hashes(N) ∩ hashes(X) = {H(x)} --- which matches all hashes in N against the unique hashes in X, providing a cardinality of only a single match against the threshold.

This is why I stated these have statistical independence, as well as why multiple copies of similar pictures (or multiple actual copies of the same CSAM photo) will only count as 1 toward the threshold. You'll need at least the threshold number of false positives against unique CSAM photos for an alert to be triggered.



Hopefully the above will shed some light on how this seems entirely reasonable. Not all questions are answered, but the 1 in 1 trillion chance of a false positive doesn't seem far-fetched at all.

Edit: I just saw that they threshold to something like 30 matches. This is an insane, unnecessary number of matches even with a crappy hashing algorithm --- I have no reason at all to think anyone will ever be falsely flagged.
Thank you - I will ponder this. My initial reaction is that it is likely there are multiple pictures of CSAM that are similar in a series, plus multiple photos in a user's library that are similar in series, so I am not sure the multiple-hit (30x) criterion gives me that much confidence, but maybe the odds are genuinely low. Anyway, you have been clearer than Apple has. I am not entirely convinced about the principle or the implementation of this scheme, but your post lowered my alarm somewhat. Thank you for going to the effort of replying.
 
As far as I'm concerned that IS questionable material. Why in the world would a child send a nude of themself? Something is wrong with the parent if that's going on.
A person is considered a child until 18, and there are plenty of teens who send nudes, more commonly known as sexting...

Minnesota Prosecutor Charges Sexting Teenage Girl With Child Pornography

Will these types of cases increase once Apple starts scanning photos in messages? How much longer until they require all apps to scan photos they handle as a requirement to being on the App Store?

It's all fine in theory, but how long until the government passes legislation requiring this tech to be used in much worse ways in the name of "protecting the children"?

They could require devices to report all suspect images to the authorities as a way to catch things before they spread, and so on...

Once the can of worms is opened you can't put them back.
 
Simple: don't upload child porn to iCloud and you won't be tagged for human review. Stop worrying that this is enabling some backdoor for governments to spy on you; it doesn't.
Okay, then I'm sure you'd have absolutely no issue if a police officer randomly came to your home without any kind of warrant and demanded to search your house for contraband.

That isn't different than your device scanning all of your photos...
 
Thank you - I will ponder this. My initial reaction is that it is likely there are multiple pictures of CSAM that are similar in a series, plus multiple photos in a user's library that are similar in series, so I am not sure the multiple-hit (30x) criterion gives me that much confidence, but maybe the odds are genuinely low. Anyway, you have been clearer than Apple has. I am not entirely convinced about the principle or the implementation of this scheme, but your post lowered my alarm somewhat. Thank you for going to the effort of replying.
If they are utilizing a fuzzy hashing algorithm (for both the CSAM database and on our devices), which seems likely given what we know about PhotoDNA, then a series of photos that are similar would all hash the same (so multiple CSAM photos that are nearly identical would all hash to a single value as well).

Thank you for your civility in discussion! --- there's been quite a lack of that in all these threads this past week
 
Okay, then I'm sure you'd have absolutely no issue if a police officer randomly came to your home without any kind of warrant and demanded to search your house for contraband.

That isn't different than your device scanning all of your photos...
This analogy has been used over and over again and is a terrible match for what they're doing.

Here's a better (yet still imperfect) analogy you can use:

"Okay, then I'm sure you'd have absolutely no issue if a customs agent decided to search your luggage after an x-ray scan revealed what looked like 30 bombs."

The imperfection here lies in that the customs agent would look through everything in the luggage, not just the things that looked like bombs. Also, the customs agent is able to look at the actual items in the luggage, rather than low-res derivatives.
 
If they are utilizing a fuzzy hashing algorithm (for both the CSAM database and on our devices), which seems likely given what we know about PhotoDNA, then a series of photos that are similar would all hash the same (so multiple CSAM photos that are nearly identical would all hash to a single value as well).

Thank you for your civility in discussion! --- there's been quite a lack of that in all these threads this past week
Ah. True if the hashing achieves a form of dimension reduction. More for me to ponder.

EDIT: Honestly I think Apple would be better off actually demonstrating this algorithm, obviously not using CSAM material, but, say, pictures of animals so that people could see in a concrete way what the probability of a false positive is and what the false positives look like in comparison to the target set. Indeed they could scan for missing/kidnapped pets if they wanted, but that would be mission creep, the possibility of which concerns many people.
 
Ah. True if the hashing achieves a form of dimension reduction. More for me to ponder.

EDIT: Honestly I think Apple would be better off actually demonstrating this algorithm, obviously not using CSAM material, but, say, pictures of animals so that people could see in a concrete way what the probability of a false positive is and what the false positives look like in comparison to the target set. Indeed they could scan for missing/kidnapped pets if they wanted, but that would be mission creep, the possibility of which concerns many people.
At this point, I'd be surprised if they don't showcase it in some way soon. Then again, they've been relentlessly saying that they've designed this so that it only works with CSAM (there might be other layers to that restriction besides the database itself). They might not be able to check against anything other than CSAM --- all conjecture at this point.
 
At this point, I'd be surprised if they don't showcase it in some way soon. Then again, they've been relentlessly saying that they've designed this so that it only works with CSAM (there might be other layers to that restriction besides the database itself). They might not be able to check against anything other than CSAM --- all conjecture at this point.
Honestly, it sounds to me like they are using an algorithm that is not specific to either the target set (CSAM) or the nature of the files being scanned, even if the implementation is limited to CSAM. No doubt they have developed an algorithm that has been optimised for CSAM, but it should be possible to do something similar for animals (or places, machines, snowflakes, fractals, etc.). Even if it does not precisely match what they are doing with CSAM, it would be illustrative. You must admit that even you are sometimes taking an educated guess about the algorithm and its results (e.g., the nature of the false positives).
 
Honestly, it sounds to me like they are using an algorithm that is not specific to either the target set (CSAM) or the nature of the files being scanned, even if the implementation is limited to CSAM. No doubt they have developed an algorithm that has been optimised for CSAM, but it should be possible to do something similar for animals (or places, machines, snowflakes, fractals, etc.). Even if it does not precisely match what they are doing with CSAM, it would be illustrative. You must admit that even you are sometimes taking an educated guess about the algorithm and its results (e.g., the nature of the false positives).
I fully admit that I'm just making guesses, however educated or not they may be. I agree that they ought to be able to do it for something else, barring additional unknown measures they've put in place to make it only applicable for CSAM. The biggest plus for this would be that they could showcase that it only will match near-exact photos, not just photos with high similarity.
 