This is disingenuous. I also work with state-of-the-art machine learning models, and I agree that dumb mistakes happen, but you are describing NeuralHash contrary to what the document you linked actually says, as if you had some insider information. Sure, if they were using content matching in the sense of matching a certain number of generic features (à la object detection), then what you're saying might apply, but their description of the technology does not describe anything of the sort. It seems the primary purpose of NeuralHash is to detect altered versions of the "ground truth" CSAM images behind the known hashes. The language they use does not imply content matching:
This sounds more like an NLP technique, in which their embedding network forms something akin to word embeddings (e.g., something like word2vec), where semantic meaning is not tied specifically to the content of the image (e.g., people having sex), but to "image descriptors" (which seems likely to be a vector of integers) representing the overall look of the image, which would be entirely human-unreadable. From these embedding descriptors, hashes are computed, which can be used to judge semantic and perceptual similarity. word2vec's famous example is helpful here to see what they might be doing: using the vector for each word, the equation king - man + woman = queen holds true, where "equality" is defined as closest match. It seems what Apple is doing is finding matches in a similar way: if someone takes a CSAM photo and crops or distorts it in some way, it can still be recognized as a derivative of the original (just as king - man + woman and queen are not identical, but their vectors are extremely close).
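If it helps, here is a tiny, self-contained sketch of that "closest match" arithmetic using made-up toy vectors (not real word2vec embeddings, just to show the mechanics):

```python
import numpy as np

# Toy, hand-made "word vectors" purely for illustration; real word2vec
# embeddings are learned from text and have hundreds of dimensions.
vectors = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "man":   np.array([0.1, 0.8, 0.1]),
    "woman": np.array([0.1, 0.8, 0.9]),
    "queen": np.array([0.9, 0.8, 0.9]),
    "apple": np.array([0.5, 0.1, 0.3]),
}

def cosine(a, b):
    """Cosine similarity: 1.0 means pointing in exactly the same direction."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# "Equality" here is really "nearest neighbor" in vector space.
query = vectors["king"] - vectors["man"] + vectors["woman"]
best = max(vectors, key=lambda w: cosine(query, vectors[w]))
print(best)  # -> "queen" with these toy vectors
```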
In this way, their LSH process can hash extremely close embeddings to the same value. If the likelihood of a false positive at this stage is already some incredibly low probability L, then requiring a threshold of n independent matches drives the overall likelihood down exponentially, to L^n.
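For the curious, here is a minimal random-hyperplane LSH sketch (my own guess at the general flavor of such a scheme; the dimensions, bit count, and perturbation are assumptions, not Apple's actual design). A slightly perturbed embedding, like one from a cropped or re-encoded copy, almost always hashes to the same value as the original, while an unrelated embedding almost never does:

```python
import numpy as np

rng = np.random.default_rng(42)

# Random hyperplanes: each one contributes a single bit of the hash.
DIM, BITS = 128, 16
hyperplanes = rng.normal(size=(BITS, DIM))

def lsh_hash(embedding: np.ndarray) -> int:
    """Project onto the random hyperplanes; the sign pattern is the hash."""
    bits = (hyperplanes @ embedding) > 0
    return int("".join("1" if b else "0" for b in bits), 2)

original  = rng.normal(size=DIM)
cropped   = original + 0.001 * rng.normal(size=DIM)  # tiny perturbation, e.g. crop/re-encode
unrelated = rng.normal(size=DIM)                     # a completely different image

print(lsh_hash(original) == lsh_hash(cropped))    # very likely True
print(lsh_hash(original) == lsh_hash(unrelated))  # very likely False
```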
In other words, although a human might see two images as semantically and perceptually similar based on content (e.g., a blurred male profile in the background looking to the left, with a man in a white suit in focus, arms raised diagonally), the algorithm can see that they are very different.
The trust relationship between Apple and its users does not seem to change in any way with this tool, if it's true that they have already been scanning iCloud Photo libraries for CSAM. This method just means that Apple et al. won't be looking at original photos if CSAM does exist, but at derivative (extremely low-res) versions of them.
I agree there is no reason to assume there are no bugs; however, assuming there are no undocumented backdoors really is a matter of how much one trusted Apple before they implemented this, not of a gut reaction to how one understands this works or its apparent limitations with regard to privacy.
There are several reasons to assume the calculation is honest and not hyperbole. Most obviously, there has been a ton of negative publicity around this announcement, and Apple has an enormous incentive to get this right; get the number wrong, and that negative publicity takes a turn for the worse. Second, let's do some quick back-of-the-envelope calculations: say NeuralHash really sucks and has a false positive rate of 1 in 1,000 (it's likely much better than this). Since these would be independent events (so you multiply probabilities), with a threshold of just 2 matches you already have a 1 in 1 million chance of an alert being triggered. It only takes a threshold of 4 with this ridiculously poor NeuralHash to get to 1 in 1 trillion. That doesn't seem unreasonable at all. With a NeuralHash false positive rate of 1 in 10,000, it only takes a threshold of 3 to get to an overall rate of 1 in 1 trillion.
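The arithmetic is easy to sanity-check (assuming independent false positives, as above):

```python
# Chance that n independent per-image false positives all occur,
# i.e. the chance an account crosses the reporting threshold by accident.
def threshold_false_alarm(per_image_rate: float, n: int) -> float:
    return per_image_rate ** n

print(threshold_false_alarm(1e-3, 2))  # ~1e-06 -> 1 in a million
print(threshold_false_alarm(1e-3, 4))  # ~1e-12 -> 1 in a trillion
print(threshold_false_alarm(1e-4, 3))  # ~1e-12 -> 1 in a trillion
```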
It really doesn't seem like it does. See above.
This is the real concern for those who actually hit the threshold. I can't claim to know how this would be handled -- do those at Apple reviewing the photos have access to the ground-truth CSAM image? I also can't claim to know how blurry those photos are, or how much Apple truly wants to protect user privacy. This again at least borders on a trust issue, and should not be decided by a gut reaction to our (possibly wrong) understanding of the technique and process, but by how one felt about Apple prior to the announcement.