Apple Outlines Security and Privacy of CSAM Detection System in New Document

januarydrive7 · Aug 14, 2021

CriticalThoughtDrop said:
Sort of. Technically, it’s the Hamming Distance. The distinction between nearly identical and merely similar is most likely a few bits at best. Totally configurable by Apple and not likely very accurate based on:

Nowhere do they say it's Hamming distance, which would be a pretty poor implementation of this -- they're likely doing fuzzy hashing on the embedded vectors.

CriticalThoughtDrop said:
1) Mobile device processing/battery limitations. Apple has a good neural engine, but it’s still edge processing.

2) The fact they need to open it up to 30 matches to meet their false positive requirements. I ran some quick probability estimates, but there are still too many assumptions, and statistical estimation of perceptual hashes is not my area of expertise, so I’m likely missing something. I’d like to see the real math direct from Apple, but the performance does not appear confidence inspiring.

3) Because of the backlash and multiple AI experts calling, most likely, BS on the 1 in a trillion claim, Apple added a middle step between on-device hash matching and human review. There’s now a larger, higher performing, independent (at least so they claim) perceptual hash algorithm that verifies the edge hashing and runs in the cloud (i.e., big, traditional servers). Only if that also matches will a human review it. If your edge processing is good enough, you don’t need that; it should be obvious Apple’s is not.

I posted in another thread here with some dissection of the language they're using in their documents, along with some probability work to show the 1 in a trillion claim seems more than reasonable.
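To make the threshold argument above concrete, here is a minimal sketch of the kind of calculation being discussed. It is not Apple's math: the per-image false match rate `p`, the library size `n`, and the assumption that false matches are independent are all hypothetical choices for illustration. Only the threshold of 30 comes from the thread. It computes the binomial tail probability that an innocent library produces at least `threshold` false matches.

```python
import math


def log_binom_pmf(k, n, p):
    """Log of the binomial probability of exactly k false matches in n images."""
    return (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log1p(-p)
    )


def prob_at_least(threshold, n, p):
    """Probability of at least `threshold` false matches among n images.

    Assumes each image falsely matches independently with probability p,
    which is a simplifying assumption, not something Apple has published.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if threshold <= 0:
        return 1.0
    if threshold > n:
        return 0.0
    # Work in log space so very small terms underflow to 0.0 instead of failing.
    return math.fsum(
        math.exp(log_binom_pmf(k, n, p)) for k in range(threshold, n + 1)
    )


if __name__ == "__main__":
    # Hypothetical inputs: 10,000 photos, one-in-a-million per-image false match rate.
    n, p = 10_000, 1e-6
    for threshold in (1, 5, 30):
        print(threshold, prob_at_least(threshold, n, p))
```

Under these assumptions the probability drops very steeply as the threshold rises, which is the shape both sides of the argument are relying on. The real answer depends entirely on the true per-image rate and on how correlated false matches are, and neither is public.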

sog1927 · Aug 14, 2021

hans1972 said:
The algorithms are not looking for naked people at all. The algorithms in the photo app would be much better for that.

The NeuralHash had two design goals:

1. Finding images which are copies (or derivatives) of images in the NCMEC database
2. Be extremely good at not finding images not in this database

It's #2 which makes this system so inherently bad at finding "people who protest", "gay people", "people with guns", "innocent naked picture of my children" etc.

If you, in the general sense, create your own child pornography, NeuralHash shouldn't find it. I believe also, even if you use the same children, but create new imagery, NeuralHash will not catch it.

Which is why this "feature" encourages the production of new child pornography and the exploitation of additional children by attempting to block the distribution of existing images. New images won't be in the hash DB and can be freely stored and distributed until the DB is updated to incorporate them, so they will be in high demand. This sounds counterproductive to me.

usagora · Aug 14, 2021

sog1927 said:
I have to ask how you know this to be true. What's your sample size? Exactly how many people "into child porn" do you normally associate with?

As I said in another comment, people who are into ANY kind of porn normally are addicted to it. Addicts by definition have little or no self-control. Read pretty much any news story about someone caught with child porn, and it will rarely say, "Police searched his/her home and found 10 images of child porn on the computer." It's normally at MINIMUM in the hundreds, and very often thousands. You don't have to associate with these people to know this.

januarydrive7 · Aug 14, 2021

sog1927 said:
And again, that's a policy decision not a strict technical limitation. I'm reasonably sure that things will stay this way in the US. Other places, not so much.

You've mentioned several times how you think it will be different in other countries, based on things like China requiring Chinese citizens' cloud data to be stored locally, or Russia requiring Apple to pre-load some non-Apple apps onto phones sold in Russia. The problem with the argument is that these features are part of the OS, not of off-device storage (China) or additional non-OS applications (Russia). The fact is, Apple ships a single OS worldwide.

usagora · Aug 14, 2021

sog1927 said:
I don't have access to their code either, but I'm a software engineer with over 4 decades of experience. I also know corporate gobbledygook when I see it.

Take a number and line up with the rest of the self-proclaimed experts. I take all such claims with a grain of salt.

januarydrive7 · Aug 14, 2021

sog1927 said:
Which is why this "feature" encourages the production of new child pornography and the exploitation of additional children by attempting to block the distribution of existing images. New images won't be in the hash DB and can be freely stored and distributed until the DB is updated to incorporate them, so they will be in high demand. This sounds counterproductive to me.

I keep seeing this argument, as well. I don't think this encourages the production of new child porn, but rather decreases the likelihood that pedos will be using iCloud in any way.

CriticalThoughtDrop · Aug 14, 2021

ikir said:
It is a hash match, not a scan, against a child porn database from NCMEC. There is no way this activates unless you're a pedophile.

It’s a match of a perceptual hash, not a cryptographic hash. These are very different things despite both being hashes. Perceptual hashes are functionally different, but in general lack two critical things you’re likely familiar with from E2EE and cryptocurrency (both use cryptographic hashes):

1) perceptual hashes are not exact matches, but can be used to match images that are merely similar.

2) they are reversible. That is, you can use the hash backwards, with the engine, to generate content. Think, AI DeepDreaming.
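Point 1 can be illustrated with a toy example. The sketch below is an assumption-laden illustration, not NeuralHash: `average_hash` is a simple textbook-style perceptual hash over a tiny made-up grayscale grid, and it is compared with SHA-256 from Python's standard `hashlib`. Nudging a single pixel changes the cryptographic hash completely, while the perceptual hash stays the same.

```python
import hashlib


def average_hash(pixels):
    """Toy perceptual hash: one bit per pixel, set if the pixel is above the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def sha256_hex(pixels):
    """Cryptographic hash of the raw pixel bytes (values must be 0-255)."""
    data = bytes(value for row in pixels for value in row)
    return hashlib.sha256(data).hexdigest()


# Hypothetical 4x4 grayscale "image" and a copy with one pixel nudged by 1.
original = [
    [10, 200, 30, 220],
    [15, 210, 25, 230],
    [12, 205, 35, 225],
    [20, 215, 28, 240],
]
tweaked = [row[:] for row in original]
tweaked[0][0] = 11

if __name__ == "__main__":
    print("SHA-256 equal:     ", sha256_hex(original) == sha256_hex(tweaked))
    print("average hash equal:", average_hash(original) == average_hash(tweaked))
```

This only demonstrates the "similar images match" property. It says nothing about how accurate or how reversible Apple's actual hash is.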

Apple needs so many matches because the on-device hashing is limited and not usually very good. Being sketchy about how many isn’t because criminals will only keep 29, but rather because researchers have almost enough info to calculate its accuracy. Quick math says it’s not great. I think we’ll see some real estimates next week.

sog1927 · Aug 14, 2021

hasanahmad said:
How do you think it works on Flickr, Dropbox, Gmail, Amazon Photos, Google Photos, Yahoo Mail?

Where was this level of "danger" then? The scan on device is the same as iCloud.

I don't use any of those services.

sog1927 · Aug 14, 2021

JCCL said:
Didn't note that, but definitely wouldn't support that either, unless there is an option you can fully disable it.

A can of spray paint should suffice.

sog1927 · Aug 14, 2021

Ritte said:
How many of you save pictures from the internet in the Photos app? Because this is what Apple is assuming. I’m generally curious if this is such a common behavior.

People could be saving photos on the iPhone or iCloud Drive. My Photos app only has pictures from my camera and screenshots.

Since Apple only matches against known CSAM pictures, they will not catch new CSAM material from the camera.
A half-intelligent pedo could create a folder on iCloud Drive called “kiddie porn” and save his/her material there and would circumvent this implementation.

I wonder if Apple really thought this through. It’s so easy to go around. Either don’t save to the Photos app (which is not the default for non-camera pictures, so no biggie) or disable iCloud Photos. How are they really protecting the children?

And they would still have CSAM in iCloud from pedos not utilizing iCloud Photos but rather iCloud Drive. Well done Apple.

There are some applications that will save to Photos automatically if misconfigured (e.g. Telegram, which has a setting to automatically store images from chats or conversations in Photos).

sog1927 · Aug 14, 2021

forrie said:
I'm wondering about a possible DDoS of this disservice. For example, if this mysterious CSAM database were leaked, along with the algorithms it uses, people could craft images en masse that generate signature matches, using splotchy patterns and colors. That would be more than even Apple could handle.

Exactly. Or just spread actual images on various chat apps knowing that a significant fraction of naive users will have automatic save to Photos enabled. The possibilities for mischief are endless.

sog1927 · Aug 14, 2021

hans1972 said:
My speculation: They are feeling pressure from numerous places that they aren't doing enough to fight child pornography and other crimes and to help law enforcement with evidence gathering.

They would probably be pressured to add scanning to iCloud and they don't want to break encryption to do it.

It's also a possibility that, in meetings with NCMEC, they in fact also really feel they should do something.

Or they thought that "protecting children" would be good PR.

CriticalThoughtDrop · Aug 14, 2021

januarydrive7 said:
Nowhere do they say it's Hamming distance, which would be a pretty poor implementation of this -- they're likely doing fuzzy hashing on the embedded vectors.

I posted in another thread here with some dissection of the language they're using in their documents, along with some probability work to show the 1 in a trillion claim seems more than reasonable.

Hamming distance is one of the more common implementations for edge perceptual hashing. It’s a dumb way to do it, but it’s simple and efficient. The real problem is, Apple isn’t saying and you’re just guessing based on words they’ve altered and tweaked over the last week. We just have to trust them. I’d love to see the real implementation laid out and verified merely for edification.
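For readers unfamiliar with the technique, here is a minimal sketch of Hamming-distance matching on fixed-length bit hashes. As noted in the thread, Apple's documents do not say NeuralHash matching works this way, so this only illustrates the generic approach. The function names and the `max_distance` tolerance are hypothetical.

```python
def hamming_distance(a, b):
    """Number of bit positions in which two integer-encoded hashes differ."""
    return bin(a ^ b).count("1")


def is_match(candidate, known, max_distance):
    """Treat two hashes as 'the same image' if they differ in few enough bits.

    max_distance is the configurable tolerance: 0 means exact match only,
    larger values accept images that are merely similar.
    """
    return hamming_distance(candidate, known) <= max_distance


if __name__ == "__main__":
    known = 0b11110000
    candidate = 0b11110011  # differs from `known` in the two lowest bits
    print(hamming_distance(candidate, known))
    print(is_match(candidate, known, max_distance=2))
    print(is_match(candidate, known, max_distance=1))
```

The tolerance is the knob the earlier "a few bits at best" comment refers to: raising it catches more edited copies but also admits more unrelated images.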

In any case, needing 30 matches, plus a secondary on server algorithm to hit their 1 in a trillion false positive target is not confidence inspiring.

Setting even that aside, this is still spyware as far as the security researchers are concerned and I agree. Regardless of how it’s done, fancy spyware is still spyware.

IG88 · Aug 14, 2021

usagora said:
people who are into ANY kind of porn normally are addicted to it.

What makes you an expert on that? Are you addicted? Are people that are into any kind of alcohol, gambling, & other vices normally addicted to those as well?

sog1927 · Aug 14, 2021

CriticalThoughtDrop said:
The silliness of these arguments you replied to is mind boggling.

My device needing my encryption keys to function is not at all similar to Apple giving my encryption keys to law enforcement or any third party, including Apple itself.

Scanning my photos or indexing my files is not at all similar to Apple giving them to law enforcement or any third party, etc.

Keep up the good fight. This is probably the most important privacy fight of this decade.

That was my point.

sog1927 · Aug 14, 2021

usagora said:
Take a number and line up with the rest of the self-proclaimed experts. I take all such claims with a grain of salt.

Give me an email and I'll send you a resume.

usagora · Aug 14, 2021

IG88 said:
What makes you an expert on that? Are you addicted? Are people that are into any kind of alcohol, gambling, & other vices normally addicted to those as well?

I never claimed expertise. I'm simply going by what I've read. Again, find me a story about someone caught with child porn that only had a handful of images. If they did, then they likely just started collecting it right as they got caught. I think it must be like cocaine or heroin to these people - FAR more addictive than alcohol or gambling (btw, do you think I need to be a cocaine or heroin addict to know how addictive those drugs are? Stop being disingenuous, please). The point is, 30 images of CSAM as a threshold is a MORE than safe number.

januarydrive7 · Aug 14, 2021

CriticalThoughtDrop said:
Hamming distance is one of the more common implementations for edge perceptual hashing. It’s a dumb way to do it, but it’s simple and efficient. The real problem is, Apple isn’t saying and you’re just guessing based on words they’ve altered and tweaked over the last week. We just have to trust them. I’d love to see the real implementation laid out and verified merely for edification.

They've never claimed to do edge perceptual hashing, though. In fact, they state that they're doing a novel thing to embed both perception and semantic meaning. If you read the post I linked in my last reply, I've outlined a fairly simple approach that would do this --- you're right that it's a guess, but I'm inclined to assume that the paper is mostly honest in terms of implementation, rather than assuming that what they're doing is something entirely different. I agree it would be great to see what their deep NN looks like.

CriticalThoughtDrop said:
In any case, needing 30 matches, plus a secondary on server algorithm to hit their 1 in a trillion false positive target is not confidence inspiring.

I agree -- I mentioned something in that other thread about this as well, and I see it as most likely a big undersell of what the actual false positive rate is. Similar to how they undersell many of the capabilities of their hardware ("only supposed to get x hours but I was able to get >>x hours of battery", etc.). Again, we don't know because we can't see it.

CriticalThoughtDrop said:
Setting even that aside, this is still spyware as far as the security researchers are concerned and I agree. Regardless of how it’s done, fancy spyware is still spyware.

This is patently false. Spyware is something that is installed without a user knowing. They are making it plain that they are doing this, and bending over backward to give as much detail as possible without letting you fork their Git repo.

usagora · Aug 14, 2021

sog1927 said:
Give me an email and I'll send you a resume.

LOL! Yeah, I'm sure you'll send me, a random stranger, all your personal info and references. Nice rhetoric, though. Look, I don't CARE what you've done or for how many years you've done it. You're not an expert at what's happening at Apple nor do you know all the nitty gritty details of the technology they're implementing. So you better get your facts straight before you start libeling people or companies online like many on this forum are. It's one thing to say you're uncomfortable with what Apple is doing or you wish you knew more details. It's quite another to accuse them of lying or other wrongdoing without any evidence to support those claims.

milescortez · Aug 14, 2021

hasanahmad said:
yet here you are on macrumors with Apple staying rent free in your head

Sort of. Spent the day migrating everything off of macOS because I’m leaving Apple for good. All of it. 6 iPhones, 3 Mac minis, 4 MacBooks, 4 Apple TVs and 2 Apple displays. Family of 6 leaving for good. It’s liberating.

milescortez · Aug 14, 2021

one more said:
What will you use instead? For the technically savvy, Linux could work for a computer. For tablets and phones, however, it is pretty much iOS, Android or nothing, and Google-driven Android is not exactly a private garden either. 🤷🏻‍♂️

Yeah, not ideal. I’ve looked into high security phones. For now I bought two 10TB IronWolfs and a NAS to keep everything local. Running Windows 10. It just runs way better than macOS, which has started to suck. I’m going to host my own email locally too. No cloud at all. I browse with VPN on Firefox. I may ultimately go Linux, but for now I’m just dumping all things Apple, and I have a beefed up Intel NUC which far outperforms my Mac mini. This departure has been a long time coming. The latest invasion of privacy was just the final tipping point for me.

milescortez · Aug 14, 2021

sudo-sandwich said:
No, it's because their stuff works well.

Lmfao.

Bluetoot- · Aug 14, 2021

Lots of nefarious things are technically possible with our phones. Until those things start actually happening, I’ll keep using my phone.

macfacts · Aug 14, 2021

Google wants to show you an ad by scanning your email; Apple wants to send you to jail by scanning your pics.

msp3 · Aug 14, 2021

Why all the mental gymnastics about how spying on people's photos is good, and you even bring in different hack groups (let's be clear, most of these "charities" are out to make money for themselves and do the bidding of the highest bidder) to make it sound more convincing?

What is so hard about not spying on your customers? Just. Leave. Us. Alone.
