Hey SIRI: haven’t I seen that image before?
The way the AI engineers prevent that problem is actually very simple. They call it "regularization," which is a terribly non-descriptive term.
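One common regularization trick is "weight decay": add a penalty for large parameter values so the model cannot afford to memorize fine detail. Here is a minimal sketch in Python; the function name and numbers are my own illustration, not anyone's actual training code:

```python
import numpy as np

def loss_with_weight_decay(params, data_loss, lam=1e-4):
    """Total loss = fit-to-the-data term + penalty on parameter size.

    The L2 penalty ("weight decay") pushes every parameter toward zero,
    so the model keeps broad trends instead of memorizing individual
    training examples.
    """
    return data_loss + lam * np.sum(params ** 2)

# Same data loss, but bigger parameters cost more:
print(loss_with_weight_decay(np.array([0.1, -0.2]), data_loss=1.0))
print(loss_with_weight_decay(np.array([10.0, -20.0]), data_loss=1.0))
```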
What they really do is use a VERY "lossy" kind of compression. They collect a trillion "tons" of data and store the parameters in a billion-ton box. At first you think, "Well, that means they are saving space." But no. More importantly, the huge database of input data is GONE, GONE, GONE, so there is no chance of the AI spitting out exact copies of the input data. The input has been compressed away.
Compression always removes redundancy and saves the "essence" of the input, not an exact copy. JPEG photos and MP3 audio are compressed only about 10:1, so the stored data still looks or sounds like the input data. But the LLM is doing thousands-to-one compression, so what gets saved are general rules, trends, and ideas.
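To put a rough number on "thousands to one," here is a back-of-envelope calculation in Python. The figures are illustrative assumptions about the scale of a modern LLM, not stats from any particular model:

```python
# Illustrative scale of a modern LLM training run (assumed numbers).
training_tokens = 15e12      # ~15 trillion tokens of text
bytes_per_token = 4          # ~4 bytes of raw text per token
parameters = 8e9             # ~8 billion parameters
bytes_per_parameter = 2      # 16-bit weights

input_bytes = training_tokens * bytes_per_token   # ~60 TB of text
stored_bytes = parameters * bytes_per_parameter   # ~16 GB of weights

print(f"compression ratio ~ {input_bytes / stored_bytes:,.0f}:1")  # ~3,750:1
```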
AI engineers don't really think in terms of compression; they are just descending a gradient. But what is really happening is a search for a way to keep the most relevant 0.01 percent of the input data.
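For the curious, here is "descending a gradient" in miniature: a toy Python sketch that fits a single parameter to 1,000 noisy points. Everything here is made up for illustration, but it shows how the search keeps the shared rule and discards the individual points:

```python
import numpy as np

# Toy data: 1,000 noisy points that all follow one rule, y = 3x + noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 1000)
y = 3 * x + rng.normal(0, 0.1, 1000)

# One parameter "compresses" the 1,000 points down to the trend they share.
w, lr = 0.0, 0.1
for _ in range(100):
    grad = np.mean(2 * (w * x - y) * x)  # gradient of mean squared error
    w -= lr * grad                       # step downhill

print(w)  # ~3.0: the rule survives; the individual points do not
```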
In fact, the AI engineers have a word for it when the parameters are enough to capture all of the input. They call it "overfitting," and the textbooks are full of ways to prevent it. One popular trick, "dropout," involves randomly throwing away pieces of the network while it trains.
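Here is a minimal sketch of dropout, one of those textbook tricks, in plain Python; the rate and array sizes are arbitrary:

```python
import numpy as np

def dropout(activations, rate=0.5, rng=np.random.default_rng()):
    """Randomly zero out a fraction of the units during training.

    Because any unit can vanish at any moment, no single unit can be
    trusted to memorize one training example; knowledge gets spread
    across many units as general rules.
    """
    mask = rng.random(activations.shape) >= rate
    # Scale the survivors so the expected total activation is unchanged.
    return activations * mask / (1.0 - rate)

print(dropout(np.ones(10)))  # about half zeroed, the rest scaled up to 2.0
```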
So you should expect the images to look very much like the ones it was trained on, but only because each image is a composite of many of them. I'd expect a strong semantic and stylistic similarity, but never a copy. Copies should be impossible.
Think of an artist who studied in Europe. His work might be in a style a little like ones you have seen before, his subjects might be things you have seen before, and his colors might be ones you have seen before, but each work would be new. This is what AIs will do. Don't expect radically new innovations, but also don't expect copies.