
MacRumors

macrumors bot
Original poster


Apple researchers have released Pico-Banana-400K, a comprehensive dataset of 400,000 curated images that's been specifically designed to improve how AI systems edit photos based on text prompts.


The massive dataset aims to address what Apple describes as a gap in current AI image-editing training: while systems like GPT-4o can make impressive edits, the researchers say progress has been limited by a lack of adequate training data built from real photographs.

Pico-Banana-400K features images organized into 35 edit types across eight categories, from basic adjustments like color changes to complex transformations such as converting people into Pixar-style characters or LEGO figures. Each image went through Apple's AI-powered quality-control system, with Google's Gemini-2.5-Pro evaluating the results for instruction compliance and technical quality.
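
For concreteness, here's a minimal sketch of what judge-based quality filtering like this could look like. The call_gemini function, the score names, and the thresholds are all assumptions for illustration; Apple's actual prompt and scoring rubric aren't detailed here.

```python
# Minimal sketch of judge-model quality filtering (assumed interface).
from dataclasses import dataclass


@dataclass
class EditExample:
    source_image: str   # path or URL of the original photo
    edited_image: str   # path or URL of the candidate edit
    instruction: str    # the text prompt, e.g. "make the sky look stormy"


def call_gemini(example: EditExample) -> dict:
    """Hypothetical stand-in for a multimodal judge call (e.g. Gemini-2.5-Pro).
    Assumed to return scores in [0, 1] for how well the edit follows the
    instruction and how clean the result is technically."""
    raise NotImplementedError("wire up a real multimodal judge model here")


def passes_quality_control(example: EditExample,
                           min_compliance: float = 0.8,
                           min_technical: float = 0.8) -> bool:
    # Keep an example only if the judge approves on both axes.
    scores = call_gemini(example)
    return (scores["instruction_compliance"] >= min_compliance
            and scores["technical_quality"] >= min_technical)
```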

The dataset also includes three specialized subsets: 258,000 single-edit examples for basic training, 56,000 preference pairs comparing successful and failed edits, and 72,000 multi-turn sequences showing how images evolve through multiple consecutive edits.
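
To make those three subsets concrete, here's an illustrative sketch of the record shapes involved. Every field name here is an assumption; the published dataset's actual schema may differ.

```python
# Illustrative record schemas for the three subsets (field names assumed).
from dataclasses import dataclass


@dataclass
class SingleEdit:                      # ~258,000 examples
    source_image: str
    instruction: str
    edited_image: str


@dataclass
class PreferencePair:                  # ~56,000 examples
    source_image: str
    instruction: str
    preferred_edit: str                # the successful edit
    rejected_edit: str                 # the failed edit


@dataclass
class MultiTurnSequence:               # ~72,000 examples
    source_image: str
    turns: list[tuple[str, str]]       # (instruction, resulting image) per turn
```

The preference pairs are the shape you'd typically feed to preference-optimization training such as DPO, where a model learns to favor the successful edit over the failed one for the same instruction.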

Apple built the dataset using Google's Gemini-2.5-Flash-Image (aka Nano-Banana) editing model, which was released just a few months ago. However, Apple's research also revealed its limitations: while global style changes succeeded 93% of the time, precise tasks like relocating objects or editing text struggled badly, with success rates below 60%.
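
Per-edit-type figures like these are straightforward to tally from the judge's pass/fail verdicts. A small sketch with invented numbers that happen to reproduce the percentages above:

```python
# Tally success rates per edit type from (edit_type, passed) judge verdicts.
from collections import defaultdict


def success_rates(verdicts: list[tuple[str, bool]]) -> dict[str, float]:
    totals, passes = defaultdict(int), defaultdict(int)
    for edit_type, passed in verdicts:
        totals[edit_type] += 1
        passes[edit_type] += passed    # bool counts as 0 or 1
    return {t: passes[t] / totals[t] for t in totals}


# Invented verdicts for illustration only.
verdicts = ([("global_style", True)] * 93 + [("global_style", False)] * 7
            + [("object_relocation", True)] * 55 + [("object_relocation", False)] * 45)
print(success_rates(verdicts))
# {'global_style': 0.93, 'object_relocation': 0.55}
```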


Despite the limitations, researchers say their aim with Pico-Banana-400K is to establish "a robust foundation for training and benchmarking the next generation of text-guided image editing models." The complete dataset is freely available for non-commercial research use on GitHub, so developers can use it to train more capable image editing AI.

Article Link: Apple's New AI Dataset Aims to Improve Photo Editing Models
 
Anyone who has used any of the AI tools knows none of them are ready for Prime Time. None of them. That is why I always say it's just echo-chamber nonsense to claim Apple or anyone else is behind in this nascent, quickly evolving space. That ALL of them have been RUSHED to market without adequate testing goes without saying.

There is immense room for improvement from just the low-hanging fruit of better training data. And here we see Apple showing they certainly know that.
 
“…with Google's Gemini-2.5-Pro evaluating the results for instruction compliance and technical quality.”

Am I reading this wrong, or does this state that Apple is using a Google product to evaluate an Apple product, making it clear that the Google product is the inspiration and standard to which Apple is aspiring?
 
Anyone who has used any of the AI tools knows none of them are ready for Prime Time. None of them. That is why I always say it's just echo-chamber nonsense to claim Apple or anyone else is behind in this nascent, quickly evolving space. That ALL of them have been RUSHED to market without adequate testing goes without saying.

There is immense room for improvement from just the low-hanging fruit of better training data. And here we see Apple showing they certainly know that.
And as someone who uses AI tools every day for work, I can tell you that you could not be more incorrect.
 
The most commonly used feature for editing a photo has to be removing an object, and the Photos app does a decent job, sometimes. That's where the effort needs to go. Not adding objects. Not changing the season. Removing objects.
 
Why not use photos on user devices for maximum accuracy?

Because models are created from thousands of images that need to be curated and cropped first, and then the deep-learning training process is too intensive for even a MacBook Pro, let alone a phone. Even a Mac Studio M3 Ultra will struggle.
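
As a rough illustration of why on-device training is off the table: a back-of-envelope estimate of the memory needed just to fully fine-tune a mid-sized model with Adam. The 10B parameter count is an assumption, but the per-parameter byte ratios are the standard ones.

```python
# Back-of-envelope memory for full fine-tuning with Adam (assumed model size).
params = 10e9                      # assume a 10B-parameter model
bytes_weights = params * 2         # bf16 weights
bytes_grads = params * 2           # bf16 gradients
bytes_optimizer = params * 8       # Adam: fp32 momentum + fp32 variance
total_gb = (bytes_weights + bytes_grads + bytes_optimizer) / 1e9
print(f"~{total_gb:.0f} GB before activations")   # ~120 GB before activations
```

And that is before activation memory, and before raw compute throughput, which is where consumer hardware falls furthest behind datacenter GPUs.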
 
And as someone who uses AI tools every day for work, I can tell you that you could not be more incorrect.

He is mostly correct. Your standards are just much lower than his if you believe generative models meet the high-end requirements of advertising and VFX. You can use them for media production, but even at their best they do not achieve the high bar of actual photography, actual 3D modelling, or actual high-end VFX.

If they did achieve that high bar, you'd be paying $2,000 a month. There's no way a corporation will give you that level of high-end model for $200, which is what Google and OpenAI charge for the janky models (like Sora) they have now.
 
He is mostly correct. Your standards are just much lower than his if you believe generative models meet the high-end requirements of advertising and VFX. You can use them for media production, but even at their best they do not achieve the high bar of actual photography, actual 3D modelling, or actual high-end VFX.

If they did achieve that high bar, you'd be paying $2,000 a month. There's no way a corporation will give you that level of high-end model for $200, which is what Google and OpenAI charge for the janky models (like Sora) they have now.
That really depends on the context. If OP was talking about media generation or editing, then yeah, there are still issues there. But OP said "any of the AI tools" and there are tons that are already in production use outside of media generation.
 
Because models are created from thousands of images that need to be curated and cropped first, and then the deep-learning training process is too intensive for even a MacBook Pro, let alone a phone. Even a Mac Studio M3 Ultra will struggle.
No, I mean: why not submit user images (anonymized) as training data, with the training done on in-house machines?
 
Anyone who has used any of the AI tools knows none of them are ready for Prime Time. None of them. That is why I always say it's just echo-chamber nonsense to claim Apple or anyone else is behind in this nascent, quickly evolving space. That ALL of them have been RUSHED to market without adequate testing goes without saying.

There is immense room for improvement from just the low-hanging fruit of better training data. And here we see Apple showing they certainly know that.
100%

All of the LLMs are confidently wrong the whole time.
 
Anyone who has used any of the AI tools knows none of them are ready for Prime Time. None of them. That is why I always say it's just echo-chamber nonsense to claim Apple or anyone else is behind in this nascent, quickly evolving space. That ALL of them have been RUSHED to market without adequate testing goes without saying.

There is immense room for improvement from just the low-hanging fruit of better training data. And here we see Apple showing they certainly know that.
Of course there’s immense room for improvement; they are not perfect at all. That said, AI can replace 50% of the world’s office workforce today without hesitation.
 
It’s pretty clear AI will replace everyone sooner rather than later. I can’t believe there haven’t been massive protests yet.
 
Apple's released tools or the internal AI tools Apple is working with? No one here can answer your question.
Fair enough.
Of course there’s immense room for improvement; they are not perfect at all. That said, AI can replace 50% of the world’s office workforce today without hesitation.
I agree. Customer service suffers overall, but it certainly can. It’s “good enough” at lots of tasks.
 