It's literally sending the photo to Google or ChatGPT; I don't see why every iPhone wouldn't be able to do that.
 
Fantastic to hear that it is coming to my 15 Pro Max. It should have been available in 18.2 itself. I expect it in 18.4, or 18.5 at the latest. No reason for the 15 Pro models not to get this.
 
As the owner of a 15 Pro, I’m happy to hear this — although I’m assuming this is less of a “let’s throw a bone to the 15 Pro owners” and more of a “the adoption rate on this AI stuff is behind target, let’s get more people using Visual Intelligence so we can get more data for training and development.”
Great point!
Most likely the case indeed.
 
Not that I was one of the people who said that, but how do you know it's not a nerfed-down model, or a much more optimized model that they just finished cooking up? At the time, it probably was limited by hardware.
Because the major Visual Intelligence feature is not done on device but in the cloud, by other services.

[Attached screenshots: Goog.png, CGPT.png]


Other features like plant and pet ID have been part of Visual Look Up since iOS 15, on the iPhone X and up.

[Attached image: ios-15-visual-lookup.jpg]


Because if DeepSeek taught us anything, it's that we can get higher performance out of smaller models. Apple found a way to bring something to the 15 Pro that wasn't originally planned for it.
No, what DeepSeek has taught us is that with better training methods you can get similar performance using less hardware and less data to train. DeepSeek's SOTA models are a similar size to what Google and OpenAI have. Getting higher performance out of similarly sized or smaller models is not a new phenomenon.
I guess anything that shows APPLE IS BAD AND GREEDY, we'll just beat that drum.
We don't need to defend a multi-trillion-dollar corporation that was simply withholding a feature from iPhone 15 Pro users to market it exclusively to the iPhone 16 because of the camera button. Now that the 16e, a model without the button, has been released, they can bring the feature to the iPhone 15 Pro models. It's greed; that is literally the modus operandi for any corporation.
 
Because the major Visual Intelligence feature is not done on device but in the cloud, by other services.

Many recognition tasks are done on device, offline, and then ping the cloud for additional data, like from Foursquare.


Other features like plant and pet ID have been part of Visual Look Up since iOS 15, on the iPhone X and up.
You're not understanding the difference between Visual Look Up and Visual Intelligence. Visual Look Up looks up the category of the object. So if you point it at a plant, Visual Look Up on device will recognize that it's a plant. It then connects to the internet and uploads a picture to identify the plant.

Visual Intelligence, on the other hand, attempts to identify the plant on device. If it can't, it expands the task to the cloud.
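Purely as an illustration of the flow being described here, and not Apple's actual implementation, the on-device-first, cloud-fallback pattern looks roughly like the sketch below. The names classifyOnDevice and identifyInCloud and the confidence threshold are hypothetical placeholders, not real Apple APIs.

```swift
import Foundation

// Hypothetical result of an identification attempt.
struct Identification {
    let label: String
    let confidence: Double
}

// Hypothetical stand-in for a small, quantized on-device model.
// Not a real Apple API; returns nil when nothing is recognized.
func classifyOnDevice(_ image: Data) -> Identification? {
    Identification(label: "plant", confidence: 0.42)
}

// Hypothetical stand-in for a larger server-side model.
func identifyInCloud(_ image: Data) async -> Identification {
    Identification(label: "Monstera deliciosa", confidence: 0.93)
}

// The pattern being described: try locally first, escalate to the
// cloud only when the on-device result is missing or low-confidence.
func identify(_ image: Data) async -> Identification {
    if let local = classifyOnDevice(image), local.confidence >= 0.8 {
        return local                       // stays on device
    }
    return await identifyInCloud(image)    // cloud fallback
}
```

The threshold-based escalation is just one way such a fallback could be wired up; whether Apple's pipeline actually works like this is exactly what is being argued in this thread.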



No, what DeepSeek has taught us is that with better training methods you can get similar performance using less hardware and less data to train. DeepSeek's SOTA models are a similar size to what Google and OpenAI have. Getting higher performance out of similarly sized or smaller models is not a new phenomenon.

You're not understanding. It increased the accuracy of models at a given size, so that when you distill/quantize them you get roughly the same performance as the bigger non-DeepSeek models with lower computational and memory requirements.
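To make the quantization half of that point concrete, here is a minimal, self-contained sketch, not tied to any Apple or DeepSeek model, of symmetric 4-bit quantization: weights are rounded to 16 integer levels and reconstructed, trading a small accuracy loss for roughly a 4x memory reduction versus 16-bit floats.

```swift
import Foundation

// Toy weight vector standing in for one tensor of a model.
let weights: [Float] = [0.12, -0.57, 0.88, -0.03, 0.41, -0.99, 0.25, 0.66]

// Symmetric 4-bit quantization: signed levels in -8...7.
let bits = 4
let maxLevel = Float((1 << (bits - 1)) - 1)                // 7
let scale = (weights.map { abs($0) }.max() ?? 1) / maxLevel

// Quantize: round each weight to the nearest representable level.
let quantized: [Int8] = weights.map { Int8(($0 / scale).rounded()) }

// Dequantize: reconstruct approximate Float weights.
let reconstructed: [Float] = quantized.map { Float($0) * scale }

// Measure the average error the 4-bit representation introduced.
var totalError: Float = 0
for (original, approx) in zip(weights, reconstructed) {
    totalError += abs(original - approx)
}
print("scale:", scale)
print("levels:", quantized)
print("mean abs error:", totalError / Float(weights.count))
```

Real schemes (per-channel scales, mixed 2-/4-bit palettes, distillation-aware fine-tuning) are more elaborate, but the underlying trade-off is the same.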
 
There is no difference. Reviews have compared Visual Look Up and Visual Intelligence and they get the exact same results; there is no software change.
 
I literally explained the difference. Visual Look Up uses a limited set of predefined models to categorize objects. Visual Intelligence uses a generalized model and expands to the cloud when it needs a larger model to identify objects.
 
That's not how it works. There is no difference between the models behind Visual Intelligence and Visual Look Up; it's only the ability to share the image with ChatGPT and Google that is different. 🤦
 
🤦‍♂️
For one, Visual Look Up is not a real-time model. There's a reason the plant detection appears many seconds AFTER you've taken the photo.
 
Or Apple found a way to quantize the models with similar performance and is now bringing the feature to older devices, yet people still complain.
What are you talking about? The models were designed with the iPhone 15 Pro in mind already; both the A17 Pro and A18 Pro have the same 16 NPU cores and 35 TOPS. All the on-device performance data they provided was based on the models running on the iPhone 15 Pro.

The on-device model uses low-bit quantization already, a mix of 2-bit and 4-bit averaging 3.7 bits per weight. Apple's own technical paper says the lowest they can go is 3.5-bit, with some loss in quality. Quantization reduces precision for speed, and they are already at the lowest they can go while maintaining good results.
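Taking those figures at face value, the memory savings are easy to put in perspective. The sketch below assumes a roughly 3-billion-parameter on-device model purely to make the arithmetic concrete; that parameter count is my assumption, not something stated in this thread.

```swift
import Foundation

// Back-of-the-envelope weight-memory estimate.
// ASSUMPTION: ~3 billion parameters, chosen only for illustration.
let parameters = 3_000_000_000.0

func weightGigabytes(bitsPerWeight: Double) -> Double {
    parameters * bitsPerWeight / 8 / 1_000_000_000   // bits -> bytes -> GB
}

print(String(format: "16.0 bpw (fp16):  %.2f GB", weightGigabytes(bitsPerWeight: 16.0)))
print(String(format: " 3.7 bpw (mixed): %.2f GB", weightGigabytes(bitsPerWeight: 3.7)))
// Roughly 6.00 GB vs 1.39 GB: over a 4x reduction, which is why
// low-bit quantization matters within a phone's memory budget.
```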

They intentionally marketed the feature only for the iPhone 16 Pro to sell the camera button, but since they introduced the 16e without the camera button, now they want to enable the shortcut button for the 15 Pro. It is really that simple.

Many recognition tasks are done on device, offline, and then ping the cloud for additional data, like from Foursquare.
Let me address the screenshot you posted. In the video you got this screenshot from, Stephen Robles tells you that Visual Intelligence does not recognize that location until you move closer to it, which makes him believe it is using GPS and map data, not visual recognition. The whole video is just him stating how Visual Intelligence did not trigger anything, but ChatGPT and Google search worked better.

From that same video, Visual Intelligence is "recognizing" CVS and other businesses inside the Kaseya Center, which again means it's using GPS and not visual recognition. They are relying on databases they have in Maps and in the cloud for Visual Intelligence. The only things I will say work on device are optical character recognition functions like text recognition.

[Attached image: 2.png]


Apple tells you themselves that the images the iPhone uses to identify objects and places are not stored on device and are only shared with Apple to process what's in view, no doubt using their Private Cloud Compute.

[Attached image: 1.png]
 
What are you talking about? The models were designed with the iPhone 15 Pro in mind already; both the A17 Pro and A18 Pro have the same 16 NPU cores and 35 TOPS. All the on-device performance data they provided was based on the models running on the iPhone 15 Pro.

100% false. The A18 Pro is 15% faster at machine learning tasks than the A17 Pro. If it were the same, it wouldn't be faster.


The on-device model uses low-bit quantization already, a mix of 2-bit and 4-bit averaging 3.7 bits per weight.

Show me. Link me to evidence that discusses the models they used for Visual Intelligence. From my research, Apple has not talked about what quantization methods they used for the Visual Intelligence models specifically. It sounds like you're guessing, which would be just as correct or wrong as my guess about how they're bringing it to the 15 Pro, but feel free to provide evidence that shows otherwise.

They intentionally marketed the feature only for the iPhone 16 Pro to sell the camera button, but since they introduced the 16e without the camera button, now they want to enable the shortcut button for the 15 Pro. It is really that simple.

False. See reasons above.

Let me address the screenshot you posted. In the video you got this screenshot from, Stephen Robles tells you that Visual Intelligence does not recognize that location until you move closer to it, which makes him believe it is using GPS and map data, not visual recognition. The whole video is just him stating how Visual Intelligence did not trigger anything, but ChatGPT and Google search worked better.

You're missing the point. I'm showing you there's no ChatGPT logo on there. You argued that "the major Visual Intelligence feature is not done on device."

From that same video, Visual Intelligence is "recognizing" CVS and other businesses inside the Kaseya Center, which again means it's using GPS and not visual recognition. They are relying on databases they have in Maps and in the cloud for Visual Intelligence. The only things I will say work on device are optical character recognition functions like text recognition.

It falls back to GPS+cloud if it can't get visual recognition on device to work, obviously.


Apple tells you themselves that the images the iPhone uses to identify objects and places are not stored on device and are only shared with Apple to process what's in view, no doubt using their Private Cloud Compute.


Not sure how many times I have to say it. I'll write it in bold. Here's what I said:

"Visual Intelligence on the other hand attempts to identify the plant on device. If it can't, it expands the task to the cloud."
 