AI generative image models are trained on billions of images, and the weights in the models reflect it. One of the biggest problems is the art datasets used in training: the human figure in a painting is far from anatomically perfect. There are other technical reasons too.

The models have problems with things that can appear in innumerable positions. Shoelaces and hands can be arranged in countless ways that cannot be predicted, so errors occur in the output. It's easy to predict where arms are when they hang at the side of the body, but ask these image generators to produce images of people walking on their hands or waving their arms in the air and the results are super bad.

But they also have problems with compositionality. Try these simple strings of coloured primitives, in any order, and the output is guaranteed not to be satisfactory:

'A yellow triangle inside a green square inside a red circle inside a blue hexagon'

'A yellow pyramid on top of a red ball on top of a green cube'

Or any combination of the above. A child could draw these in a few minutes, but these systems using hundreds of GPUs struggle badly.
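If you want to test a generator systematically rather than by hand, a tiny helper can enumerate every ordering of those primitives. This is a hypothetical snippet of my own, not part of any generation library:

```python
from itertools import permutations

def nested_prompt(shapes):
    """Build a nested-primitive prompt from (colour, shape) pairs,
    innermost first, e.g. 'A yellow triangle inside a green square...'."""
    return "A " + " inside a ".join(f"{c} {s}" for c, s in shapes)

shapes = [("yellow", "triangle"), ("green", "square"),
          ("red", "circle"), ("blue", "hexagon")]

print(nested_prompt(shapes))
# one prompt per ordering of the same four primitives (24 total)
prompts = [nested_prompt(list(p)) for p in permutations(shapes)]
```

Feeding all 24 variants to a model makes the failure rate obvious in a way a single prompt doesn't.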
 
And it's not just that... it's also the willingness of some companies to throw R&D projects into the public spotlight. Google does this all the time. Apple does not, so it will always seem like they're behind or not even working on something.

The big problem that will make AI hypers swallow their words and facepalm over their own enthusiasm will be:

Theocrats training systems purely on ancient and medieval literature to enforce ancient and medieval laws.

Then all those grifters hyping AI will have egg on their faces and lose all their friends.
 
Depends on what you are using. Stable Diffusion, DreamBooth, and other open-source generative models have all these problems. Midjourney has come a long way from V3 to V4. If Apple had to buy, it should just grab Midjourney.
pix2pix and ControlNet are recent additions that have enhanced generative AI.
Use custom-trained hypernetworks, or models trained on top of the default ones, and it gets a lot better. They are GANs which fix some of these issues. Another option is to generate the image at a lower resolution, then upscale it automatically. Low-res generation plus upscaling is a great approach.
Another reason for custom models: a studio with a certain art style doesn't care about the rest; they want a consistent style.
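The low-res-then-upscale workflow needs a generation size the model will accept (diffusion models typically require dimensions divisible by 8). A small hypothetical planner, assuming a fixed upscale factor; the function name and defaults are my own, not any library's API:

```python
import math

def plan_two_stage(target_w, target_h, scale=4, multiple=8):
    """Pick a low-res generation size, rounded up to `multiple`
    (diffusion models typically require this), such that upscaling
    by `scale` reaches at least the requested final size."""
    base_w = math.ceil(target_w / (scale * multiple)) * multiple
    base_h = math.ceil(target_h / (scale * multiple)) * multiple
    return (base_w, base_h), (base_w * scale, base_h * scale)

# e.g. a 2048x2048 target: generate at 512x512, then upscale 4x
base, final = plan_two_stage(2048, 2048)
```

Generating small and upscaling is cheaper per attempt, which matters when you have to generate many images to get a few good ones.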
 
Google abandons pretty much everything. The other problem Google has in generative AI is the datasets used for training, drawn from commercial images. Obviously, anyone who has used good image models knows how bad the quality can be; you have to generate hundreds of images to get a few good ones.
Apple doesn't do beta testing on a nascent technology.
 

There's no point in buying what can be built in-house. There's no secret to how these things are built.

There is a major problem people have not been talking about, because the whole conversation is Anglo-centric: everyone is speaking from an Anglo-centric point of view.

We live in a world with hundreds of languages, but these models are trained primarily on English.

To train the models, and these so-called 'AI search engines', on all languages would consume an ungodly amount of compute resources, especially the search-engine-style systems that would have to be fed news articles all day long.

That creates a kind of arms race between English-language models and Chinese-language models, because the US and China have the money to do it, while all other languages get only leftover compute power. China, of course, is being locked out of chips and fabs as part of this arms race.

The poorer your country is, the less likely the models will be trained with your language in mind.

It's a very classist and elitist use of compute resources.

The internet is supposed to be about equality, but tech bros are making it more and more elitist. They appear to believe the richest people should have the most powerful tools and the richest web experiences.
 
There aren't Italian, Chinese, French, or other localized-language vision/image models. CLIP makes this easier: you can keep the same trained vision encoder and add a BERT-style training step for a localized language. ChatGPT is different because it was trained in English, but nothing prevents applying translation after the fact.
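The translate-then-generate idea is really just glue code. In this sketch, `translate` and `generate` are stand-in callables for whatever backends you would plug in; the toy dictionary is illustrative only:

```python
def localized_generate(prompt, src_lang, translate, generate):
    """Run a non-English prompt through an English-trained model
    by translating it first. `translate` and `generate` are
    placeholders, not a real library API."""
    english = prompt if src_lang == "en" else translate(prompt, src_lang, "en")
    return generate(english)

# toy backends for illustration only
toy_dict = {"un gatto rosso": "a red cat"}
translate = lambda text, src, dst: toy_dict.get(text, text)
generate = lambda text: f"<image of {text}>"

print(localized_generate("un gatto rosso", "it", translate, generate))
```

The weakness, as noted below in the thread, is that any translation error compounds with the model's own errors.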
 

ChatGPT does support some extra languages, though oddly it is wrong when I ask it whether it supports other languages. It says it doesn't, and then I speak to it in another language and it responds in that language. So crazy.

That raises a further problem. ChatGPT and other LLMs are often wrong about many things, often contradict themselves, and even apologize to you when you point it out. Being wrong and then perhaps inaccurately translating itself into another language could mean two layers of errors.

Then there's the ugliness of AI-generated spam.

Scam and spam ads with cringy, tacky, weird images and messed-up text are already appearing everywhere, because now spammers don't need to have a real product or hire real models. A few clicks is all they need, and Google will be happy to sell them the ad space, because Google is an irresponsible organization that doesn't give AF.



Apple's profits come primarily from hardware, but they do use software to sufficiently differentiate their hardware and sell them at a handsome margin, with services coming in to further sweeten the deal.

I will argue that Apple sells experiences, made possible by their control over hardware, software and services, to the point where it's difficult to tell where one ends and where the other begins. And that's what people pay for. An integrated computing experience that just works out of the box.

Perhaps this is why nobody has been able to disrupt Apple, despite offering products that may seem superior to Apple's hardware or software on paper, because Apple is competing in neither directly. Nobody has been able to replicate the unique Apple experience (much less surpass it), because to do so they would need to reproduce the entire Apple ecosystem. It was too much for Microsoft and even Google, and it will be too much for any new entrant in the market today.

Agreed. I am of the opinion that a lot of this initial enthusiasm regarding AI chatbots deserves to have a great deal of cold water poured on it, not least because I feel there is a complete lack of design and human thinking regarding language models being used to push information to us. In short, I feel that Google and Microsoft are going about it the wrong way, making the classic error of putting too much emphasis on the underlying tech and not enough on the use case (which is the same problem facing folding phones and why they have yet to take off).

I can't quite put my finger on it yet, but I think that as more time goes on, we will start to see more holes in this whole AI narrative.
On your points I agree - most critically, Apple sells an experience, both initially and continually, with any of their products and services.

I still stand by my view that they're a software company first, one that uses their carefully crafted hardware to envelop the user in a great experience. iTunes and Safari for Windows were really incredible, and each had VERY minimal bugs even by today's standards - I can still install iTunes from almost six years ago on Windows 11 (three OS versions past its original release) and it'll sing!

Regarding the AI, I appreciate your thoughts. Quite interesting. Maybe Apple is looking beyond chatbots as a poor experience and looking for a more holistic experience across their products and services. I'm just livid because the hardware has been there and their assistant had a leapfrog head start that was squandered. They had years before John joined to rebuild it from scratch; hopefully something insanely great debuts soon enough.
 
Just like any other evolving technology, it gets better every day. Like I said, most of this AI is now in its infancy, but it will disrupt the way things are done.
AI doesn't have to be perfect; it's another tool that can have great benefits as it evolves.

Ahh. That's what I've seen in a Netflix special about AI being dangerous due to the image sets provided by the coders. The AI can be dangerous or highly limited, since the majority of coders across the globe do NOT represent the vastness of people in the world.

One example was an AI for law enforcement based on such an image set. The images of a party showed only Caucasian people gathering together with smiles, and no equivalent party images from other cultures were provided, so the AI was flawed in that respect. I'm not sure how advanced or adaptive AI is, whether it can infer specifics and make big enough adjustments to distinguish the people at the Jan 6 event last year from people at a protest ;)
 
I just hope those channels where AI is at its most hyperbolic at the moment are treated much, much better than NFTs were this time around.
 
I find these AI chatbots unsettling and some of their responses downright creepy. I’m fine with Apple staying away from creating one.
In time it could massively improve Siri. The 'chatbot' form as it is now is silly. Look at Coda to see where Apple could integrate AI in apps if they were a software company.
 

With limits. They cannot be trained to always get better. Stephen Wolfram wrote a great article about this, and it applies to both language models and image generators.


Or put another way, there’s an ultimate tradeoff between capability and trainability: the more you want a system to make “true use” of its computational capabilities, the more it’s going to show computational irreducibility, and the less it’s going to be trainable. And the more it’s fundamentally trainable, the less it’s going to be able to do sophisticated computation.

The only people saying machine learning will ultimately be trainable to do anything and have seemingly unlimited powers are non-engineers and people like Altman who are hoping investors throw billions of dollars at them. It's a big grift that solves far fewer problems than they claim and creates dozens of new ones.
 
Ross Douthat, writing for the New York Times, warns that "aside from minor questions like whether rogue A.I. might wipe out the human race", AI carries other risks: "this kind of creation would inevitably be perceived as a person by most users, even if it wasn't one", and it could become "a place where an entire civilization could easily get lost."

So between OpenAI's ChatGPT, Microsoft's Sydney, Google's Bard, and Apple's souped-up Siri or Siri-enhanced Safari browser, AI may one day have the power to:
1. Terminate us like Skynet
2. Control us like the Matrix
3. Delude us with false truths.

I think that last possibility is the most dangerous.
Yea I think 3 is the gateway to 1 or 2
 

Imagine someone with their head stuck in VR being fed generative images and text all day. This person will become a complete vegetable, easily controlled, never using their own brain or having contact with reality except when they need to open the door for the drone-delivered junk food.
 
Heck if you had one of those fancy locks, you could give the drone the code and it could bring it right inside.
 
Several comments here say this new AI stuff is a fad.
It isn't a fad at all. This is a disruption, and Apple needs a plan. I just heard a college kid the other day talk about how she uses her AI bot all the time. The new AI knows when to use emojis like people do, and it can write papers. It's kinda scary what it can do, and it's just in its infancy.
 
Yes, maybe a unique search engine in the future that is different from anything now. Imagine not needing Microsoft or Google search engines any more.
It's not about search; it's about conversation with a ChatGPT-like AI-powered Siri and Spotlight instead of the current stupid, infuriating things.
 
I'm using AI, i.e. the Stable Diffusion WebUI, on a Mac, but it's definitely slower than an RTX 30 series card.

First of all, the Mac is far behind in 3D and AI work, which is where Nvidia dominates. Superior unified memory doesn't help when the GPU itself is so slow; the M1 Max, for example, is slower than a laptop RTX 3060. Yes, Apple is doing its own machine learning, but Nvidia has had better hardware and software for a long time. Almost all AI services are Nvidia GPU based.

Apple isn't really leading AI/machine learning, and their GPUs aren't great to use. Besides, they only support a single graphics card, so I don't think they will ever do serious AI/machine learning unless they make an upgradable Apple Silicon Mac Pro.
1. Your RTX 30-equipped notebook can't run at full speed for more than an hour.
2. An AI-powered Siri on iPhone/iPad, and on a tethered Apple Watch, will be much more convenient, and thus more useful, than a MacBook or any notebook.
 
I have my reasons for not sharing everyone's optimism here with regards to generative AI (and it's not just because Apple doesn't seem to be getting involved with it), and it goes all the way back to 2016 when people claimed that voice computing (ie: smart speakers powered by Alexa or Google Assistant) would make up a sizeable part of computing devices.

We all know how that turned out, and Apple has not been any worse off for not having a competitor to the Amazon Echo (which famously continues to cost Amazon money).

At the end of the day, people just don't want to have conversations with their computers. Also, I expect a lot of problems to arise as ChatGPT use becomes more widespread through Microsoft, and maybe the best play is to sit back and watch the competition implode as the issues pile up.

I am also curious as to how Google plans to monetise their Bard offering, but maybe that's another conversation for another day.
 

This is the main point. By the time you have written a bunch of prompts and instructions and then edited and fixed the output, you could have just written the email or article yourself. It would have given your brain a workout, and if we don't give our brains a workout, it is a simple fact that we will become dumber with age. The brain is like a muscle and needs to be maintained.

Speaking to a computer is also a problem. Nobody really wants to talk out loud in front of a computer like in the movies, and nobody around you wants to listen to you speaking to a computer. They will tell users to stop annoying them with the noise.

In the creative fields there are some uses, because machine learning, algorithms, and procedural routines have been used there for a long time, but only untalented, unskilled, inexperienced AI boosters think creatives will be replaced. It's an utterly uninformed idea. The reality is that whenever new automating tools increase our efficiency, we just end up with more work on our hands, and that means more manual work too.
 
Check with Gurman, I’m sure he’ll share plenty in his next newsletter…
There is nothing to share because it's nothing consumer-facing anyway.

Apple holds their AI summit every year; it's nothing special. This year it was all about ML initiatives, with not even a trace or mention of anything in ChatGPT territory. Don't get your hopes up.
 
Totally false. The Apple Silicon Mac is just slower to use. Full speed doesn't really matter. I guess you don't even have an RTX 30 laptop after all.
 
Running an LLM on device for Siri would be a game changer, but it will take a while until local inference is possible. Maybe Apple will come up with a way of using flash storage for the model (something ChatGPT-class would likely need around 170GB of RAM, which Apple just cannot put in a phone anytime soon). Low latency, offline capability, and privacy would differentiate it from the rest.
The world moved faster than I anticipated: Meta today released an LLM that one might run on an iPad Pro (the 16GB RAM model).
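Those RAM figures are easy to sanity-check with a back-of-envelope calculation, counting only the dense weights and ignoring activations and runtime overhead; the 175B and 7B parameter counts and quantization levels below are illustrative assumptions:

```python
def weights_ram_gib(params_billion, bytes_per_param):
    """Approximate memory needed just to hold the model weights, in GiB.
    Ignores activations, KV cache, and framework overhead."""
    return params_billion * 1e9 * bytes_per_param / 2**30

# GPT-3-class model (175B params) at fp16 (2 bytes/param): ~326 GiB
gpt3_fp16 = weights_ram_gib(175, 2)

# 7B-parameter model at 4-bit quantization (0.5 bytes/param): ~3.3 GiB,
# which would plausibly fit alongside the OS on a 16GB iPad Pro
small_4bit = weights_ram_gib(7, 0.5)
```

This is why quantization and smaller models, rather than bigger phone RAM, are the realistic route to on-device inference.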

 