It’s pointless and so is the sleep score BS. Wildly inaccurate and a complete waste of time.
Apple's sleep score is far from perfect, but it's not bad. It's also quite helpful within-person, for tracking your own sleep over time.

Edit: I keep meaning to write up an analysis of the sleep score and offer suggestions to Apple about how to improve it. I'm not a "sleep researcher" but have been on various peer-reviewed articles looking at sleep in different populations.
 
ChatGPT also kept forgetting basic information about him, including his gender and age, despite it having full access to his records.
That's because it didn't have his TikTok information 😛


Both companies say their health tools aren't meant to replace doctors or provide diagnoses.
Then what's the point of the service if it cannot accurately analyze the data? What use does it serve if it gets things wrong?
 
OpenAI announced the launch of ChatGPT Health, a dedicated section of ChatGPT where users can ask health-related questions [...]. Both companies say their health tools aren't meant to replace doctors or provide diagnoses.

Then what are they for? More importantly, if they are not intended to provide at least semi-actionable insights, which again requires them to be accurate, why do they need user data to begin with?
 
How many billions of dollars did it take for them to get ChatGPT to learn to count the number of letters in a word? 😆
A lot, although that’s not the only thing the tool can do.

Some people buy a $3000 MacBook Pro and then mostly just use it for media consumption and light browsing or other tasks. Other people use it to get a lot of computationally intensive work done. The same thing goes for LLMs. Some of us use them to get our work done much more efficiently.
 
You'd have to be crazy to link a black box 3rd party LLM to your health data.

LLMs are the future, but they are still basically in a 'public alpha' stage due to the immense cost of developing them, i.e. they need subscriptions and, soon, advertising.

Oh, and a race is on, of course.

Would I use a Health+ service powered by Gemini, though? Yes, I would, because I would trust that Apple has tuned it correctly and that my data is secure (although we have no way of knowing whether that's true - we'd just have to trust Apple).
 
A lot, although that’s not the only thing the tool can do.

Some people buy a $3000 MacBook Pro and then mostly just use it for media consumption and light browsing or other tasks. Other people use it to get a lot of computationally intensive work done. The same thing goes for LLMs. Some of us use them to get our work done much more efficiently.

You misunderstand my cynicism.

I know it's not the only thing it can do, but the fact that counting the letters in a word was such a massive hurdle for them, and that it took this long to make it work, tells me the "I" in AI is a lie. It can regurgitate what it's trained on with a fair degree of accuracy, but the lack of the basic intelligence needed to count the letters in a word tells me it's not very smart.
 
I suppose that's one problem with an LLM relying on data from doctors who don't even agree. Heart disease, cholesterol, insulin resistance, inflammation, etc. probably have some of the most passionate, yet conflicting, MD opinions out there.
 
Sleep tracking causes more anxiety than good. I stopped using it when I had a good night's sleep and it gave me a Regular score just because I went to bed an hour later.
Same here. I noticed that using sleep tracking, especially after they tweaked it for 'greater accuracy', was making me more anxious. So I turned it off.
 
One important thing that makes Google competitive is data. An LLM is nothing without data, and no company in the world has so much data about humanity. So, to compete with Google, you need to harvest data.
 
Just wait until OpenAI tacks on enough cost to make a profit. Then you’ll get to pay for useless summaries!
 
It can't. They just trained it on the strawberry question. I just asked ChatGPT the following:

how many of the letter 'r' are in 'cranberry'?

There are 2 letter **“r”**s in “cranberry.” 🍒
This is the problem I see with AI today. If your cranberry question becomes as widespread as the strawberry question was 2 months ago, models will be trained to answer this question correctly. It doesn't seem to be able to answer correctly without human intervention, though.
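For contrast, the same question is trivial for ordinary, deterministic code, which sees characters rather than tokens. A minimal sketch (the function name is just illustrative):

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())

print(count_letter("cranberry", "r"))   # 3, not the 2 the chatbot claimed
print(count_letter("strawberry", "r"))  # 3
```

The point of the contrast: a one-line string operation gets this right every time, while an LLM answers probabilistically from token patterns, which is why its accuracy on the question can drift between model versions.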
 
Errr.... NO! Your job, as a regulatory body, is to REGULATE.

They may as well just tell us all how much they've been paid to let this nonsense slide at this point.
This is what happens when you put reality TV personalities and Faux News anchors in charge of government services.
ChatGPT 5.2 says: "There are 3 letters “r” in strawberry."
Just as O'Brien got Winston to see 5 fingers instead of 4, they can manipulate AI to include as many "r's" in strawberry as they want. You won't get the truth using AI; you'll get their truth.

Until we can make an AI that learns through experience--like we do--rather than by being fed reams of data, AI won't be truly useful. The quality of the AI we have now is reliant on the data it's fed. As the old programming adage goes: garbage in, garbage out.
 
There's a lot of criticism, but I'd like to see the actual data that was fed into ChatGPT.
Garbage in = garbage out in most cases.

If the data contained poor sleep scores (we all know how reliable those are), or years of missing data because someone upgraded devices, or the goalposts moved on what counts as good data and bad data, then this could well have thrown off ChatGPT. E.g., if that dump included data saying the person only slept 2h a night for 5 years, all of that has an impact.

Anyone who works with data knows it has to be cleaned and refined before processing.
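The cleaning step described above can be sketched in a few lines; this is a hypothetical illustration, and the field names and plausibility thresholds are assumptions, not anything from Apple Health's actual export format:

```python
def clean_sleep_records(records):
    """Drop missing or physiologically implausible sleep entries
    before they can skew any downstream analysis."""
    cleaned = []
    for rec in records:
        hours = rec.get("sleep_hours")
        if hours is None:                 # gap, e.g. tracker not worn or device upgrade
            continue
        if not 1.0 <= hours <= 16.0:      # implausible values, e.g. sensor glitches
            continue
        cleaned.append(rec)
    return cleaned

raw = [
    {"date": "2025-01-01", "sleep_hours": 7.5},
    {"date": "2025-01-02", "sleep_hours": None},  # tracker not worn
    {"date": "2025-01-03", "sleep_hours": 0.2},   # sensor glitch
]
print(clean_sleep_records(raw))  # keeps only the 7.5h record
```

Even a crude filter like this prevents a decade-long dump full of gaps and glitches from dragging an average down to "2h a night".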

Would I still like to see a real doctor? Yes.
At the same time, we're all going to use AI to get some sort of baseline and self-assess, and I'm sure those gaps will be closed in time, after the beta.
 
Not surprising. AI has got to improve a lot, and privacy issues are also present. I don't think anyone should use AI to get crucial health-related information. Maybe in the future it will be better than it is today.
 


A reporter for The Washington Post has put ChatGPT's new optional Apple Health integration feature to the test by feeding it ten years of their Apple Watch data. The results were not encouraging, to say the least.


Earlier this month, OpenAI announced the launch of ChatGPT Health, a dedicated section of ChatGPT where users can ask health-related questions completely separated from their main ChatGPT experience. For more personalized responses, users can connect various health data services such as Apple Health, Function, MyFitnessPal, Weight Watchers, AllTrails, Instacart, and Peloton.

ChatGPT Health can also integrate with your medical records, allowing it to analyze your lab results and other aspects of your medical history to inform its answers to your health-related questions.

With this in mind, reporter Geoffrey Fowler gave ChatGPT Health access to 29 million steps and 6 million heartbeat measurements from his Apple Health app, and asked the bot to grade his cardiac health. It gave him an F.

Feeling understandably alarmed, Fowler asked his actual doctor, who in no uncertain terms dismissed the AI's assessment entirely. His physician said Fowler was at such low risk for heart problems that his insurance likely wouldn't even cover additional testing to disprove the chatbot's findings.

Cardiologist Eric Topol of the Scripps Research Institute was likewise unimpressed with the large language model's assessment. He called ChatGPT's analysis "baseless" and said people should ignore its medical advice, as it's clearly not ready for prime time.

Perhaps the most troubling finding, though, was ChatGPT's inconsistency. When Fowler asked the same question several times, his score swung wildly between an F and a B. ChatGPT also kept forgetting basic information about him, including his gender and age, despite it having full access to his records.

Anthropic's Claude chatbot fared slightly better – though not by much. The LLM graded Fowler's cardiac health a C, but it also failed to properly account for limitations in the Apple Watch data.

Both companies say their health tools aren't meant to replace doctors or provide diagnoses. Topol rightly argued that if these bots can't accurately assess health data, then they shouldn't be offering grades at all.

Yet nothing appears to be stopping them. The U.S. Food and Drug Administration earlier this month said the agency's job is to "get out of the way as a regulator" to promote innovation. An agency commissioner drew a red line at AI making "medical or clinical claims" without FDA review, but ChatGPT and Claude argue they are just providing information.

"People that do this are going to get really spooked about their health," Topol said. "It could also go the other way and give people who are unhealthy a false sense that everything they're doing is great."

ChatGPT's Apple Health integration is currently limited to a group of beta users. Responding to the report, OpenAI said it was working to improve the consistency of the chatbot's responses. "Launching ChatGPT Health with waitlisted access allows us to learn and improve the experience before making it widely available," OpenAI VP Ashley Alexander told the publication in a statement.

Article Link: ChatGPT's Apple Health Integration Flaws Exposed in New Report
I've used Grok several times to review test results and treatment recommendations from my oncologist and others. In virtually every case, Grok delivered a detailed, fully customized summary of my reports in plain English. I wouldn't make treatment decisions based on AI just yet, but Grok has been a great tool to help me understand the medical jargon I've had to review. Like any AI, though, it's "buyer beware" regarding its feedback.
 
Oh yes, bring us more articles about how software fails during beta testing.

Also, the guy who got an F against the opinion of his doctor and insurer? Watch for the headline that he dies of a heart attack unexpectedly. I'd put the combined wisdom of a chatbot up against the average capabilities of a typical internist any day.
 