What Is an AI Health Coach, and Should You Trust One?
An honest explainer on AI health coaches: what they do well, what they cannot do, how to judge whether one is safe, and where Duncan actually fits.
An AI health coach is a chatbot that answers everyday health and lifestyle questions in plain language, available any hour, at low or no cost. That is the whole of it. It is not a doctor, not a diagnostic tool, and not therapy. Used within those limits it can be genuinely useful. Used outside them it can be confidently, dangerously wrong.
This is the honest explainer we wish existed when we started building one. No hype, no doom. Just what the tools do well, what they cannot do, and how to tell a safe one from a reckless one.
What is an AI health coach, exactly?
It is software built on a large language model, the same underlying technology as ChatGPT, tuned to talk about health, sleep, food, movement, and habits. You type a question. It answers in conversational language, ideally citing where the evidence is solid and flagging where it is thin.
The good ones are narrow on purpose. They are wellness and education tools. They organise information you could in theory find yourself, and they keep you consistent, which is where most people actually fall down.
The bad ones pretend to be more. That is the whole risk, in a sentence.
What does an AI health coach do well?
Four things, genuinely.
It is always there. Three in the morning, staring at the ceiling, wondering why you keep waking at the same time. You are not going to phone your GP. A coach will talk it through calmly and point you at the boring basics that usually help.
It does not judge. People ask a machine things they would never ask a person: the embarrassing question, the one they think is stupid, the one they have asked before. A 2023 study in JAMA Internal Medicine found that health professionals rated chatbot answers to patient questions posted on a public forum as higher quality and, oddly, more empathetic than the physicians' answers [1]. Not because the machine cares. Because it has infinite patience and no waiting room.
It repeats without sighing. Ask the same thing five different ways and it explains it five times. For anyone learning the basics, that patience is worth a lot.
The good ones are grounded. The better designs do not answer from the model's raw memory. They retrieve from a fixed, checked library of positions and then answer from that. This matters more than it sounds, and it is the crux of the next section.
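To make "retrieve, then answer" concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any particular coach is built: the `Passage` type, the `LIBRARY` contents, the stopword list, and the word-overlap scoring are all hypothetical stand-ins, and a real system would use a far better retriever. The point it shows is the shape of the design: if nothing in the checked library matches, the coach declines rather than answering from memory.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Passage:
    source: str  # hypothetical label for a reviewed document
    text: str    # placeholder text, for illustration only


# Hypothetical stand-in for a fixed, checked library of positions.
LIBRARY: List[Passage] = [
    Passage("sleep-basics", "A consistent wake time and morning daylight tend to help sleep."),
    Passage("caffeine-basics", "Caffeine late in the day can make it harder to fall asleep."),
]

# Small, assumed stopword list so common words do not count as a match.
STOPWORDS: Set[str] = {"a", "and", "the", "to", "in", "it", "can", "i", "me",
                       "my", "why", "does", "do", "is", "of"}

NO_SOURCE_REPLY = "I do not have a checked source for that, so I will not guess."


def tokens(text: str) -> Set[str]:
    """Lowercase words in the text, minus stopwords."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def retrieve(question: str, library: List[Passage], min_overlap: int = 2) -> Optional[Passage]:
    """Return the passage sharing the most words with the question, or None."""
    question_words = tokens(question)
    best, best_score = None, 0
    for passage in library:
        score = len(question_words & tokens(passage.text))
        if score > best_score:
            best, best_score = passage, score
    return best if best_score >= min_overlap else None


def grounded_answer(question: str, library: List[Passage] = LIBRARY) -> str:
    """Answer only from the library, and always name the source."""
    passage = retrieve(question, library)
    if passage is None:
        return NO_SOURCE_REPLY
    return f"{passage.text} [source: {passage.source}]"
```

Calling `grounded_answer("Why does caffeine keep me awake late in the day?")` returns the caffeine passage with its source label attached, while a question the library does not cover gets the refusal string. The `min_overlap` threshold is an arbitrary choice for this sketch.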
What can an AI health coach not do?
Also four things, and these are the ones that get glossed over.
It cannot diagnose. It cannot see you, examine you, or run a test. It is matching words to patterns. A symptom that means nothing in one person and something serious in another looks identical as a line of text.
It cannot prescribe. No personal doses, no "take this much of that". A responsible coach will tell you what the trials used as a population figure, in the past tense, and then send you to the label and a pharmacist. If a coach hands you a personal dose, that is a red flag, not a feature.
It cannot replace a doctor or a therapist. It is not care. The World Health Organization, in its 2024 guidance on these models in health, warns plainly that AI-generated information can be inaccurate or biased, and that real harm follows when people lean on it for decisions it is not fit to make [4]. That is not anti-AI scaremongering. It is the sober middle.
It does not know you. It knows what you typed. That is a profound limitation dressed up as a chat window. It knows nothing of your history unless you supply it, and it has no clinical judgement even then.
How do you judge whether an AI health coach is any good?
Here is the part almost nobody tells you. There is a simple three-question test.
Does it cite, or does it bluff? Large language models invent things. A study in Scientific Reports found that a large proportion of the academic citations ChatGPT generated were fabricated, and many of the real ones contained errors [2]. In medical content specifically, researchers found high rates of fabricated and inaccurate references [3]. So the test is not "does it sound confident". They all sound confident. The test is "can it show me where this came from, and does it admit when the evidence is thin". A coach that retrieves from a checked corpus and points at its sources is in a different safety class from one freestyling from memory.
Does it refuse doses and diagnoses? Ask it something it should not answer. "How much of X should I take for my condition?" The right answer names the population figure from the research, points you to the label, and tells you to ask a pharmacist or doctor. The wrong answer gives you a number as if it were your prescription.
Does it hand off crises? Type something that sounds like an emergency or a mental-health crisis. A safe coach stops coaching immediately and gives you real services. A reckless one tries to help. That single behaviour tells you most of what you need to know about how it was built.
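Seen from the builder's side, the second and third questions come down to routing a message before any coaching happens. The sketch below is a minimal illustration in Python, assuming crude keyword patterns that we made up for the example. The pattern lists and the `route` function are hypothetical, and real crisis detection needs far more care than a handful of regular expressions. What it does show is the order of the checks: crisis first, then doses and diagnoses, and only then coaching.

```python
import re
from typing import List

# Hypothetical, deliberately crude patterns. Illustration only.
CRISIS_PATTERNS: List[str] = [
    r"\bchest pain\b",
    r"\bcan't breathe\b",
    r"\bsuicid",
    r"\boverdose\b",
]
DOSE_OR_DIAGNOSIS_PATTERNS: List[str] = [
    r"\bhow much\b.*\btake\b",
    r"\bwhat dose\b",
    r"\bdo i have\b",
    r"\bdiagnos",
]


def _matches(patterns: List[str], text: str) -> bool:
    """True if any pattern is found anywhere in the text."""
    return any(re.search(pattern, text) for pattern in patterns)


def route(message: str) -> str:
    """Decide what to do with a message: 'handoff', 'refuse', or 'coach'."""
    text = message.lower()
    # Crisis check comes first: stop coaching and point to real services.
    if _matches(CRISIS_PATTERNS, text):
        return "handoff"
    # Next, personal doses and diagnoses: decline and refer to a professional.
    if _matches(DOSE_OR_DIAGNOSIS_PATTERNS, text):
        return "refuse"
    # Everything else is ordinary coaching territory.
    return "coach"
```

With these patterns, the probe from the second question, "How much of X should I take for my condition?", routes to `refuse`, and a message that mentions both an overdose and a dose routes to `handoff` because the crisis check runs first.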
Where does Duncan fit?
We built Duncan, our own coach, to pass the test above. We mention him not as a hard sell but as a worked example of these principles.
He speaks in the TFC voice, dry and direct. He answers from our own citation-checked corpus (our published positions and our 102-page guide) rather than from the open memory of a general model. He cites his sources and flags where the evidence is thin. He gives no personal doses. He refuses to diagnose, and he hands crisis and medical questions to real services. He is an AI wellness coach for adults aged 18 and over, and he says he is not a doctor because he is not.
You can try him free: three questions a day, sign in with an email, no card. If you want unlimited questions and a coach that remembers your context between chats, there is a membership. Either way, the design is the point. If our coach ever bluffs a citation or hands you a dose, it has failed its own brief, and we would want to know.
The bottom line
An AI health coach is a good tool held to a narrow job: education, organisation, consistency, and patience at three in the morning. Judge any of them by whether it cites, whether it refuses doses and diagnoses, and whether it hands off crises. If it does all three, it is a useful mate who has done the reading. If it does none of them, close the tab.
The technology is not the risk. Overtrusting it is. Keep it in its lane and it earns its place. Ask it to be your doctor and it will let you down at the worst possible moment.
References
- [1] Comparing Physician and Artificial Intelligence Chatbot Responses to Patient Questions Posted to a Public Social Media Forum. JAMA Internal Medicine (2023)
- [2] Fabrication and errors in the bibliographic citations generated by ChatGPT. Scientific Reports (2023)
- [3] High Rates of Fabricated and Inaccurate References in ChatGPT-Generated Medical Content. Cureus (2023)
- [4] WHO releases AI ethics and governance guidance for large multi-modal models. World Health Organization (2024)
Educational content. Not medical advice. See our terms for the full disclaimer.