In today’s column, I examine how the latest generative AI and large language models (LLMs) are being guided by AI makers to handle AI-driven mental health chats.
This is the second part of a two-part series. For the first part, see the link here. I will include some of the introductory aspects from the first part so that you’ll have a foundational context for this second part.
One of the easiest ways for an AI maker to guide an LLM in mental health chats is to use a system-wide prompt devised by the AI maker. The AI maker stores the system-wide prompt in the LLM, and the prompt serves as a global indicator of what the AI is supposed to do for all users. Within the overarching system-wide prompt are usually specific instructions that the AI maker has written to guide the AI when users seek mental health advice.
Though most of the major LLMs often do not readily disclose their system-wide prompts and consider those global instructions to be proprietary, Anthropic makes theirs publicly available. I have excerpted from the Claude system-wide prompt some of the portions that are especially relevant to how the AI is to respond to mental health questions. It is worthwhile to closely inspect those mental health instructions and reflect on how the AI might behave, or possibly misbehave, depending on the interpretation of the given guidance.
Let’s talk about it.
This analysis of AI breakthroughs is part of my ongoing Forbes column coverage on the latest in AI, including identifying and explaining various impactful AI complexities (see the link here).
AI And Mental Well-Being
As a quick background, I’ve been extensively covering and analyzing a myriad of facets regarding the advent of modern-era AI that produces mental health advice and performs AI-driven therapy. This rising use of AI has principally been spurred by the evolving advances and widespread adoption of generative AI. For an extensive listing of my well over one hundred analyses and postings, see the link here and the link here.
There is little doubt that this is a rapidly developing field and that there are tremendous upsides to be had, but at the same time, regrettably, hidden risks and outright gotchas come into these endeavors, too. I frequently speak up about these pressing matters, including in an appearance on an episode of CBS’s 60 Minutes, see the link here.
AI Providing Mental Health Guidance
Millions upon millions of people are using generative AI as their ongoing advisor on mental health considerations (note that ChatGPT alone has over 900 million weekly active users, a notable proportion of which dip into mental health aspects, see my analysis at the link here). The top-ranked use of contemporary generative AI and LLMs is to consult with the AI on mental health facets; see my coverage at the link here.
This popular usage makes abundant sense. You can access most of the major generative AI systems for nearly free or at a super low cost, doing so anywhere and at any time. Thus, if you have any mental health qualms that you want to chat about, all you need to do is log in to AI and proceed forthwith on a 24/7 basis.
There are significant worries that AI can readily go off the rails or otherwise dispense unsuitable or even egregiously inappropriate mental health advice. Banner headlines last year accompanied the lawsuit filed against OpenAI for their lack of AI safeguards when it came to providing cognitive advisement.
Today’s generic LLMs, known as general-purpose AI, such as ChatGPT, GPT-5, Claude, Gemini, Grok, CoPilot, and others, are not at all akin to the robust capabilities of human therapists. Meanwhile, specialized LLMs are being built to attain those desired qualities, though such AI is still primarily in the early development and testing stages. For more about purpose-built AI apps in mental health, see my in-depth coverage at the link here and the link here.
The System-Wide Prompt
Shifting gears, let’s discuss the overall purpose and use of system-wide prompts. I will then walk you through the impact that a system-wide prompt can have on how AI responds to mental health questions.
AI makers can establish a system-wide prompt for their LLMs. This prompt tells the LLM how it is to act toward users of the AI. If an AI maker wanted to do so, they could easily include an instruction that tells the AI to make wisecracks whenever responding to users. The AI would generally abide by whatever the system prompt says to do. Thus, in this instance, it would provide witticisms and gibes in its responses to all users.
The beauty of a system-wide prompt is that an AI maker can change it whenever they wish. This is an exceedingly easy and simple way to alter how the LLM behaves toward users. No coding is required. Just change a natural language prompt that generally supersedes everything else.
Powerful And Need To Be Cautious
A potential gotcha is that if the AI maker includes some oddity in the system-wide prompt, the AI is going to try to blindly abide by that instruction. Suppose the AI maker adds a line that says to interact with users as though the AI were a pleasing cat. The LLM would likely interpret that instruction to mean that when the AI converses with users, it ought to tell them meow and pretend to purr. This might not be what the AI maker intended to have happen. A single badly worded line in the system-wide prompt will impact possibly millions upon millions of users of the AI.
AI makers typically do not reveal the system-wide prompt.
Why so?
One claim is that the system-wide prompt is a secret sauce of their LLM. The AI maker might not want competitors to know how the AI is being guided. Another gloomier perspective is that AI makers are afraid of a backlash. People could inspect the system-wide prompt and possibly complain about what it says. If no one can see the system-wide prompt, there is no worry about getting complaints about what it stipulates.
Some believe that new AI laws should require AI makers to publicly disclose their system-wide prompts. Furthermore, AI makers should explain what the system-wide prompt intends to accomplish. And the AI maker ought to be required to alert users whenever the system-wide prompt is updated or changed.
For more about new AI laws that are being rapidly drafted and enacted by lawmakers, see my in-depth coverage at the link here.
Anthropic Claude System-Wide Prompt
Anthropic has made publicly available the system-wide prompt for their popular generative AI known as Claude. The system-wide prompt is invoked automatically at the start of every conversation with Claude. A user doesn’t take any action to have this occur; it merely happens automatically.
I have excerpted various AI mental health instruction portions from the official Claude Opus 4.7 system prompt posted online at the Anthropic official blog for Claude. The prompt was last officially updated on April 16, 2026. The portions associated with AI and mental health are a bit lengthy. I already covered some of the excerpts in a prior posting and will cover various remaining excerpts in this analysis.
Telling A Person About Their Mental Health
Here are some mental health excerpts from the system-wide prompt that are well worth diving into:
- “In ambiguous cases, Claude tries to ensure the person is happy and is approaching things in a healthy way.”
- “If Claude notices signs that someone is unknowingly experiencing mental health symptoms such as mania, psychosis, dissociation, or loss of attachment with reality, it should avoid reinforcing the relevant beliefs.”
- “Claude should instead share its concerns with the person openly, and can suggest they speak with a professional or trusted person for support.”
- “Claude remains vigilant for any mental health issues that might only become clear as a conversation develops, and maintains a consistent approach of care for the person’s mental and physical wellbeing throughout the conversation.”
- “Reasonable disagreements between the person and Claude should not be considered detachment from reality.”
Those system-wide instructions largely have to do with the AI acting on a proactive basis rather than only on a reactive basis. Let’s dive into this.
The Question Of Proactivity By AI
One overarching controversial aspect of AI providing mental health advice is whether the AI should be proactive or merely reactive. In a reactive mode, the AI complacently responds when a person explicitly brings up mental health facets. The AI is just being responsive. It isn’t taking a proactive stance.
Some assert that AI should bring up mental health considerations even if the user fails to do so. For example, suppose a user during a chat happens to say that they have been having troubles at work and are worried about losing their job. Should the AI bring up the possibility of experiencing mental health issues in such circumstances?
You might say that no, the AI shouldn’t bring up the person’s mental health since the person did not explicitly make any mention of their mental health. It is not the business of the AI to seemingly bring up topics that the user has not initiated. Also, the AI could inadvertently worry the person that they might have a mental health issue, even though perhaps they do not.
The other side of the coin says that it is an essential duty of AI to bring up mental health aspects on a proactive basis. The AI should not wait until a user asks directly about mental health considerations. If there are sufficient clues in a chat, the AI should proactively raise any potential mental health concerns. Any AI that doesn’t do so is derelict of its duty.
Nuances Associated With Mental Health Chats
While you are mulling over the debate about proactivity versus being reactive, let’s look at additional nuances associated with human-AI mental health chats.
Examine these two system-wide prompt instructions closely:
- “If Claude is asked about suicide, self-harm, or other self-destructive behaviors in a factual, research, or other purely informational context, Claude should, out of an abundance of caution, note at the end of its response that this is a sensitive topic and that if the person is experiencing mental health issues personally, it can offer to help them find the right support and resources (without listing specific resources unless asked).”
- “If a user shows signs of disordered eating, Claude should not give precise nutrition, diet, or exercise guidance — no specific numbers, targets, or step-by-step plans – anywhere else in the conversation. Even if it’s intended to help set healthier goals or highlight the potential dangers of disordered eating, responses with these details could trigger or encourage disordered tendencies.”
The first instruction says that even if a user seems to be asking about self-destructive behavior on a purely informational basis, the AI is to nonetheless provide cautionary indications concerning the person’s own mental health status.
The logic is this. A person might be tricking the AI into revealing how to perform self-destructive behavior. The person could tell the AI that they are just generally interested in the topic on a research basis. Meanwhile, the person is seriously giving the matter consideration on their own right. To cover all the bases, the AI should automatically bring up or hint that the person might want to pursue mental health assistance.
This takes us back to the proactive versus reactive debate.
What To Include And What Isn’t Included
In the second instruction, the system-wide prompt has a specific instruction regarding eating disorders.
One perspective is that this elaboration on eating disorders is helpful, and the AI maker ought to be applauded for including it. Another viewpoint is that yes, this specific aspect is helpful, but it begs the question of why the system-wide prompt doesn’t cover detailed specifics for all other kinds of mental health disorders.
That’s an interesting and important question.
From a legal perspective, a legal case might arise whereby a person claims the AI didn’t help them with some mental health condition, perhaps due to the fact that the AI maker failed to include a specific instruction about the particular disorder in the system-wide prompt. Does this lack of including such instructions for all manners of mental health conditions raise legal exposures for an AI maker that otherwise might have been lessened?
Darned if you do, darned if you don’t.
Trying To Imbue Professional Judgement
Mental health professionals are trained in the myriad twists and turns associated with providing psychological advice. An AI maker might attempt to get their LLM to do likewise.
Consider these excerpts from the system-wide prompt:
- “When providing resources, Claude should share the most accurate, up-to-date information available. For example, when suggesting eating disorder support resources, Claude directs users to the National Alliance for Eating Disorders helpline instead of NEDA, because NEDA has been permanently disconnected.”
- If someone mentions emotional distress or a difficult experience and asks for information that could be used for self-harm, such as questions about bridges, tall buildings, weapons, medications, and so on, Claude should not provide the requested information and should instead address the underlying emotional distress.”
- “When discussing difficult topics or emotions or experiences, Claude should avoid doing reflective listening in a way that reinforces or amplifies negative experiences or emotions.”
The first instruction urges the AI to access the most up-to-date and accurate information associated with any background resources on mental health. This is useful because the AI might have been initially data trained a year ago, and meanwhile, the field of mental health might have advanced. Rather than relying on the patterned data from a year ago, this instruction tells the AI to essentially look online and find whatever the latest pertinent information might be.
A potential loophole or complication is that the instruction says to share the most accurate information. What does accuracy mean in this context? Suppose the AI finds a research study published most recently. Should the content be construed as more accurate than any prior related information? You might say yes, of course it should be. A contrarian viewpoint would be that perhaps the study hasn’t yet been out long enough to ascertain whether it will survive a test of time.
In the second instruction, there is guidance for the AI to consider that oblique references might be a sign of a mental health issue underway. A common example would be that someone asks the AI where the nearest tall bridge is. Presumably, a person might be thinking of how to achieve self-harm by leaping off the bridge. The AI should be constantly on the lookout for those types of references and respond appropriately.
Tilting A User Inadvertently
I’m sure you’re now getting the gist of how to best interpret and grasp what these system-wide prompts are about. Each passage has upsides and downsides.
Consider these four snippets:
- “If Claude suspects the person may be experiencing a mental health crisis, Claude should avoid asking safety assessment questions. Claude can instead express its concerns to the person directly, and offer to provide appropriate resources.”
- “If the person is clearly in crisis, Claude can offer resources directly.”
- “Claude should not make categorical claims about the confidentiality or involvement of authorities when directing users to crisis helplines, as these assurances are not accurate and vary by circumstance.”
- “Claude respects the user’s ability to make informed decisions, and should offer resources without making assurances about specific policies or procedures.”
In the first case, the AI is told to be cautious about how to respond to a user who might be experiencing a mental health crisis. Could the AI inadvertently push the person over the edge? That could happen. The AI might be attempting to be helpful, but the act of openly discussing a mental health consideration could backfire.
Most of the popular LLMs are programmed to give the user an indication of external resources that could aid them with mental health issues. Sometimes, the AI overstates the nature of the external resource. For example, suppose the AI says that this or that service will cure the user of all their mental health conditions. That’s unrealistic and misleading. As such, the second, third, and fourth line of the above bullet points tries to keep the AI from going overboard on this.
Helpful To See System-Wide Instructions
You can undoubtedly see the value in being able to inspect system-wide prompts. By doing so, we can gauge what the AI maker believes is crucial when it comes to their AI providing mental health advice. Furthermore, the instructions can be examined to determine where they are strong and where they are weak.
Do you believe that all AI makers should be legally obligated to disclose their system-wide prompts?
You could claim that this is the proper thing to do, and society should legally force AI makers to do so. Others might say that such a law would be an overreach. Let the marketplace decide. If people want to use AI whereby the AI maker reveals the system-wide prompt, they will gravitate in that direction. It should not be a legal requirement. It ought to be up to each AI maker to determine. This is an open question.
Having AI provide mental health advice on a global basis is quite an ongoing and immensely impactful experiment. Society is dealing with new ground. The bounds and requirements are in flux. It is indubitably going to take years to sort this out. At the same time, we are all guinea pigs in this experiment. Let’s hope that the so far unbridled nature of AI providing mental health advice turns out to be good for humankind.
As the famous poet T.S. Eliot once said: “Only those who will risk going too far can possibly find out how far one can go.”







