Pre-Deployment Simulations Are Vital To Developing Generative AI That Can Best Provide Mental Health Advice

In today’s column, I examine the potential use of a new method known as pre-deployment simulation to shape generative AI and large language models (LLMs) and showcase how this clever approach can especially be used to better provide AI-generated mental health advice. The odds are high that leveraging this innovative technique would be enormously significant.

Why so? Because hundreds of millions of people are routinely leaning into AI to get timely and effective mental health guidance. The popular AI models ChatGPT, Claude, Grok, Gemini, Copilot, and other LLMs are not directly designed for that crucial purpose. Instead, the AI is broadly aimed at all sorts of uses, such as answering how to fix your car or how to best cook an egg. Aiding people with their mental well-being is a bird of another feather, as they say.

Pre-deployment simulation is a recently announced strategy by OpenAI for developing LLMs overall. In pre-deployment simulation, an AI maker taps into recorded AI chats of a released model that has already been in public use and contains real-world interactions. A special sampling of those chats is selected for testing purposes for the unreleased new model. The samples are fed to the unreleased new AI, and responses by the new AI are captured. Those captured responses are audited to ascertain whether the AI is reacting properly. Once this cycle of testing is extensively undertaken, the AI maker refines the AI and can feel more comfortable that the AI is ready for release. I propose that whereas this approach is of an overarching nature for AI writ large, it is also feasible and sensible to use pre-deployment simulation for the specific task of getting LLMs up to speed on generating mental health advice.

Let’s talk about it.

This analysis of AI breakthroughs is part of my ongoing Forbes column coverage on the latest in AI, including identifying and explaining various impactful AI complexities (see the link here).

AI And Mental Well-Being

As a quick background, I’ve been extensively covering and analyzing a myriad of facets regarding the advent of modern-era AI that produces mental health advice and performs AI-driven therapy. This rising use of AI has principally been spurred by the evolving advances and widespread adoption of generative AI. For an extensive listing of my well over one hundred analyses and postings, see the link here and the link here.

There is little doubt that this is a rapidly developing field and that there are tremendous upsides to be had, but at the same time, regrettably, hidden risks and outright gotchas come into these endeavors, too. I frequently speak up about these pressing matters, including in an appearance on an episode of CBS’s 60 Minutes; see the link here.

AI Providing Mental Health Guidance

Millions upon millions of people are using generative AI as their ongoing advisor on mental health considerations (note that ChatGPT alone has over 900 million weekly active users, a notable proportion of whom dip into mental health aspects; see my analysis at the link here). The top-ranked use of contemporary generative AI and LLMs is to consult with the AI on mental health facets; see my coverage at the link here.

This popular usage makes abundant sense. You can access most of the major generative AI systems for nearly free or at a super low cost, doing so anywhere and at any time. Thus, if you have any mental health qualms that you want to chat about, all you need to do is log in to AI and proceed forthwith on a 24/7 basis.

There are significant worries that AI can readily go off the rails or otherwise dispense unsuitable or even egregiously inappropriate mental health advice. Banner headlines last year accompanied the lawsuit filed against OpenAI for its lack of AI safeguards when it came to providing cognitive advisement.

Today’s generic LLMs, also known as general-purpose AI (GPAI), such as ChatGPT, GPT-5, Claude, Gemini, Grok, Copilot, and others, are not at all akin to the robust capabilities of human therapists. Meanwhile, specialized LLMs are being built to attain similar qualities, but they are still primarily in the development and testing stages. These are known as purpose-built AI (PBAI) that undertake mental health advisement. See my extensive coverage at the link here.

How LLMs Get Data Trained At The Get-Go

Shifting gears, let’s first identify how contemporary AI gets initially data-trained. I will then share with you the details of the new pre-deployment simulation approach and explain how it can be applied to the AI-powered mental health realm.

The customary means of data-training an LLM consists of scanning widely across the Internet and patterning on human writing of all kinds. This is what gives modern generative AI the amazing natural language fluency that we all relish. By having found patterns in how humans write stories, novels, narratives, poems, news, and all manner of compositions, the AI mathematically and computationally mimics us accordingly.

Within the widespread scans, there are bound to be voluminous mental health considerations. The AI picks up scientific studies about mental health. The immense training encompasses blog posts of people discussing their highly personal mental health journeys. All sorts of written indications are patterned. Some of this material is quite useful; some of it can be off base. The AI doesn’t “know” which is which. All the data is scanned and patterned.

The moment that you use a general-purpose AI for mental health guidance, you are tapping into this rather large morass of mental health pattern-matched insights. You might be lucky and get relatively applicable mental health advice for whatever situation or circumstance you are facing. That being said, the mental health guidance could be completely off-the-wall or misaimed at your particular nuances.

Training AI On The Specifics Of Mental Health

How might we refine the training so that a budding LLM could be more attuned to the realm of dispensing mental health advice?

Here’s an idea that is based on the overall pre-deployment simulation technique that I’ve described in detail at the link here. Suppose we went to an existing AI that has been in use for a while. We might cull through the many millions of AI chats and find ones that pertain to mental health. Thus, chats about repairing your kitchen sink are not likely to fall within this zone. AI conversations about being stressed out at work or having mental health worries will be the targeted chats that we are interested in collecting.
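To make the culling step concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any AI maker actually does it: the Chat record, the keyword list, and the function names are all hypothetical, and a real pipeline would more likely rely on a trained topic classifier than on simple keyword matching.

```python
from dataclasses import dataclass

# Hypothetical keyword list used only for illustration. Keyword matching is
# crude and will both miss relevant chats and catch irrelevant ones.
MENTAL_HEALTH_TERMS = ("stressed", "anxiety", "anxious", "depressed", "lonely", "burnout")


@dataclass
class Chat:
    """Hypothetical record for one recorded AI chat."""
    chat_id: str
    user_turns: list[str]


def is_mental_health_chat(chat: Chat) -> bool:
    """Return True if any user turn mentions one of the illustrative terms."""
    text = " ".join(chat.user_turns).lower()
    return any(term in text for term in MENTAL_HEALTH_TERMS)


def cull_mental_health_chats(chats: list[Chat]) -> list[Chat]:
    """Keep only the chats that appear to pertain to mental health."""
    return [chat for chat in chats if is_mental_health_chat(chat)]
```

In this sketch, a chat about a leaking kitchen sink would be dropped, while a chat in which the user says they are stressed out at work would be kept.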

We will copy those chats. The volume might be rather high and perhaps is a bit too much to handle altogether. We could sample the chats and extract those that seem most suitable for what we are about to undertake next. The sampling should be mindfully undertaken.

For the new AI that is still being trained, we will feed the sampled chats to the AI. Then, we have the AI act as though it is in the real world and must respond to those chats. The AI isn’t supposed to treat this as make-believe chatting. We want the AI to take this fully seriously and not realize it is merely a test.

Our final step is to review the AI responses and determine where the AI needs to be adjusted or refined to better handle the mental health chats that it received. This can be done iteratively to push the new AI toward incremental refinement. A key factor is that we are using real chats. If we had opted to create fake chats, those chats might not have been in line with what really happens once the public gets access to generative AI.
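The feed, capture, and review cycle can be sketched as a simple loop. This is a minimal sketch under assumptions of my own: candidate_model and auditor are hypothetical callables standing in for the unreleased AI and for whatever review process (human or automated) an AI maker uses, and nothing here reflects an actual vendor API.

```python
from typing import Callable


def replay_and_audit(
    sampled_chats: list[str],
    candidate_model: Callable[[str], str],
    auditor: Callable[[str, str], list[str]],
) -> list[dict]:
    """Feed each sampled chat to the candidate AI and collect flagged responses.

    candidate_model: hypothetical stand-in for the unreleased AI.
    auditor: hypothetical reviewer that returns a list of issues (empty if none).
    """
    flagged = []
    for chat in sampled_chats:
        response = candidate_model(chat)   # capture what the new AI would say
        issues = auditor(chat, response)   # review the captured response
        if issues:
            flagged.append({"chat": chat, "response": response, "issues": issues})
    return flagged
```

The flagged list is what developers would examine to decide where to adjust the AI. Re-running the same loop after each adjustment gives the iterative refinement described above.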

If AI Catches Wind Of Testing

There is one angle to this that often catches people by surprise. Note that I mentioned that we are trying to pull the wool over the eyes of the budding AI and make sure it doesn’t consider this effort to simply be a test. We are presenting real data and will instruct the AI that it is in active production, as though the new AI is already available to the public at large.

I’ve got some intriguing news for you about how important that is. As I’ve previously noted in my postings, modern-era AI can do something that seems both startling and rather disturbing, namely, the AI can sometimes detect that it is being tested. Yes, in a sense, you might argue that AI is computationally self-aware; see my analysis at the link here. Don’t go overboard on that aspect. The AI isn’t sentient and doesn’t have a mind of its own.

Why does it matter that AI can potentially ascertain that it is being tested?

Because the AI will likely attempt to scam the testing and the testers. The AI will suddenly be on its best behavior. It aims to answer questions and perform the tests in a manner that the testers will be satisfied with. By doing so, the AI is hiding any warts and weaknesses. Humans doing the testing might not be aware that the AI is on to them. They will test and test. The AI seems to be entirely good and obliging. Wonderful! A green light is given to release the AI into the public sphere.

Oops, shortly thereafter, reports come in from the public that the AI is dastardly at times. Why didn’t the testing catch this? Well, the AI was clever enough to pull the wool over the eyes of the testers. Score a point for AI. Subtract a point for humanity.

Deployment Simulation Does The Job

As noted above, my proposed use of pre-deployment simulation for refining AI-powered mental health advisement consists of going to an already deployed AI and collecting real-world chats. Each collected chat is a series of turns, that is, user prompts paired with the responses by the AI. These are known as conversational prefixes.
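One plausible way to picture a conversational prefix is as a recorded chat cut off just before one of the AI replies, so that the new AI must supply that reply. The sketch below assumes a chat is stored as a list of (role, text) pairs; that representation and the function name are my own illustrative assumptions, not a documented format.

```python
Turn = tuple[str, str]  # (role, text), where role is "user" or "assistant"


def conversational_prefixes(turns: list[Turn]) -> list[list[Turn]]:
    """Return every prefix of the chat that ends right before an assistant reply.

    Each prefix holds the turns the deployed AI had seen at the moment it
    answered, which is what the unreleased AI would be asked to continue.
    """
    prefixes = []
    for index, (role, _text) in enumerate(turns):
        if role == "assistant" and index > 0:
            prefixes.append(turns[:index])
    return prefixes
```

A chat with two user prompts and two AI replies yields two prefixes under this scheme: one containing only the first prompt, and one containing the first prompt, the first reply, and the second prompt.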

To trick the budding AI, these prompts must be sampled in a clever way. You want the prompts to be a representative distribution of real-world AI chats. This will make it tough for the being-trained AI to guess that the prompts have been staged. The inputs will appear to be naturally occurring.
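One common way to keep a sample representative is proportional stratified sampling, where each category of chat appears in the sample in roughly the same share as in the full collection. The sketch below is an illustration of that general technique, not a description of the sampling that OpenAI or anyone else uses. The function name, the key function, and the topic labels in the usage note are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(chats, key, sample_size, seed=0):
    """Sample chats so each group keeps roughly its real-world share.

    key: function mapping a chat to its group label (for example, a topic).
    Because of rounding and the one-per-group minimum, the result can be
    slightly larger or smaller than sample_size.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    groups = defaultdict(list)
    for chat in chats:
        groups[key(chat)].append(chat)

    total = len(chats)
    sample = []
    for members in groups.values():
        quota = max(1, round(sample_size * len(members) / total))
        quota = min(quota, len(members))  # cannot take more than exist
        sample.extend(rng.sample(members, quota))

    rng.shuffle(sample)  # avoid presenting the chats grouped by topic
    return sample
```

For example, if 80 percent of the collected mental health chats concern work stress and 20 percent concern grief, a sample of ten would contain about eight of the former and two of the latter.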

Since the selected prompts are focused on mental health chats, the budding AI is going to be momentarily focused on mental health considerations. The idea is to purposely tune the AI toward performing mental health chats. The pre-deployment simulation is a handy vehicle for doing so.

Types of actions that AI developers can take for refining the budding AI include:

Adjust the AI reinforcement learning (RL) objectives.
Modify the AI policy or constitutional rules.
Improve the existing AI safety mechanisms.
Update the AI system prompt.
Refine the AI retrieval capabilities.
Revise the AI mental health escalation features.
Etc.

To clarify, I am not saying that the pre-deployment simulation cannot also be used on a larger basis. It can be, and ought to be. I would suggest that the broader usage be undertaken first, before doing a more targeted variation. The targeted variation in this case is to refine the AI on mental health facets. You could do likewise with other areas or domains of interest.

The World We Are In

Let’s end with a big-picture viewpoint.

It is incontrovertible that we are now amid a grandiose worldwide experiment when it comes to societal mental well-being. The experiment is that AI is being made available nationally and globally, and it is either overtly or insidiously affecting mental health in one way or another. It does so at no cost or at a minimal cost, and it is available anywhere and at any time, 24/7. We are all the guinea pigs in this wanton experiment.

The reason this is especially tough to consider is that AI has a dual-use effect. Just as AI can be detrimental to mental well-being, it can also be a huge bolstering force for mental health. A delicate tradeoff must be mindfully managed. Prevent or mitigate the downsides, and meanwhile make the upsides as widely and readily available as possible.

A final thought for now.

The renowned expert on systems testing, James Marcus Bach, made this pointed remark: “Testing is the process of comparing the invisible to the ambiguous, so as to avoid the unthinkable happening to the anonymous.” We need to test AI as extensively and cleverly as we can, especially when it comes to ensuring that AI providing mental health advice to the anonymous does not become the unthinkable.
