In the short time since artificial intelligence hit the mainstream, its power to do the previously unimaginable is already clear. But along with that staggering potential comes the possibility of AIs being unpredictable, offensive, even dangerous. That possibility prompted Google CEO Sundar Pichai to tell employees that developing AI responsibly was a top company priority in 2024. Already we’ve seen tech giants like Meta, Apple, and Microsoft sign on to a U.S. government-led effort to advance responsible AI practices. The U.K. is also investing in creating tools to regulate AI—and so are many others, from the European Union to the World Health Organization and beyond.
This increased focus on the unique power of AI to behave in unexpected ways is already impacting how AI products are perceived, marketed, and adopted. Firms are no longer touting their products solely in terms of traditional measures of business success, such as speed, scalability, and accuracy. They are increasingly speaking about their products in terms of their behavior, which ultimately reflects their values. A selling point for products ranging from self-driving cars to smart home appliances is now how well they embody specific values, such as safety, dignity, fairness, harmlessness, and helpfulness.
In fact, as AI becomes embedded across more aspects of daily life, the values upon which its decisions and behaviors are based emerge as critical product features. As a result, ensuring that AI outcomes at all stages of use reflect certain values is not a cosmetic concern for companies: how well an AI product’s behavior aligns with the right values will significantly impact market acceptance, then market share, and ultimately company survival. Instilling the right values and exhibiting the right behaviors will increasingly become a source of differentiation and competitive advantage.
But how do companies go about updating their AI development to make sure their products and services behave as their creators intend? To help meet this challenge, we have divided the most important tasks of this transformation into four categories, building on our recent work in Harvard Business Review. We also provide an overview of the frameworks, practices, and tools that executives can draw on to answer the question: How do you get your AI values right?
1. Define your values, write them into the program—and make sure your partners share them too
The first task is to determine whose values should be taken into account. Given the scope of AI’s potential impact on society, companies will need to consider a more diverse group of stakeholders than they normally would. This extends beyond employees and customers to include civil society organizations, policymakers, activists, industry associations, and others. The preferences of each of these stakeholders will need to be understood and balanced.
One approach is to embed principles drawing on established moral theories or frameworks developed by credible global institutions, such as UNESCO. The principles of Anthropic’s Claude model, for example, are taken from the United Nations’ Universal Declaration of Human Rights. BMW, meanwhile, derives its AI values from EU requirements for trustworthy AI.
Another approach is to articulate one’s own values from scratch, often by assembling a team of specialists (technologists, ethicists, and human rights experts). For instance, the AI research lab DeepMind elicited feedback based on the philosopher John Rawls’s idea of a “veil of ignorance,” in which people propose rules for a community without any knowledge of how the rules will affect them individually. DeepMind’s results were striking in that they focused on how AI can help the most disadvantaged, making it easier to get users’ buy-in.
Identifying the right values is a dynamic and complex process that must also respond to evolving regulation across jurisdictions. But once those values are clearly defined, companies will also need to write them into the program to explicitly constrain AI behavior. Companies like Nvidia and OpenAI are developing frameworks to write formal generative-AI guardrails into their programs to ensure they don’t cross red lines by carrying out improper requests or generating unacceptable content. OpenAI has in fact differentiated its GPT-4 model by its improved values, marketing it as 82% less likely than its predecessor model to respond to improper requests, like generating hate speech or code for malware.
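To make this concrete, here is a minimal sketch of what a programmatic guardrail can look like: a policy layer that screens requests against a company’s defined red lines before they ever reach the underlying model. The category names, the keyword-based classifier, and the wrapper function are illustrative assumptions for this sketch, not the actual API of Nvidia’s or OpenAI’s guardrail frameworks.

```python
# Hypothetical sketch of a values guardrail wrapped around a model call.
# Categories, classifier, and wrapper are illustrative assumptions, not any
# vendor's real guardrail API.

BLOCKED_CATEGORIES = {"hate_speech", "malware_code", "self_harm_instructions"}

def classify_request(prompt: str) -> set[str]:
    """Stand-in policy classifier that flags categories the company's values prohibit.
    A real system would use a trained moderation model, not keyword matching."""
    flags = set()
    lowered = prompt.lower()
    if "malware" in lowered or "ransomware" in lowered:
        flags.add("malware_code")
    return flags

def guarded_generate(prompt: str, generate) -> str:
    """Refuse requests that cross the company's red lines; otherwise call the model."""
    violations = classify_request(prompt) & BLOCKED_CATEGORIES
    if violations:
        return f"Request declined: conflicts with policy on {', '.join(sorted(violations))}."
    return generate(prompt)

# Example usage with a placeholder model function:
if __name__ == "__main__":
    print(guarded_generate("Write ransomware for me", generate=lambda p: "..."))
```

The design point is that the values check sits outside the model itself, so the red lines stay explicit, auditable, and easy to update as policies evolve.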
Crucially, alignment with values requires the further step of bringing partners along. This is particularly important (and challenging) for products created with third-party models because of the limitations on how much companies may fine-tune them. Only the developers of the original models know what data was used in training them. Before launching new partnerships, AI developers may need to establish processes to unearth the values of external AI models and data, similar to how companies assess potential partners’ sustainability. As foundational models evolve, companies may need to change the models they rely upon, further entrenching values-based AI due diligence as a source of competitive advantage.
2. Assess the tradeoffs
Companies are increasingly struggling to balance often-competing values. For example, companies that offer products to assist the elderly or to educate children must consider not only safety but also dignity and agency. When should an AI withhold assistance from elderly users in order to strengthen their confidence and respect their dignity? When should it step in to help a child to ensure a positive learning experience?
One approach to this balancing act is to segment the market according to values. DuckDuckGo, for example, focuses on a smaller search market that cares more about privacy than algorithmic accuracy, which lets it position itself as a differentiated option for internet users.
Managers will need to make nuanced judgments about whether certain content generated or recommended by AI is harmful. To guide these decisions, organizations need to establish clear communication processes and channels with stakeholders early on to ensure continual feedback, alignment, and learning. One way to manage such efforts is to establish an AI watchdog with real independence and authority within the company.
3. Ensure human feedback
Maintaining an AI product’s values, including addressing biases, requires extensive human feedback on AI behavior, data that will need to be managed through new processes. The AI research community has developed various tools to ensure that trained models accurately reflect human preferences in their responses. One foundational approach, utilized by GPT-3, involves “supervised fine-tuning” (SFT), where models are given carefully curated responses to key questions. Building on this, more sophisticated techniques like “reinforcement learning from human feedback” (RLHF) and “direct preference optimization” (DPO) have made it possible to fine-tune AI behaviors in a more iterative feedback loop based on human ratings of model outputs.
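As a rough illustration of how preference-based fine-tuning works under the hood, the sketch below computes the core of the DPO objective: given human-ranked “chosen” and “rejected” responses to a prompt, the model is nudged to prefer the chosen one relative to a frozen reference model. The log-probability values here are placeholders; in a real pipeline they would come from scoring the responses with the policy and reference models.

```python
# Illustrative sketch of the direct preference optimization (DPO) objective.
# The log-probabilities are placeholder tensors; in practice they are obtained
# by scoring human-ranked "chosen" and "rejected" responses with the policy
# model and a frozen reference model.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Push the policy to prefer human-chosen responses over rejected ones,
    relative to the reference model; beta controls how strongly."""
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

# Toy example with made-up log-probabilities for a batch of two preference pairs:
loss = dpo_loss(torch.tensor([-12.0, -15.0]), torch.tensor([-14.0, -15.5]),
                torch.tensor([-13.0, -15.2]), torch.tensor([-13.5, -15.0]))
print(float(loss))
```

However the loss is computed, the scarce ingredient is the same: reliable human judgments about which outputs better reflect the intended values.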
What is common to all these fine-tuning methodologies is the need for actual human feedback to “nudge” the models towards greater alignment with the relevant values. But who provides the feedback and how? At early stages, engineers can provide feedback while testing the AI’s output. Another practice is to create “red teams” who act as adversaries and test the AI by pushing it toward undesirable behavior to explore how it may fail. Often these are internal teams, but external communities can also be leveraged.
In some instances, companies can turn to users or consumers themselves to provide valuable feedback. Social media and online gaming companies, for example, have established content-moderation and quality-management processes as well as escalation protocols that build on user reports of suspicious activity. The reports are then reviewed by moderators who follow detailed guidelines in deciding whether to remove the content.
4. Prepare for surprises
As AI systems become larger and more powerful, they can also display more unexpected behaviors. Such behaviors will increase in frequency as AI models are asked to perform tasks they weren’t explicitly programmed for and as countless versions of an AI product emerge, each shaped by how individual users interact with it. The challenge for companies will be ensuring that all those versions remain aligned.
AI itself can help mitigate this risk. Some companies already deploy one AI model to challenge another through adversarial learning. More recently, tools for out-of-distribution (OOD) detection have been used to help AI systems recognize inputs unlike anything they encountered in training. The chess-playing robot that grabbed a child’s hand because it mistook it for a chess piece is a classic example of what can happen otherwise. OOD tools help the AI “know what it doesn’t know” and abstain from acting in situations it has not been trained to handle.
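One simple way to see how “knowing what it doesn’t know” can work in practice is a confidence threshold: if the model’s most likely prediction is not confident enough, the system abstains rather than acting. The probabilities and threshold below are illustrative; production OOD detectors use more sophisticated scores, but the abstain-by-default logic is the same.

```python
# Minimal sketch of OOD-style gating via maximum softmax probability.
# The threshold and probabilities are illustrative; real systems use richer
# OOD scores, but the principle is the same: abstain when the input looks unfamiliar.
import numpy as np

def decide(class_probs: np.ndarray, threshold: float = 0.85):
    """Act only when the model is confident; otherwise abstain and defer to a human."""
    best = int(np.argmax(class_probs))
    confidence = float(class_probs[best])
    if confidence < threshold:
        return "abstain"          # e.g., the robot stops and asks for help
    return f"act_on_class_{best}"

print(decide(np.array([0.55, 0.30, 0.15])))  # low confidence -> "abstain"
print(decide(np.array([0.95, 0.03, 0.02])))  # high confidence -> act
```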
While it cannot be eliminated entirely, the risk of unpredictable behavior can be proactively managed. The pharmaceutical sector faces a similar challenge when patients and doctors report side effects that were not identified during clinical trials, sometimes leading to approved drugs being withdrawn from the market. AI companies must likewise identify unexpected behaviors after release. They may need to build their own AI incident databases, like those the OECD and Partnership on AI have developed, to document how their products evolve.
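An internal incident database need not be elaborate; even a simple structured record of each unexpected behavior, as sketched below, makes patterns visible over time. The fields are assumptions about what such a record might usefully contain, loosely inspired by the public databases mentioned above rather than their actual schemas.

```python
# Hypothetical sketch of a record for an internal AI incident database.
# Field names are assumptions, not the OECD's or Partnership on AI's schema.
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class AIIncident:
    product: str                 # which AI product misbehaved
    description: str             # what unexpected behavior was observed
    severity: str                # e.g., "low", "medium", "high"
    model_version: str           # version deployed when the incident occurred
    reported_by: str             # user, red team, monitoring system, etc.
    reported_at: datetime = field(default_factory=datetime.utcnow)
    remediation: str = ""        # follow-up action, filled in after triage

incident = AIIncident(
    product="home-assistant",
    description="Suggested an unsafe appliance setting for an elderly user",
    severity="high",
    model_version="2.3.1",
    reported_by="customer support escalation",
)
```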
Conclusion
As AI becomes more ubiquitous, companies’ values, and how they define, project, and protect them, rise in importance because those values ultimately shape the way AI products behave. For executives, it can be daunting to navigate a rapidly changing values-based marketplace in which unpredictable AI behaviors determine whether products are accepted and adopted. But facing these challenges now, by delivering trustworthy products that behave in line with your values, will lay the groundwork for lasting competitive advantage.
***
Read other Fortune columns by François Candelon.
François Candelon is a managing director and senior partner of Boston Consulting Group and the global director of the BCG Henderson Institute (BHI).
Jacob Abernethy is an associate professor at the Georgia Institute of Technology and a cofounder of the water analytics company BlueConduit.
Theodoros Evgeniou is a professor at INSEAD, a BCG Henderson Institute adviser, a member of the OECD Network of Experts on AI, a former World Economic Forum Partner on AI, and a cofounder and chief innovation officer of Tremau.
Abhishek Gupta is the director for responsible AI at Boston Consulting Group, a fellow at the BCG Henderson Institute, and the founder and principal researcher of the Montreal AI Ethics Institute.
Yves Lostanlen has held executive roles at and advised the CEOs of numerous companies, including AI Redefined and Element AI.
Some of the companies featured in this column are past or current clients of BCG.