Who’s That Talking?

While many people have grown accustomed to hearing Alexa, Google Assistant, and Siri answer questions, recent advancements have put conversational agents within reach of almost any company.

As a result, computers are talking to us seemingly everywhere.

But how do these conversational agents actually work? Boston Consulting Group (BCG), ranked as a leader in AI services by Forrester Research, gave me a peek behind the curtain of GENE—the conversational agent it has built as a proof of concept for clients.

BCG’s experience is instructive for others who want a conversational agent of their own.

A conversational agent is an advanced language model designed to engage in human-like conversations and assist with various tasks. These AI agents utilize deep learning techniques, such as transformer-based neural networks, to understand and generate natural language in a coherent and contextually relevant manner. They are trained on vast amounts of diverse data, allowing them to possess a broad knowledge base and provide informative, creative, and helpful responses to user queries.

Conversational agents can be used for a wide range of applications, including customer support, virtual assistance, content creation, and more. As they continue to evolve and improve, these AI agents are becoming increasingly sophisticated in their ability to understand and respond to the nuances of human communication.

Many leading AI research organizations, including OpenAI, view conversational AI systems like chatbots as a crucial first step on the path to more advanced artificial intelligence, potentially culminating in superintelligence. These language models serve as a foundation for developing and refining natural language processing, knowledge representation, and contextual understanding.

By mastering chatbot technology, companies gain valuable experience in handling large-scale language models, addressing issues of bias and safety, and exploring the ethical implications of AI deployment. This expertise becomes increasingly relevant as AI capabilities progress through subsequent levels of reasoning, task execution, and creative problem-solving.

Proficiency in chatbot development and deployment is seen not just as an end in itself, but as a strategic investment in preparing for the next stages of AI evolution.

GENE’s Innovative Approach

GENE is part of BCG’s “GENE Operating Model” (GENEOM) platform. It was initially created to co-host a podcast called “Imagine This” but has since been adapted for various purposes, including synthesis and contextualization of BCG research.

What sets GENE apart is its innovative use of widening context windows and faster inference as alternatives to vector databases and Retrieval-Augmented Generation (RAG). Instead of relying on external databases to fetch information during a conversation, GENE leverages a large context window to incorporate extensive information directly into its prompt.

GENE is built on top of large language models—currently GPT-4o—with additional fine-tuning and customization. It uses a custom voice created with ElevenLabs, intentionally designed to sound somewhat robotic for transparency.

The system doesn’t use a vector database, instead relying on fine-tuning and a “prime directive”—a long prompt that defines its basic behavior and knowledge. This prompt can be nearly 200 pages of text, taking up about 80-90% of the currently available token limit in the AI’s context window. This approach allows GENE’s users to amend its knowledge base or instructions on the fly.

“In the prompt, we’ve fed it BCG research and proprietary BCG interviews that are not in the public domain,” said Paul Michelman, BCG’s head of content who initiated the project. An administrator controls the prompt and the agent through a simple interface.

Widening Context Windows vs. Vector Databases

By utilizing a widened context window, GENE eliminates the need for external retrieval systems like vector databases. This simplifies the architecture. As context windows widen and inference speeds up, the architecture promises to become increasingly popular.

The model can directly reference information within its prompt, providing more coherent and contextually relevant responses by keeping all information within the model’s context.

“We have a context prompt at the beginning, then we feed the entire conversation history in, and then we have a suffix prompt,” explains Bill Moore, the BCG technologist who built GENE. Moore added that the token limit has increased so much that GENE can now have hours of conversation in one session. The use of the prompt window also allows the GENEOM platform to spawn different versions of GENE for various tasks. Each version has a specific prompt tailored to its role, such as podcast co-host or expert on BCG content.

There are different versions like “podcast GENE” and “audiobook GENE,” each with distinct personalities and knowledge bases. The team can adjust GENE’s “temperature” setting to control how creative or strict its responses are.

GENE is fed information using XML tags to structure content like BCG publications, podcast summaries, and expert interview transcripts.

Faster Inference and Practical Applications

The advancements in inference speed enable the practical use of larger context windows without significant performance hits. Optimization techniques and efficient transformer architectures make it feasible to process extensive prompts quickly.

For now, GENE is primarily used for co-hosting podcasts, synthesizing BCG content, participating in live events, demonstrating BCG’s AI capabilities, and assisting with internal brainstorming. BCG is also exploring using conversational AI in content development, such as interviewing BCG partners to create article outlines.

This experimental project leverages BCG’s extensive intellectual property and knowledge base, enabling GENE to engage users, answer questions, and elucidate complex concepts by drawing from BCG’s vast repository of research and expertise.

Future Implications and Ethical Considerations

The ability of AI agents like GENE to handle routine, data-intensive tasks allows human executives to focus on strategic decision-making and innovative thinking. BCG envisions a future where AI will enhance many of the executive roles in Fortune 500 companies.

They highlight the potential for enhanced decision-making through AI’s rapid data processing and round-the-clock operation, while also acknowledging the challenges of balancing machine logic with human intuition and addressing ethical concerns.

By relying on internal prompts and controlled knowledge bases, GENE’s approach can mitigate some risks associated with external data sources, such as data privacy and bias. However, achieving this vision requires careful consideration of computational resources required for larger context windows and ensuring data freshness compared to dynamic retrieval systems.

BCG has actively showcased GENE’s capabilities and potential impact through various initiatives, including their podcast series where GENE co-hosts episodes that explore future scenarios involving AI integration, such as the role of AI bots in the C-suite working alongside CEOs.

In the lead-up to the 2024 World Economic Forum in Davos, BCG Global Chair Rich Lesser and GENE provided insights on key issues that CEOs were contemplating, demonstrating the practical applications of conversational AI in high-level business discussions.

More recently, BCG launched the “CEO Digest” podcast, featuring a monthly “conversation” between Michelman and GENE on current topics for the CEO.

Overall, GENE represents BCG’s exploration into Generative AI applications, serving both as a functional tool for content creation and synthesis, and as a showcase of the company’s capabilities in this emerging field.

By leveraging widened context windows and faster inference, GENE offers a compelling alternative to traditional methods like vector databases and RAG, simplifying the architecture and enhancing performance.

“If you don’t build it, you don’t learn anything,” said technologist Moore. “Now you know how to build the next thing.”

Share.
Exit mobile version