AI solutions, and Generative AI in particular, are making their way into all types of products and services that we use daily. The organizations that hold the greatest market share in the models underpinning these everyday applications, known as Foundation Models, will hold the keys to the kingdom in the fast-paced gold rush that is Generative AI.
Foundation models becoming the new AI platform
Foundation models are not AI models built to handle a single task, as has been the case for previous models. Foundation models have been trained on such a large corpus of data that they are broadly applicable to a very wide range of tasks, and they perform those tasks better than most single-purpose models. For example, the foundational Large Language Models (LLMs) that are so popular today can perform virtually any natural language processing (NLP) task, not just a single one: sentiment analysis, chatbots and conversational systems, Q&A over documents, document analysis, and even OCR of images. One big model can do all these things, which has made foundation models amazingly compelling and pushed Generative AI to the forefront of every AI conversation today.
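To make this concrete, here is a minimal sketch of one hosted model handling several distinct NLP tasks purely through prompting. It assumes the OpenAI Python client and an API key in the environment; the model name and prompts are illustrative, not recommendations.

```python
# One foundation model, several NLP tasks, no task-specific training.
# Assumes: pip install openai, and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask(prompt: str) -> str:
    """Send a single prompt to the hosted model and return its reply."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative; any capable chat model works
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

review = "The battery life is terrible, but the screen is gorgeous."

# Task 1: sentiment analysis
print(ask(f"Classify this review as positive, negative, or mixed: {review}"))

# Task 2: Q&A over a document
print(ask(f"Based only on this review, what does the customer like? {review}"))

# Task 3: summarization
print(ask(f"Summarize this review in five words: {review}"))
```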
The most popular foundation models are rapidly becoming household names: OpenAI’s GPT models, and especially the ChatGPT application built on top of them, Google’s Gemini model, Anthropic’s Claude, and a raft of others. Many of these models, including those just mentioned, are proprietary in that the underlying training data, the trained model parameters, the model tuning settings and hyperparameters, and other details are closely held and not revealed.
Things can go wrong when you embed someone else’s AI models in your systems, especially when you have little control over or visibility into those models. Yet despite the increasingly walled gardens of proprietary large language models, organizations aren’t fully aware of the potential drawbacks of using non-open machine learning models. After all, most organizations and individuals are still quite new to AI.
Indeed, when organizations say they are using AI or building AI systems, most of the time they are making API calls to the cloud-hosted versions of the proprietary models. Sometimes organizations run models locally or on their own infrastructure, but the super-fast pace of AI iteration is motivating companies to take expedient measures and just use what’s available now. The need for local control or customization of models gets pushed to later iterations, or is never addressed at all.
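One way to take the expedient path today without painting yourself into a corner is to hide the model call behind your own interface, so a hosted API can later be swapped for a locally run model without rewriting application code. The sketch below is hypothetical: the class and method names are illustrative, not a standard API.

```python
# A hypothetical abstraction layer over "whose model answers the prompt".
# Assumes: pip install openai transformers; names here are illustrative.
from typing import Protocol

class TextModel(Protocol):
    def complete(self, prompt: str) -> str: ...

class HostedModel:
    """The expedient path: call a cloud-hosted proprietary model."""
    def __init__(self, model: str = "gpt-4o-mini"):
        from openai import OpenAI
        self._client = OpenAI()
        self._model = model

    def complete(self, prompt: str) -> str:
        resp = self._client.chat.completions.create(
            model=self._model,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

class LocalModel:
    """The later iteration: an open model on your own infrastructure."""
    def __init__(self, model: str = "meta-llama/Llama-2-7b-chat-hf"):
        from transformers import pipeline
        self._pipe = pipeline("text-generation", model=model)

    def complete(self, prompt: str) -> str:
        return self._pipe(prompt, max_new_tokens=128)[0]["generated_text"]

def answer_question(model: TextModel, question: str) -> str:
    # Application code depends only on the interface, not the vendor.
    return model.complete(question)
```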
The Government Is Paying Closer Attention to Proprietary Foundation Models… and the Investments in Them
Incumbent technology vendors such as Microsoft, Amazon, Google, Meta, Oracle, IBM, and others correctly realize that AI foundation models are the platforms of the future, and that if they want a stake in that future, they need a finger in the pie of each of the major AI model players. This is what has driven eye-popping investments in companies like OpenAI, Anthropic, Mistral, and others, which have raised tens of billions of dollars in partnership investments, far outpacing even what the venture capital community has been able to invest.
As discussed in greater detail in an AI Today podcast on this topic, these vendors are in a race for AI dominance. That really should come as no surprise, since AI is the biggest market opportunity of the past decade. There are really only two ways to dominate a market with new technology: build or buy. However, the industry is moving far too fast to build your way to dominance, so these vendors need to buy or strategically partner to gain their advantage. (Disclosure: I am a managing partner and co-host of the AI Today podcast.)
This heavy partnership and investment activity has caught the attention of federal regulators. The Federal Trade Commission (FTC) recently launched an inquiry into these generative AI investments and partnerships. The agency wants to scrutinize the competitive dynamics of the rapidly emerging AI market to help ensure innovation and fairness, and to better understand market trends and practices that might amount to anticompetitive behavior. But there’s no need to wait for regulatory action, which may never come, to decide how you will secure greater freedom of choice and control in your use and development of AI models.
What Are the Alternatives to OpenAI and Other Proprietary AI Foundation Models?
The challenge with proprietary models is that organizations that become dependent on them for their AI capabilities are limited in their flexibility and freedom in how the models are deployed, configured, and tailored for their specific purposes. Without the underlying data, model parameters, and other details, you cannot rebuild or modify those models; you can only use them as a consumer.
Proprietary foundation models can change in the way they work and in their performance capabilities, and they can refuse to respond to certain prompts and inputs as a form of content moderation, restricting the kinds of data you can put into or get out of them. If it’s not your platform, you have no control over it. Organizations are starting to see the problems with this approach.
Organizations increasingly have greater choice in the foundation models they use in their applications. The fast pace of AI innovation is producing both open source models, which share all aspects of their data, training information, and model parameters, and additional proprietary models or fine-tuned versions of the models already in use.
Part of the reason for the emergence of these new models is that proprietary models are, in many ways, the AI version of vendor lock-in. Elon Musk notably sued OpenAI for what he saw as a breach of its original promise to develop AI technology “in the open,” and very recently released his own Grok model as open source as a counterpoint to the proprietary models on the market.
What Is the Difference Between Open Source and Proprietary AI?
This has led to the growth and emergence of open source AI foundation models. Open source models are free to download, use, and embed, and you can peer into every aspect of them: the model’s code, the weights and parameters, and the final trained model, so you can make changes yourself. Just as much of the technology stack on which AI systems run is open source, there is a push to make the models those systems produce open and available as well. There is also a lot of talk about transparency, especially in the context of building Trustworthy AI solutions.
Many of these open source models perform just as well as, and in some cases even better than, the proprietary commercial alternatives on the market. Open source solutions give you freedom in how you deploy and operationalize the models, along with the ability to do deeper fine-tuning or retraining on your own proprietary data without sharing that data with a third-party vendor. Because open source large language models are transparent, you can customize them to your particular needs more easily than with the closed versions. Open source also has an active community and community support that fosters innovation.
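As a rough sketch of what that deeper customization can look like, the example below fine-tunes an open source model on your own data using LoRA adapters, so the data never leaves your infrastructure. It assumes the Hugging Face transformers, peft, and datasets libraries; the model name, file path, and hyperparameters are illustrative, not recommendations.

```python
# Fine-tuning an open model on private data with LoRA adapters.
# Assumes: pip install transformers peft datasets torch
from datasets import load_dataset
from peft import LoraConfig, TaskType, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "meta-llama/Llama-2-7b-hf"  # illustrative; any open causal LM
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# Wrap the base model with small trainable LoRA adapters; the base
# weights stay frozen, which keeps training comparatively cheap.
model = get_peft_model(model, LoraConfig(
    task_type=TaskType.CAUSAL_LM, r=8, lora_alpha=32, lora_dropout=0.05))

# Your proprietary text, one example per line in a local file
# (the filename here is a placeholder).
data = load_dataset("text", data_files={"train": "my_private_corpus.txt"})
data = data.map(lambda ex: tokenizer(ex["text"], truncation=True,
                                     max_length=512), batched=True)

Trainer(
    model=model,
    args=TrainingArguments(output_dir="tuned-model", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=data["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()

model.save_pretrained("tuned-model")  # adapters stay on your hardware
```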
Some of the emerging and popular open source models include LLaMA / Llama 2, BLOOM, MosaicML MPT-7B, Falcon, GPT-NeoX and GPT-J, Dolly, Vicuna, OPT-175B, and increasingly many more. This is an area of continued innovation and development, and one that will most definitely attract more, and potentially heated, attention.
An open source large language model that you own and run yourself means you control the data you share and maintain the privacy of your prompts and your large language model responses. This has come under scrutiny a lot lately, with organizations and government agencies saying they will not allow their employees to use hosted LLMs due to data privacy and security requirements.
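Here is a minimal sketch of that fully local pattern: an open source LLM running entirely on your own hardware, so prompts and responses never transit a third-party service. It assumes the Hugging Face transformers library; the model name is illustrative and any locally downloadable chat model could be substituted.

```python
# Local inference: no prompt or response leaves your environment.
# Assumes: pip install transformers torch
from transformers import pipeline

generator = pipeline("text-generation",
                     model="mistralai/Mistral-7B-Instruct-v0.2")

prompt = "Summarize our incident-response policy in two sentences."
result = generator(prompt, max_new_tokens=100)

# Once the weights are downloaded, everything runs on your own
# hardware; no third-party API ever receives the prompt.
print(result[0]["generated_text"])
```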
Also, there’s potential for cost savings and reduced vendor dependency with open source models. Without having license costs to use your models, open source large language models can be less costly over the long term, especially as you embed them increasingly with a greater number of systems. However, like all open source technology, you have to carry the cost of implementing and running the models, which can be potentially significant.