In the course of human endeavors, it has become clear that humans can accelerate learning by taking foundational concepts proposed by some of humanity’s greatest minds and building upon them. Sir Isaac Newton famously articulated this idea when he wrote, “If I have seen further, it is by standing on the shoulders of giants.” Fittingly, since he first wrote this in a letter to Robert Hooke in 1675, Newton is now widely regarded as one of those giants on whose shoulders humanity’s progress has been built.

Like many of his contributions to physics, Newton’s sentiment has been borne out many times over. Mathematical and scientific concepts that were once solely the purview of PhDs, such as algebra, geometry and even the basics of thermodynamics, are now sometimes taught as early as elementary school. One of the keys to this kind of educational acceleration is the use of examples, whether real or hypothetical, to demonstrate, reinforce and apply the concepts being learned.

Likewise, humans are now applying this concept in the field of generative AI. As generative AI transitions from its experimentation phase into its value creation phase, the way foundation models are trained is also evolving. Just as humans learn how to learn as they become more sophisticated in a given subject, some teams, such as IBM Research working in conjunction with their Red Hat counterparts, have started to evolve how generative AI models learn through their recently launched InstructLab. In doing so, they are demonstrating significant acceleration in how foundation models can be customized for specific tasks.

Unlocking A New Way To Train

InstructLab is an open-source project that aims to lower the cost of fine-tuning LLMs by making it possible to integrate changes into an LLM without fully retraining the entire foundation model.

According to a recent IBM blog post, the key to enabling this is not only using human-curated data and examples but also augmenting them with high-quality synthetic examples, generated by an LLM, that mirror real-world data. Just as with humans, these examples provide a solid foundation for learning about a topic, significantly improving the model in a specific domain without having to fully retrain the core model. Synthetic examples can also save companies the time, effort and funds they would otherwise spend generating real data. By utilizing synthetic data, the InstructLab technique brings a new level of scale to customizing models.
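The augmentation step described above can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the actual InstructLab implementation: `teacher_model` is a hypothetical stub standing in for the large “teacher” LLM that generates new examples from human-curated seeds.

```python
def teacher_model(prompt: str) -> str:
    """Hypothetical stub standing in for the 'teacher' LLM.
    A real pipeline would call an actual model here."""
    # Deterministic placeholder reply in the Q/A format we ask for.
    return ("Q: What does the COBOL MOVE verb do?\n"
            "A: It copies data from one field to another.")

def generate_synthetic_examples(seed_examples, n_new=3):
    """Prompt the teacher with human-curated seed examples (few-shot)
    and parse its replies into additional question/answer pairs."""
    few_shot = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in seed_examples)
    synthetic = []
    for _ in range(n_new):
        reply = teacher_model(few_shot + "\n\nWrite another Q/A pair in the same style:")
        lines = reply.strip().split("\n")
        question = lines[0][len("Q: "):]
        answer = lines[1][len("A: "):]
        synthetic.append((question, answer))
    # The real technique also filters out low-quality generations
    # before tuning; only a trivial non-empty check is shown here.
    return [(q, a) for q, a in synthetic if q and a]

seeds = [("What is a COBOL paragraph?", "A named block of procedural code.")]
new_pairs = generate_synthetic_examples(seeds)
```

The key idea is that a handful of human-written seeds fan out into many machine-generated examples, which is where the claimed scaling benefit comes from.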

With its recently released family of Granite models, IBM was able to use InstructLab to demonstrate a 20% higher code generation score along with a reduction in the time it takes to achieve that quality. In a blog post summarizing IBM Research Director Dario Gil’s keynote at this year’s Think conference, “Gil said that when IBM’s Granite code models were being trained on translating COBOL to Java, they had 14 rounds of fine-tuning that took nine months. Using InstructLab, the team added newly fine-tuned COBOL skills in a week, requiring only one round of tuning to achieve better performance.”

This was achieved by using human-written, paired COBOL-Java programs as seed data. The seed data was then augmented by using InstructLab to convert an IBM Z manual and various programming textbooks into additional, synthetically generated COBOL-Java pairs. The new data was then fed into the core Granite model, resulting in the fine-tuning acceleration described above.
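The paired seed data described above might be prepared for tuning along these lines. The instruction/input/output JSONL layout here is a common instruction-tuning convention assumed for illustration; the actual schema IBM used is not described in the post, and the COBOL snippet is a hypothetical example.

```python
import json

# Hypothetical human-written seed pair: a COBOL snippet and its Java translation.
seed_pairs = [
    {"cobol": 'DISPLAY "HELLO, WORLD".',
     "java": 'System.out.println("HELLO, WORLD");'},
]

def to_training_records(pairs):
    """Format COBOL-Java pairs as instruction-tuning records in a
    common JSONL layout (instruction / input / output)."""
    return [
        {"instruction": "Translate the following COBOL to Java.",
         "input": p["cobol"],
         "output": p["java"]}
        for p in pairs
    ]

# Synthetically generated pairs derived from manuals and textbooks
# would be appended to seed_pairs before the tuning run.
jsonl = "\n".join(json.dumps(r) for r in to_training_records(seed_pairs))
```

Keeping the seeds in a simple, uniform record format is what lets the synthetic pairs be mixed in transparently alongside the human-written ones.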

For reference, Granite models are IBM’s family of large language models (LLMs) aimed at improving the productivity of human programmers. These LLMs come in different parameter sizes and apply generative AI to multiple modalities, including language and code. Granite foundation models are being fine-tuned to create assistants that help with translating code from legacy languages to current ones, debugging code and writing novel code from plain English instructions. With IBM’s focus on enterprise-class generative AI, Granite models have been trained on datasets encompassing not only code but also academic, legal and financial content.

Standing On The Shoulders Of Giants

It is clear that the large-scale training of new foundation models has had a profound impact on generative AI and what humanity can do with those models. It is now time to build on that impact as foundation models are brought to bear on real-world use cases and applications that provide value, especially for enterprises. However, the standard training methods used to develop those foundation models demand enormous data center resources, incurring substantial capital and operational costs. For these foundation models to deliver on the promise of generative AI, companies need to rethink their model training processes. To deploy AI models at scale, fine-tuning techniques need to evolve to incorporate more domain-specific data at a lower cost. Given the results demonstrated so far, it appears that IBM and Red Hat’s InstructLab project is doing just that. Time will tell exactly how far enterprises will be able to see while standing on the shoulders of these particular giants.
