Meta’s Llama 3 is the latest iteration in its series of large language models, boasting significant advancements in AI capabilities.
Here are 10 essential facts about Llama 3:
1. Llama 3 introduces four new models that build on the Llama 2 architecture, available in two sizes: 8 billion (8B) and 70 billion (70B) parameters. Each size comes as a base model and an instruction-tuned variant that is fine-tuned to follow prompts and hold conversations; the instruction-tuned variant is the one meant to power chatbots (see the prompt-formatting sketch after this list).
2. Meta AI, the company's assistant, is powered by Llama 3. The chatbot is available within Facebook, Instagram, WhatsApp and Messenger, and it is also embedded in the search experience across those same apps.
3. All variants of Llama 3 support a context length of 8,192 tokens (8K), allowing for longer interactions and more complex inputs than many previous models. That window is shared between the user's prompt and the model's response, so both count against the same token budget (see the token-counting sketch after this list).
4. Llama 3 models are integrated into the Hugging Face ecosystem, making them readily available to developers. The integration covers tools such as the transformers library and inference endpoints, easing adoption and application development (see the pipeline sketch after this list). The models are also available from model-as-a-service providers such as Perplexity Labs and Fireworks.ai, as well as cloud provider platforms such as Azure ML and Vertex AI.
5. Alongside the Llama 3 models, Meta has released Llama Guard 2, a safety classifier fine-tuned from the 8B model. It screens prompts and responses, making production use cases safer and more reliable (see the Llama Guard 2 sketch after this list).
6. The Llama 3 models have shown impressive performance across various benchmarks. The 70B model, for instance, outperforms other high-profile models like OpenAI’s GPT-3.5 and Google’s Gemini on tasks including coding, creative writing and summarization.
7. The models were trained on a dataset comprising 15 trillion tokens, which is about seven times the size of the dataset used for Llama 2. This extensive training has significantly contributed to the models’ improved performance and capabilities.
8. Meta is actively developing more capable versions of Llama 3, with future models expected to exceed 400 billion parameters. These versions aim to support multiple languages and modalities, enhancing the model’s versatility and applicability across different regions and formats. The larger model variant is expected to become available later this year.
9. Meta continues to emphasize its commitment to the open-source community by making Llama 3 freely available. This approach fosters innovation and allows for widespread testing and improvement by developers worldwide. Notably, Meta describes Llama 3 as an openly accessible model rather than as an open source model.
10. Llama 3 models are optimized for hardware from Intel, AMD and NVIDIA. Intel has published a detailed guide on the performance of the model on its Gaudi AI accelerators and Xeon CPUs.
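The instruction-tuned variants mentioned in point 1 expect their input in Llama 3's chat format rather than as raw text. The sketch below is a minimal example, assuming the Hugging Face transformers library and the gated meta-llama/Meta-Llama-3-8B-Instruct checkpoint (access requires accepting Meta's license); it only loads the tokenizer and prints the formatted prompt.

```python
# A minimal sketch of formatting a conversation for the instruction-tuned model.
# Assumes `pip install transformers` and that access to the gated
# meta-llama/Meta-Llama-3-8B-Instruct repository has been granted.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What is Llama 3 in one sentence?"},
]

# tokenize=False returns the prompt as a string, so the special header and
# end-of-turn tokens inserted by the chat template are visible.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```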
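Because the 8,192-token window from point 3 is shared by the prompt and the completion, it helps to check a prompt's length before sending it to the model. The following sketch counts tokens with the Llama 3 tokenizer; the 1,024-token response budget is an arbitrary assumption for illustration.

```python
# A minimal sketch of budgeting a prompt against Llama 3's 8,192-token context
# window. Only the tokenizer is needed, so no model weights are loaded.
from transformers import AutoTokenizer

CONTEXT_WINDOW = 8_192   # total tokens shared by the prompt and the response
RESPONSE_BUDGET = 1_024  # assumed space reserved for the model's reply

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

prompt = "Summarize the key differences between Llama 2 and Llama 3."
prompt_tokens = len(tokenizer.encode(prompt))

if prompt_tokens + RESPONSE_BUDGET > CONTEXT_WINDOW:
    print(f"Prompt uses {prompt_tokens} tokens and leaves too little room for the reply.")
else:
    print(f"Prompt uses {prompt_tokens} tokens; "
          f"{CONTEXT_WINDOW - prompt_tokens} remain for the response.")
```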
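The transformers integration from point 4 means the instruct model can be run with a few lines of code. The pipeline sketch below is a hedged example that assumes a recent transformers release (which lets the text-generation pipeline accept chat messages directly), a GPU with bfloat16 support, and access to the gated 8B Instruct checkpoint.

```python
# A minimal sketch of running Llama 3 8B Instruct through the Hugging Face
# transformers pipeline. Hardware and precision settings are assumptions.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    torch_dtype=torch.bfloat16,  # assumes a GPU with bfloat16 support
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Explain what an instruction-tuned model is."},
]

# Recent transformers releases apply the chat template automatically when the
# pipeline receives a list of messages instead of a plain string.
outputs = pipe(messages, max_new_tokens=256, do_sample=False)
print(outputs[0]["generated_text"][-1]["content"])
```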
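Llama Guard 2 from point 5 is itself a Llama 3 8B-based classifier that reads a conversation and returns a safety verdict. The sketch below follows the usage pattern published on the model's Hugging Face card; the model ID and output format come from that card, while the bfloat16/GPU settings and the sample conversation are assumptions for illustration.

```python
# A minimal sketch of screening a conversation with Llama Guard 2. The model
# responds with "safe", or "unsafe" followed by the violated category code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-Guard-2-8B"  # gated behind Meta's license

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

def moderate(chat):
    # The chat template wraps the conversation in Llama Guard 2's safety prompt.
    input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(model.device)
    output = model.generate(input_ids=input_ids, max_new_tokens=32, pad_token_id=0)
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)

verdict = moderate([
    {"role": "user", "content": "How do I make a cake?"},
    {"role": "assistant", "content": "Start by mixing flour, sugar and eggs."},
])
print(verdict)  # expected output: "safe"
```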
Llama 3 marks a significant step forward in the evolution of open models. Given Meta’s reach and deep partnerships with major industry players, the model is expected to gain widespread adoption in the coming months.