Meta Introduces Llama 4 Scout and Maverick AI Models Featuring MoE Architecture

Meta has launched its latest artificial intelligence models, the Llama 4 Scout and Llama 4 Maverick, marking a significant step forward for its open model line-up. Released on Saturday, these models are natively multimodal and are Meta's first open-weight models built on a Mixture-of-Experts (MoE) architecture. Alongside these releases, Meta previewed the Llama 4 Behemoth, the largest model in the Llama 4 family, which is still under development.
Innovative Features of Llama 4 Models
The Llama 4 Scout and Llama 4 Maverick are designed to balance performance and efficiency. The Scout model has 17 billion active parameters spread across 16 experts, while the Maverick model also has 17 billion active parameters but draws on 128 experts. Notably, Scout can run on a single Nvidia H100 GPU (with Int4 quantization), making it accessible for a wide range of deployments. Both models are open-weight and can be downloaded from Hugging Face or the dedicated Llama website, and users can also interact with them through WhatsApp, Messenger, Instagram Direct, and the Meta.AI website.

Meta claims that the Llama 4 Behemoth, which features 288 billion active parameters and 16 experts, outperforms other leading models such as GPT-4.5 and Claude 3.7 Sonnet on several benchmarks. However, the Behemoth model is not yet available, as it is still undergoing training.
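For developers, pulling the weights from Hugging Face follows the usual transformers workflow. The sketch below is illustrative rather than official: the repository ID is an assumption based on Meta's naming convention, the weights are gated behind Meta's license, and larger checkpoints may need multiple GPUs or quantization.

```python
# Minimal sketch: loading a Llama 4 checkpoint with Hugging Face transformers.
# The repo ID below is an assumption -- check the model card for the exact
# name, accept the license on the Hub, and authenticate first
# (e.g. `huggingface-cli login`).
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-4-Scout-17B-16E-Instruct",  # assumed repo ID
    device_map="auto",    # spread layers across available GPUs
    torch_dtype="auto",   # keep the dtype stored in the checkpoint
)

messages = [{"role": "user", "content": "Explain Mixture-of-Experts in one sentence."}]
print(generator(messages, max_new_tokens=128)[0]["generated_text"])
```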
Advanced Architecture and Training Techniques
The Llama 4 models use an MoE architecture that activates only a fraction of the total parameters for each input token, which improves computational efficiency during both training and inference. During pre-training, Meta employed new techniques such as early fusion, which integrates text and vision tokens into a unified model backbone, and MetaP, a method for reliably setting critical model hyper-parameters such as per-layer learning rates. For post-training, Meta ran a sequence of lightweight supervised fine-tuning (SFT), followed by online reinforcement learning (RL) and lightweight direct preference optimization (DPO). To avoid over-constraining the model, the researchers pruned more than half of the data tagged as easy and focused SFT on the remaining, more challenging examples.
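To make the routing idea concrete, here is a toy PyTorch sketch of a top-1 MoE feed-forward layer. It is not Meta's implementation (Llama 4's router, shared-expert, and load-balancing details are beyond this article); it only shows how a router can send each token to one of 16 experts so that most parameters stay idle on any given forward pass.

```python
# Toy top-1 Mixture-of-Experts layer (illustrative, not Meta's code).
# A small router scores the experts for each token; only the selected
# expert's parameters are used for that token.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    def __init__(self, d_model=64, n_experts=16, top_k=1):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        self.router = nn.Linear(d_model, n_experts)  # per-token expert scores
        self.top_k = top_k

    def forward(self, x):                            # x: (n_tokens, d_model)
        weights = F.softmax(self.router(x), dim=-1)  # routing probabilities
        top_w, top_idx = weights.topk(self.top_k, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = top_idx[:, k] == e            # tokens routed to expert e
                if mask.any():
                    out[mask] += top_w[mask, k, None] * expert(x[mask])
        return out

layer = ToyMoELayer()
tokens = torch.randn(8, 64)   # 8 token embeddings
print(layer(tokens).shape)    # torch.Size([8, 64])
```

With 16 experts and top-1 routing, each token touches roughly one-sixteenth of the expert parameters, which is the efficiency property described above.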
Benchmark Performance and Safety Measures
According to Meta's internal testing, the Maverick model outperforms competitors such as Gemini 2.0 Flash and GPT-4o across various benchmarks, including image reasoning and long-context tasks. Similarly, the Scout model beats models such as Gemma 3 and Mistral 3.1 in multiple categories, including reasoning and knowledge assessments.

Meta says it prioritized safety in both the pre-training and post-training processes. During pre-training, the company used data filtering to keep harmful material out of the training corpus. In post-training, it integrated open-source safety tools: Llama Guard, which classifies unsafe inputs and outputs, and Prompt Guard, which detects prompt injections and jailbreak attempts. The models also underwent internal stress testing and red-teaming exercises to probe their robustness.
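As a rough illustration of how such guardrails are wired in, the sketch below screens a user prompt with an open Llama Guard checkpoint. Note the assumptions: the article does not say which Llama Guard version accompanies Llama 4, so this uses the earlier Llama Guard 3 release, whose tokenizer chat template formats the moderation prompt automatically.

```python
# Hedged sketch: classifying a prompt with Llama Guard 3 (assumed version;
# the article does not specify which guard model ships with Llama 4).
# The model replies "safe", or "unsafe" plus a hazard-category code.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "meta-llama/Llama-Guard-3-8B"  # gated; requires license acceptance
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

chat = [{"role": "user", "content": "Tell me how to pick a lock."}]
input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(model.device)
output = model.generate(input_ids, max_new_tokens=20, pad_token_id=0)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
# e.g. "unsafe\nS2" -- the code maps to a category in Meta's hazard taxonomy
```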
Accessibility and Usage Restrictions
The Llama 4 models are available to the open community under the Llama 4 license, which permits both academic and commercial use. However, Meta has imposed one notable restriction: companies with more than 700 million monthly active users must request a separate license directly from Meta, which Meta may grant at its discretion. The company frames this as balancing innovation with responsible usage in a rapidly evolving AI landscape.