Meta’s vanilla Maverick AI model ranks below rivals on a popular chat benchmark
Earlier this week, Meta landed in hot water for using an experimental, unreleased version of its Llama 4 Maverick model to achieve a high score on a crowdsourced benchmark, LM Arena. The incident prompted LM Arena’s maintainers to apologize, change their policies, and score the unmodified, vanilla Maverick.
Turns out, it’s not very competitive.
The unmodified Maverick, “Llama-4-Maverick-17B-128E-Instruct,” ranked below models including OpenAI’s GPT-4o, Anthropic’s Claude 3.5 Sonnet, and Google’s Gemini 1.5 Pro as of Friday. Many of these models are months old.
The release version of Llama 4 was added to LMArena after it was found out they cheated, but you probably didn’t see it because you have to scroll down to 32nd place, which is where it ranks pic.twitter.com/a0bxkdx4lx
– ρ:ɡσN (@pigeon__s) April 11, 2025
Why the poor performance? Meta’s experimental Maverick, “Llama-4-Maverick-03-26-Experimental,” was “optimized for conversationality,” the company explained in a chart published last Saturday. Those optimizations evidently played well with LM Arena, which has human raters compare the outputs of models and choose which they prefer.
As we’ve written before, for various reasons, LM Arena has never been the most reliable measure of an AI model’s performance. Still, tailoring a model to a benchmark, besides being misleading, makes it challenging for developers to predict exactly how well the model will perform in different contexts.
In a statement, a Meta spokesperson told TechCrunch that the company experiments with “all types of custom variants.”
“‘Llama-4-Maverick-03-26-Experimental’ is a chat-optimized version we experimented with that also performs well on LMArena,” the spokesperson said. “We have now released our open source version and will see how developers customize Llama 4 for their own use cases. We’re excited to see what they will build and look forward to their ongoing feedback.”