Meta’s vanilla Maverick AI model ranks below rivals on a popular chat benchmark


Earlier this week, Meta was caught using an unreleased, experimental version of its Llama 4 Maverick model to achieve a high score on a crowdsourced benchmark, LM Arena. The incident prompted LM Arena’s maintainers to apologize, change their policies, and score the unmodified, vanilla Maverick.

As it turns out, it’s not very competitive.

The unmodified Maverick, “Llama-4-Maverick-17B-128E-Instruct,” ranked below models including OpenAI’s GPT-4o, Anthropic’s Claude 3.5 Sonnet, and Gemini 1.5 Pro as of Friday. Many of these models are months old.

Why the weak performance? Meta’s experimental Maverick, “Llama-4-Maverick-03-26-Experimental,” was “optimized for conversationality,” the company explained in a chart published last Saturday. Those optimizations evidently played well with LM Arena, which has human raters compare the outputs of models and choose which they prefer.

As we’ve written before, for various reasons, LM Arena has never been the most reliable measure of an AI model’s performance. Still, tailoring a model to a benchmark, besides being misleading, makes it challenging for developers to predict exactly how well the model will perform in different contexts.

In a statement, a Meta spokesperson told TechCrunch that the company experiments with “all types of custom variants.”

“‘Llama-4-Maverick-03-26-Experimental’ is a chat-optimized version we experimented with,” the spokesperson said. “We have now released our open source version and will see how developers customize Llama 4 for their own use cases. We’re excited to see what they will build and look forward to their ongoing feedback.”
