Google is using Anthropic’s Claude to improve its Gemini AI
Contractors working to improve Google’s Gemini AI are comparing its answers against outputs produced by Anthropic’s competing model, Claude, according to internal correspondence seen by TechCrunch.
Google would not say, when reached for comment by TechCrunch, whether it had permission to use Claude in testing against Gemini.
As technology companies race to build better AI models, the performance of these models is often evaluated against competitors, typically by running their own models through industry benchmarks rather than by having contractors painstakingly evaluate their competitors’ AI responses.
Gemini contractors tasked with rating the accuracy of the model’s output must score each answer they see against multiple criteria, such as truthfulness and verbosity. Contractors are given up to 30 minutes per prompt to determine whose answer is better, Gemini’s or Claude’s, according to correspondence seen by TechCrunch.
The correspondence showed that contractors recently began noticing references to Anthropic’s Claude appearing in the internal Google platform they use to compare Gemini to other, unnamed AI models. At least one of the outputs presented to Gemini contractors, seen by TechCrunch, explicitly stated: “I am Claude, created by Anthropic.”
One internal conversation showed contractors noting that Claude’s responses seemed to emphasize safety more than Gemini’s. “Claude’s safety settings are the most stringent” among the AI models, one contractor wrote. In certain cases, Claude would not respond to prompts it considered unsafe, such as role-playing a different AI assistant. In another case, Claude avoided answering a prompt, while Gemini’s response was flagged as a “major safety violation” because it included “nudity and bondage.”
Anthropic’s commercial terms of service forbid customers from accessing Claude to “build a competing product or service” or “train competing AI models” without Anthropic’s approval. Google is a major investor in Anthropic.
When asked by TechCrunch, Shira McNamara, a spokesperson for Google DeepMind, which runs Gemini, would not say whether Google had obtained Anthropic’s approval to access Claude. Reached prior to publication, an Anthropic spokesperson had not commented by press time.
DeepMind “compares model outputs” for evaluation purposes but does not train Gemini on Anthropic models, McNamara said.
“Of course, in line with standard industry practice, in some cases we compare model outputs as part of our evaluation process,” McNamara said. “However, any suggestion that we used Anthropic models to train Gemini is inaccurate.”
Last week, TechCrunch exclusively reported that Google contractors working on the company’s AI products are now being made to rate Gemini’s AI responses in areas outside of their expertise. Internal correspondence expressed contractors’ concerns that Gemini could generate inaccurate information on highly sensitive topics such as health care.
You can securely send tips to this reporter on Signal at +1 628-282-2811.
TechCrunch has an AI-focused newsletter! Sign up here to get it in your inbox every Wednesday.