EXCLUSIVE: Google’s Gemini is forcing contractors to evaluate AI responses outside of their expertise


Generative AI may seem like magic, but behind the development of these systems are armies of employees at companies like Google, OpenAI, and others, known as “prompt engineers” and analysts, who rate the accuracy of chatbots’ outputs to improve their AI.

But a new internal guideline passed down from Google to contractors working on Gemini, and seen by TechCrunch, has led to concerns that Gemini may be more likely to give ordinary people inaccurate information on highly sensitive topics, such as health care.

To improve Gemini, contractors working with GlobalLogic, an outsourcing company owned by Hitachi, are routinely asked to rate AI-generated responses on factors such as “honesty.”

Until recently, these contractors were able to “skip” certain prompts, and thus opt out of evaluating the AI-written responses to those prompts, if the prompt was outside their area of expertise. For example, a contractor could skip a prompt asking a specialized question about heart disease because the contractor had no scientific background.

But last week, GlobalLogic announced a change from Google: contractors are no longer allowed to skip such prompts, regardless of their own expertise.

Internal correspondence seen by TechCrunch shows that the guidelines previously stated: “If you do not have the significant experience (e.g. programming, mathematics) to evaluate this prompt, please skip this assignment.”

But now the guidelines state: “You should not skip prompts that require specialist knowledge of the field.” Instead, contractors are asked to “rate the parts of the prompt you understand” and include a note that they lack domain knowledge.

This has led to immediate concerns about Gemini’s accuracy on certain topics, as contractors are sometimes tasked with evaluating highly technical AI responses on issues such as rare diseases, in which they have no background.

“I thought the point of the skip was to increase accuracy by giving it to someone better?” one contractor noted in internal correspondence seen by TechCrunch.

Contractors can now skip prompts in only two cases, the new guidance shows: if they are “completely missing information,” such as the full prompt or response, or if they contain harmful content that requires special consent forms to evaluate.

Google did not respond to TechCrunch’s requests for comment as of press time.
