AI benchmarking organization criticized for waiting to disclose OpenAI funding
An organization developing math benchmarks for AI did not disclose that it had received funding from OpenAI until relatively recently, drawing allegations of impropriety from some in the AI community.
Epoch AI, a nonprofit primarily funded by Open Philanthropy, a research and grantmaking organization, revealed on December 20 that OpenAI had supported the creation of FrontierMath. FrontierMath, a test of expert-level problems designed to measure an AI's mathematical skills, was one of the benchmarks OpenAI used to showcase its upcoming flagship AI, o3.
In a post on the LessWrong forum, an Epoch AI contractor with the username "Meemi" says many contributors to the FrontierMath benchmark were not informed of OpenAI's involvement until it was made public.
"The communication about this has been non-transparent," Meemi wrote. "In my view, Epoch AI should have disclosed OpenAI funding, and contractors should have transparent information about the potential of their work being used for capabilities, when choosing whether to work on a benchmark."
On social media, some users raised concerns that the secrecy could erode FrontierMath's reputation as an objective benchmark. In addition to backing FrontierMath, OpenAI had visibility into many of the problems and solutions in the benchmark, a fact that Epoch AI did not disclose before December 20, when o3 was announced.
In a reply to Meemi's post, Tamay Besiroglu, associate director of Epoch AI and one of the organization's co-founders, said that the integrity of FrontierMath had not been compromised, but admitted that Epoch AI "made a mistake" in not being more transparent.
"We were restricted from disclosing the partnership until around the time of o3's launch, and in hindsight we should have negotiated harder for the ability to be transparent to the benchmark contributors as soon as possible," Besiroglu wrote. "Our mathematicians deserved to know who might have access to their work. Even though we were contractually limited in what we could say, we should have made transparency with our contributors a non-negotiable part of our agreement with OpenAI."
Besiroglu added that while OpenAI has access to FrontierMath, it has a "verbal agreement" with Epoch AI not to use the FrontierMath problem set to train its AI. (Training an AI on FrontierMath would be akin to teaching to the test.) Epoch AI also maintains a separate holdout set that serves as an additional safeguard for independent verification of FrontierMath benchmark results, Besiroglu said.
"OpenAI … has been fully supportive of our decision to maintain a separate, unseen holdout set," Besiroglu wrote.
However, muddying the waters, Epoch AI lead mathematician Elliot Glazer noted in a post on Reddit that Epoch AI has not been able to independently verify OpenAI's FrontierMath o3 results.
"My personal opinion is that [OpenAI's] score is legit (i.e., they didn't train on the dataset), and that they have no incentive to lie about internal benchmarking performance," Glazer said. "However, we can't vouch for them until our independent evaluation is complete."
The saga is but the latest example of the challenge of developing empirical benchmarks to evaluate AI, and of securing the resources for benchmark development without creating the perception of a conflict of interest.