OpenAI employees publicly accuse xAI of publishing misleading Grok 3 benchmark results

PANews|Feb 23, 2025 03:11
According to Jinshi, an OpenAI employee recently publicly accused Elon Musk's xAI of publishing misleading benchmark results for its latest AI model, Grok 3. In response, xAI co-founder Igor Babuschkin insisted that the company had done nothing wrong. According to xAI's chart, two versions of Grok 3, Grok 3 Reasoning Beta and Grok 3 mini Reasoning, outperformed OpenAI's strongest currently available model, o3-mini-high, on AIME 2025. However, OpenAI employees quickly pointed out on X that xAI's chart did not include o3-mini-high's AIME 2025 score at "cons@64" (consensus@64, a setting in which the model gets 64 attempts per problem and its most frequent answer is taken as final). Babuschkin countered on X that OpenAI has published similarly misleading benchmark charts in the past, although those charts compared the performance of OpenAI's own models.