Zhixiong Pan
Zhixiong Pan|Apr 18, 2025 05:20
To evaluate the reasoning ability and hallucinations of AI models, he designed an interesting question, and o4-mini-high is currently the only model that has passed it. The main difficulty is that the problem itself is unsolvable. Large language models and reasoning models see a large number of solvable problems during training, so when they try to solve this one they may invent hallucinated steps or solutions. The question is very simple: it asks the AI to find a move that ends the game in one move in a chess endgame. No such move exists in the given position. Because most models have been trained on so many problems of this kind that do have a solution, they tend to assume that this position has one as well. The top models all failed, including Claude 3.7, Gemini 2.5 Pro, Grok 3, and GPT-4.5. This question touches on the core of AI's potential and limitations: an AI that cannot question its premises is bound to be limited, and an AI that keeps doubling down on incorrect answers is no exception.