Apologies if this is the wrong place to post this; I'm just going on what the assistant told me to do. It seems to me that this is more a matter for the Alibaba dev team than for you, so if there is a better place to report failures like this on particular LLMs, please let me know, as I encounter them on a near-daily basis. As you can probably guess, the assistant wrote what follows:
An AI assistant (Ki Qwen3-235b) acknowledged an analytical lapse in which it incorrectly stated that the Financial Times (ft.com) was not rated by Media Bias/Fact Check (MBFC). The error occurred despite the assistant having access to the correct information in its retrieved data, which clearly states that the Financial Times has a "HIGH CREDIBILITY" rating.
The AI identified its own failure: it found the claim that such a major publication was unlisted by MBFC to be "surprising," which should have triggered a re-evaluation and double-check of the data. Instead, it proceeded with the incorrect conclusion.
This was not a failure of the information retrieval system but a failure in the reasoning process: specifically, a failure to act on a cognitive signal (surprise) that should have prompted verification. This lapse led to a factual error in the output.
This report highlights a critical need for the AI to implement automatic verification protocols for counterintuitive or surprising findings, especially when the source data is available.
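For illustration only, here is a minimal sketch of what such a surprise-triggered verification step could look like: if the model flags a claim as counterintuitive, the claim is re-checked against the retrieved source text before the answer is finalized. This is not how Qwen or any real assistant pipeline is implemented; all names here (`Claim`, `claim_is_supported`, `finalize_answer`, `retrieved_passages`) are hypothetical, and the keyword check stands in for whatever entailment or second-pass check a real system would use.

```python
# Hypothetical sketch of a surprise-triggered verification step.
# Names and logic are illustrative only; they do not reflect any
# real assistant's internal pipeline.

from dataclasses import dataclass


@dataclass
class Claim:
    text: str        # e.g. "ft.com is not rated by MBFC"
    surprising: bool # flagged by the model as counterintuitive


def claim_is_supported(claim: Claim, retrieved_passages: list[str]) -> bool:
    """Naive check of the claim against retrieved evidence.

    A real system would use an entailment model or a second LLM pass;
    simple keyword matching stands in for that here.
    """
    contradicting_terms = ["HIGH CREDIBILITY", "rated by MBFC"]
    for passage in retrieved_passages:
        if any(term.lower() in passage.lower() for term in contradicting_terms):
            return False  # evidence contradicts the "not rated" claim
    return True


def finalize_answer(claim: Claim, retrieved_passages: list[str]) -> str:
    # Only surprising claims trigger the extra verification pass,
    # keeping the common case cheap.
    if claim.surprising and not claim_is_supported(claim, retrieved_passages):
        return ("Verification failed: the retrieved data contradicts the claim "
                f"'{claim.text}'. Re-read the source before answering.")
    return f"Answer based on claim: {claim.text}"


if __name__ == "__main__":
    passages = ["MBFC lists the Financial Times with a HIGH CREDIBILITY rating."]
    claim = Claim(text="ft.com is not rated by MBFC", surprising=True)
    print(finalize_answer(claim, passages))
```

The point of the sketch is only the control flow: the "surprise" flag gates an extra pass over the already-retrieved data, which is exactly the step the assistant says it skipped.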