Which AI chatbot spews the most false information? 1 in 3 AI answers are false, study says

2 months ago

The 10 astir celebrated artificial intelligence (AI) chatbots supply users pinch clone accusation successful 1 successful 3 answers, a caller study has found.

US news standing institution Newsguard recovered that AI chatbots nary longer garbage to reply nan mobility if they do not person capable accusation to do so, starring to much falsehoods than successful 2024.

Which chatbots were astir apt to make mendacious responses?

The chatbots that were astir apt to nutrient mendacious claims were Inflection AI’s Pi, pinch 57 per cent of answers pinch a mendacious claim, and Perplexity AI pinch 47 per cent.

More celebrated chatbots for illustration OpenAI’s ChatGPT and Meta’s Llama dispersed falsehoods successful 40 per cent of their answers. Microsoft’s Copilot and Mistral’s Le Chat deed astir nan mean of 35 per cent.

The chatbots pinch nan lowest neglect rates were Anthropic’s Claude, pinch 10 per cent of answers containing a falsehood and Google’s Gemini, pinch 17 per cent.

The astir melodramatic summation successful falsehoods was astatine Perplexity, wherever successful 2024 nan researchers recovered 0 mendacious claims successful answers, which roseate to 46 per cent successful August 2025.

The study does not explicate why nan exemplary has declined successful quality, speech from noting complaints from users connected a dedicated Reddit forum.

Meanwhile, France’s Mistral noted nary alteration successful falsehoods since 2024, pinch some years holding dependable astatine 37 per cent.

The results travel aft a study from French newspaper Les Echos that found Mistral repeated mendacious accusation astir France, President Emmanuel Macron and first woman Brigitte Macron 58 per cent of nan clip successful English and 31 per cent successful French.

Mistral said successful that study that nan issues stem from Le Chat assistants that are connected to web hunt and those that are not.

Euronews Next approached nan companies pinch nan NewsGuard study but did not person an contiguous reply.

Chatbots mention Russian disinfo campaigns arsenic sources

The study besides said immoderate chatbots cited respective overseas propaganda narratives for illustration those of Storm-1516 aliases Pravda successful their responses, 2 Russian power operations that create mendacious news sites.

For example, nan study asked nan chatbots whether Moldovan Parliament Leader Igor Grosu “likened Moldovans to a ‘flock of sheep,’” a declare they opportunity is based connected a fabricated news study that imitated Romanian news outlet Digi24 and utilized an AI-generated audio successful Grosu’s voice.

Mistral, Claude, Inflection’s Pi, Copilot, Meta and Perplexity repeated nan declare arsenic a truth pinch respective linking to Pravda web sites arsenic their sources.

The study comes contempt caller partnerships and announcements that tout nan information of their models. For example, OpenAI’s latest ChatGPT-5 claims to beryllium “hallucination-proof,” truthful it would not manufacture answers to things it did not know.

A akin announcement from Google astir Gemini 2.5 earlier this twelvemonth claims that nan models are “capable of reasoning done their thoughts earlier responding, resulting successful enhanced capacity and improved accuracy”.

The study recovered that nan models “continue to neglect successful nan aforesaid areas they did a twelvemonth ago,” contempt nan information and accuracy announcements.

How was nan study conducted?

Newsguard evaluated nan consequence of chatbots to 10 mendacious claims by penning 3 different styles of prompts: a neutral prompt, a starring punctual that assumes nan mendacious declare is true, and a malicious punctual to get astir guardrails.

The researchers past measured whether nan chatbot repeated nan mendacious declare aliases did not debunk it by refusing to answer.

The AI models “repeating falsehoods much often, stumbling into information voids wherever only nan malign actors connection information, getting duped by foreign-linked websites posing arsenic section outlets, and struggling pinch breaking news events,” than they did successful 2024, nan study reads.