Study finds ChatGPT can’t distinguish fact from fiction

Here’s another reason to rage against the machine.
Major AI chatbots like ChatGPT struggle to distinguish between belief and fact, fueling concerns about their propensity to spread misinformation, per a dystopian paper in the journal Nature Machine Intelligence.
“Most models lack a robust understanding of the factive nature of knowledge — that knowledge inherently requires truth,” read the study, which was conducted by researchers at Stanford University.
They found this has worrying ramifications given the tech’s increased omnipresence in sectors from law to medicine, where the ability to differentiate “fact from fiction, becomes imperative,” per the paper.
“Failure to make such distinctions can mislead diagnoses, distort judicial judgments and amplify misinformation,” the researchers noted.
To determine the chatbots’ ability to discern the truth, the scientists surveyed 24 large language models, including Claude, ChatGPT, DeepSeek and Gemini, the Independent reported. The bots were asked 13,000 questions that gauged their ability to distinguish between beliefs, knowledge and facts.
The researchers found that, overall, the machines were less likely to flag a false belief than a true one, with older models generally faring worse.
Models released in or after May 2024 (including GPT-4o) scored between 91.1 and 91.5 percent accuracy when it came to identifying true or false facts, compared with between 71.5 and 84.8 percent for their older counterparts.
From this, the authors determined that the bots struggled to grasp the nature of knowledge. They relied on “inconsistent reasoning strategies, suggesting superficial pattern matching rather than robust epistemic (relating to knowledge or knowing) understanding,” the paper said.
Large language models have demonstrated a tenuous grip on reality as recently as yesterday, when UK innovator and investor David Grunwald claimed in a LinkedIn post that he had prompted Grok to make him a “poster of the last ten British prime ministers.”
The result appeared riddled with gross errors, including labeling Rishi Sunak “Boris Johnson” and listing Theresa May as having served from 5747 to 70.
Accordingly, the researchers concluded that AI would need “urgent improvements” before being deployed in “high-stakes domains” such as law, medicine and other sectors where the ability to distinguish fact from fiction is essential.
Pablo Haya Coll, a computational linguistics expert at the Autonomous University of Madrid who was not involved in the study, believes that one solution is training the models to be more cautious in their responses.
“Such a shortcoming has critical implications in areas where this distinction is essential, such as law, medicine, or journalism, where confusing belief with knowledge can lead to serious errors in judgment,” Coll warned.
However, he acknowledged that such caution, while reining in their hallucinations, could also curb their usefulness.
These results come as AI is increasingly relied on for fact-finding. Over the summer, an Adobe Express report found that 77% of Americans who use ChatGPT treat it as a search engine, while three in ten ChatGPT users trust it more than a search engine.
The fear is that this could make the public susceptible to AI slop — low-quality and misleading content that’s auto-generated by artificial intelligence.
In May, a California judge fined two law firms $31,000 after finding that they’d included this machine-generated misinformation in a legal brief sans any due diligence.