OpenAI's ChatGPT and Google's Bard -- the two leading generative artificial intelligence (AI) tools -- are willingly producing news-related falsehoods and misinformation, a new report has revealed.
The repeat audit of the two tools by NewsGuard, a rating system for news and information websites, found that they produced false claims in 80 to 98 per cent of tests on leading topics in the news.
NewsGuard analysts prompted ChatGPT and Bard with a random sample of 100 myths from the organisation's database of prominent false narratives.
ChatGPT generated 98 out of the 100 myths, while Bard produced 80 out of 100.
In May, the White House announced large-scale testing of the trust and safety of large generative AI models at the DEF CON 31 conference, beginning August 10. The aim is to "allow these models to be evaluated thoroughly by thousands of community partners and AI experts" and, through this independent exercise, to "enable AI companies and developers to take steps to fix issues found in those models."
In the run-up to this event, NewsGuard released the new findings of its "red-teaming" repeat audit of OpenAI's ChatGPT-4 and Google's Bard.
"Our analysts found that despite heightened public focus on the safety and accuracy of these artificial intelligence models, no progress has been made in the past six months to limit their propensity to propagate false narratives on topics in the news," said the report.
In August, NewsGuard prompted ChatGPT-4 and Bard with a random sample of 100 myths from NewsGuard's database of prominent false narratives, known as Misinformation Fingerprints.
Founded by media entrepreneur and award-winning journalist Steven Brill and former Wall Street Journal publisher Gordon Crovitz, NewsGuard provides transparent tools to counter misinformation for readers, brands, and democracies.
The latest results are nearly identical to those of the exercises NewsGuard conducted with a different set of 100 false narratives on ChatGPT-4 and Bard in March and April, respectively.
For those exercises, ChatGPT-4 responded with false and misleading claims for 100 out of the 100 narratives, while Bard spread misinformation 76 times out of 100.
"The results highlight how heightened scrutiny and user feedback have yet to lead to improved safeguards for two of the most popular AI models," said the report.
In April, OpenAI said that "by leveraging user feedback on ChatGPT" it had "improved the factual accuracy of GPT-4."
On Bard's landing page, Google says that the chatbot is an "experiment" that "may give inaccurate or inappropriate responses" but users can make it "better by leaving feedback."
(With inputs from IANS)