
Unraveling the Accuracy Crisis of ChatGPT's Citations
Generative AI's Impact on Publishers
The study by the Tow Center at Columbia Journalism School shows that publishers remain at the mercy of generative AI tools like ChatGPT. Even when they allow OpenAI to crawl their content, the tool tends to invent or misrepresent information. For example, when asked to identify the source of sample quotations drawn from a mix of publishers, ChatGPT often failed to provide accurate citations. Some quotes came from publishers that had blocked OpenAI's search crawlers, yet the bot still attempted to generate citations, often falling back on confabulation. This highlights the lack of transparency in ChatGPT's responses and the difficulty users face in assessing the validity of its claims.

Moreover, the study found that no publisher was spared inaccurate representations of its content in ChatGPT. There were instances of entirely correct citations, but also many where the bot was completely wrong. This unreliable mixed bag of citations poses a significant risk to publishers, both reputationally and commercially.
Plagiarism and Data Quality
ChatGPT may effectively be rewarding plagiarism, as evidenced by an instance in which it erroneously cited a website that had plagiarized a New York Times story. This raises serious questions about OpenAI's ability to filter and validate data sources, especially when dealing with unlicensed or plagiarized content. Even for publishers that have inked deals with OpenAI, the study found that ChatGPT's citations were not always reliable. The fundamental issue seems to be that OpenAI's technology treats journalism as decontextualized content, with little regard for the circumstances of its original production.

In addition, the variation in ChatGPT's responses was another concern. When asked the same query multiple times, it typically returned a different answer each time. Such inconsistency is a poor fit for citation, where accuracy is crucial.
The Small-Scale Study's Significance
Although the Tow study is small-scale and more rigorous testing is needed, it is still notable given the high-level deals that major publishers are striking with OpenAI. If media businesses were hoping for special treatment in the form of accurate sourcing, this study suggests that OpenAI has yet to deliver such consistency. Publishers that have not blocked OpenAI's crawlers also face risks, as citations may still be inaccurate: there is no guaranteed "visibility" for publishers even when they let crawlers in, and completely blocking crawlers does not necessarily spare them reputational damage.

In conclusion, publishers currently have little meaningful agency over what happens to their content when it interacts with ChatGPT. This study serves as a wake-up call for the industry to address these issues and find ways to ensure the accuracy and integrity of citations in the age of generative AI.
