In a study of two versions of the ChatGPT AI chatbot (including ChatGPT-3.5), it was found that while they generated average-quality ophthalmic scientific abstracts, they also produced a significant number of fake references. Around 30% of the references were fake or unverifiable, even though they closely mimicked the format of genuine citations. Both chatbots also struggled with nuanced decision-making questions in ophthalmology. The study emphasizes the limitations of current AI chatbot technology and calls for chatbot-generated content to be reviewed and verified.
Despite the fake references, both chatbots were found to generate average-quality abstracts. An accompanying editorial highlights the need for careful consideration of ethics and the recognition that chatbots lack critical thinking and genuine understanding. It suggests that chatbots be added to the list of tools available to authors for the analysis and dissemination of information. The study also notes that the ChatGPT-3.5 AI chatbot has transformed how people interact with technology since its launch in late 2022, and that it continually improves its responses through human feedback. In addition, the study discusses the tendency of chatbots to generate factual errors and to 'hallucinate' (i.e., fabricate misinformation), with rates estimated at around 20-25%.
The researchers evaluated the quality of the chatbot-generated ophthalmic scientific abstracts and the accuracy of two different AI output detectors. The study found that both versions of the chatbot had similar abstract-quality scores and hallucination rates. The performance of the AI output detectors varied, with the GPT-2 Output Detector assigning higher artificial-text scores to abstracts from ChatGPT-3.5 than to those from the newer version, meaning the older chatbot's writing style is easier to detect as machine-generated. The study concludes by emphasizing the need for close attention to scientific study quality and publishing ethics in the use of generative AI.
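For readers curious what an "AI output detector" does in practice, here is a minimal sketch of scoring a single abstract with a publicly available RoBERTa-based GPT-2 output detector through the Hugging Face transformers library. This is an illustration only, not the study's actual pipeline; the checkpoint name, label strings, and sample text below are assumptions, not details taken from the paper.

```python
# Minimal sketch (assumed setup, not the study's method): score one abstract
# with a public GPT-2 output detector using the Hugging Face transformers library.
from transformers import pipeline

# Assumption: the RoBERTa-based "openai-community/roberta-base-openai-detector"
# checkpoint, which labels text as "Real" (likely human-written) or "Fake"
# (likely machine-generated) with a confidence score between 0 and 1.
detector = pipeline(
    "text-classification",
    model="openai-community/roberta-base-openai-detector",
)

# Placeholder text; in practice you would paste a full chatbot- or human-written abstract.
abstract = "Purpose: To evaluate the accuracy of AI-generated ophthalmic abstracts."

result = detector(abstract, truncation=True)[0]
print(f"{result['label']} ({result['score']:.2f})")  # e.g. "Fake (0.97)"
```

A higher "Fake" score means the detector considers the text more likely to be machine-generated, which is the kind of signal the study compared across abstracts from the two chatbot versions.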
The whytry.ai article you just read is a brief synopsis; the original article can be found here: Read the Full Article…