Ophthalmologists vs chatbots

October 19, 2023
Staff reporters

A US study found ChatGPT's responses to patient eyecare questions did not differ significantly from those of ophthalmologists, while a Canadian study showed the chatbot's performance on ophthalmology exam questions leapt from 46% to 84% accuracy within six months.

Writing in JAMA Network Open, Stanford University’s Isaac Bernstein and colleagues said they took questions and answers from the American Academy of Ophthalmology-affiliated Eye Care Forum. Of these, 200 question-answer pairs were compared with responses from the AI-driven large language model (LLM) ChatGPT.

Eight ophthalmologists reviewed the answers to see if they could distinguish which were written by real doctors and which by ChatGPT, achieving a mean accuracy of 61%. The research team said the likelihood of chatbot answers containing incorrect or inappropriate material was also comparable with that of human answers, concluding the chatbot could generate surprisingly coherent and correct answers to many ophthalmology questions, including some that were quite detailed and specialised. “Our results suggest that LLMs can provide appropriate ophthalmic advice to common patient questions of varying complexity,” they wrote.

In a separate study, published in JAMA Ophthalmology, Toronto and Western Ontario researchers found ChatGPT answered multiple-choice questions from OphthoQuestions, a question bank commonly used to prepare for board-certification exams, with 84% accuracy. The study expanded on previous research, which showed an earlier version of the chatbot answered just 46% of the multiple-choice questions correctly in January 2023 and 58% in February 2023.