ChatGPT scores nearly 50 per cent on board certification practice test for ophthalmology, study shows

AI tool scored more than 10 percentage points higher one month later

ST. MICHAEL'S HOSPITAL

A study of ChatGPT found the artificial intelligence tool correctly answered fewer than half of the questions on a practice test drawn from a study resource commonly used by physicians preparing for board certification in ophthalmology.

The study, published in JAMA Ophthalmology and led by St. Michael’s Hospital, a site of Unity Health Toronto, found ChatGPT correctly answered 46 per cent of questions when the test was first conducted in Jan. 2023. When researchers repeated the same test one month later, ChatGPT scored more than 10 percentage points higher.

The potential of AI in medicine and exam preparation has generated excitement since ChatGPT became publicly available in Nov. 2022. It has also raised concern about the potential for incorrect information and for cheating in academia. ChatGPT is free, available to anyone with an internet connection, and works in a conversational manner.

“ChatGPT may have an increasing role in medical education and clinical practice over time, however it is important to stress the responsible use of such AI systems,” said Dr. Rajeev H. Muni, principal investigator of the study and a researcher at the Li Ka Shing Knowledge Institute at St. Michael’s. “ChatGPT as used in this investigation did not answer sufficient multiple choice questions correctly for it to provide substantial assistance in preparing for board certification at this time.”

Researchers used a dataset of practice multiple-choice questions from the free trial of OphthoQuestions, a common resource for board certification exam preparation. To ensure ChatGPT’s responses were not influenced by concurrent conversations, researchers cleared previous entries and conversations before entering each question and used a new ChatGPT account. Questions that used images and videos were not included because ChatGPT only accepts text input.

Of 125 text-based multiple-choice questions, ChatGPT answered 58 (46 per cent) correctly when the study was first conducted in Jan. 2023. Researchers repeated the analysis on ChatGPT in Feb. 2023, and its performance improved to 58 per cent.
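The arithmetic behind these figures can be checked with a short sketch. It uses only the numbers reported above (58 of 125 correct in January, 58 per cent in February); the February score is the rounded figure from this release, not a count of correct answers, and the function and variable names are illustrative assumptions rather than anything from the study.

```python
def percent(correct: int, total: int) -> float:
    """Return the share of correct answers as a percentage."""
    return 100.0 * correct / total


# Figures reported in the release.
TOTAL_QUESTIONS = 125
JAN_CORRECT = 58
FEB_SCORE = 58.0  # assumption: rounded percentage as reported, not a raw count

jan_score = percent(JAN_CORRECT, TOTAL_QUESTIONS)

# Absolute change, in percentage points.
point_gain = FEB_SCORE - jan_score

# Relative change, as a percentage of the January score.
relative_gain = 100.0 * point_gain / jan_score

print(f"January score: {jan_score:.1f} per cent")
print(f"Gain: {point_gain:.1f} percentage points")
print(f"Relative gain: {relative_gain:.1f} per cent")
```

On these inputs the January score works out to 46.4 per cent, so the February result is a gain of about 12 percentage points, or roughly a quarter in relative terms.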

“ChatGPT is an artificial intelligence system that has tremendous promise in medical education. Though it provided incorrect answers to board certification questions in ophthalmology about half the time, we anticipate that ChatGPT’s body of knowledge will rapidly evolve,” said Dr. Marko Popovic, a co-author of the study and a resident physician in the Department of Ophthalmology and Vision Sciences at the University of Toronto.

ChatGPT closely matched how trainees answer questions, selecting the same multiple-choice response as the most common answer among ophthalmology trainees 44 per cent of the time. It selected the response that was least popular among trainees 11 per cent of the time, the second least popular 18 per cent of the time, and the second most popular 22 per cent of the time.

“ChatGPT performed most accurately on general medicine questions, answering 79 per cent of them correctly. On the other hand, its accuracy was considerably lower on questions for ophthalmology subspecialties. For instance, the chatbot answered 20 per cent of questions correctly on oculoplastics and zero per cent correctly from the subspecialty of retina. The accuracy of ChatGPT will likely improve most in niche subspecialties in the future,” said Andrew Mihalache, lead author of the study and undergraduate student at Western University.

JOURNAL

JAMA Ophthalmology

DOI

10.1001/jamaophthalmol.2023.1144

SUBJECT OF RESEARCH

Not applicable

ARTICLE TITLE

Performance of an Artificial Intelligence Chatbot in Ophthalmic Knowledge Assessment

ARTICLE PUBLICATION DATE

27-Apr-2023

COI STATEMENT

Dr. Popovic reported grants (to his institution) from PSI Foundation and Fighting Blindness Canada outside the submitted work. Dr. Muni reported serving on the advisory board for Alcon, Bausch and Lomb, Bayer, Novartis, Allergan, and Roche and receiving financial support (to his institution) from Bayer, Novartis, and Roche outside the submitted work. No other disclosures were reported.

Performance of an artificial intelligence chatbot in ophthalmic knowledge assessment

JAMA Ophthalmology

Peer-Reviewed Publication

JAMA NETWORK

About The Study: In this study that included 125 text-based multiple-choice questions provided by the OphthoQuestions free trial for ophthalmic board certification examination preparation, ChatGPT answered approximately half of the questions correctly. Medical professionals and trainees should appreciate the advances of AI in medicine while acknowledging that ChatGPT as used in this investigation did not answer sufficient multiple-choice questions correctly for it to provide substantial assistance in preparing for board certification at this time.

Authors: Rajeev H. Muni, M.D., M.Sc., of St. Michael’s Hospital/Unity Health Toronto in Toronto, is the corresponding author.

(doi:10.1001/jamaophthalmol.2023.1144)

Editor’s Note: Please see the article for additional information, including other authors, author contributions and affiliations, conflict of interest and financial disclosures, and funding and support.

# # #

This link will be live at the embargo time

https://jamanetwork.com/journals/jamaophthalmology/fullarticle/10.1001/jamaophthalmol.2023.1144?guestAccessKey=4b0f74f1-b680-4e68-87a8-f0d463840b9d&utm_source=For_The_Media&utm_medium=referral&utm_campaign=ftm_links&utm_content=tfl&utm_term=042723

Comparison between ChatGPT and Google search as sources of postoperative patient instructions

JAMA Otolaryngology–Head & Neck Surgery

Peer-Reviewed Publication

JAMA NETWORK

About The Study: The findings of this study suggest that ChatGPT provides postoperative instructions that are helpful for patients with a fifth-grade reading level or different health literacy levels. However, ChatGPT-generated instructions scored lower in understandability, actionability, and procedure-specific content than Google Search– and institution-specific instructions.

Authors: Noel Ayoub, M.D., M.B.A., of the Stanford University School of Medicine in Stanford, California, is the corresponding author.

To access the embargoed study: Visit our For The Media website at this link https://media.jamanetwork.com/

(doi:10.1001/jamaoto.2023.0704)

# # #

This link will be live at the embargo time

https://jamanetwork.com/journals/jamaotolaryngology/fullarticle/10.1001/jamaoto.2023.0704?guestAccessKey=3a3ad94a-6a60-44c3-8f1b-68e47e9b5026&utm_source=For_The_Media&utm_medium=referral&utm_campaign=ftm_links&utm_content=tfl&utm_term=042723

LA REVUE GAUCHE - Left Comment

Friday, April 28, 2023
