J Thorac Cardiovasc Surg in press

Evaluating ChatGPT as a Patient Resource for Frequently
Asked Questions about Lung Cancer Surgery - A Pilot Study.

Ferrari-Light D, Merritt RE, D'Souza D, Ferguson MK, Harrison S, Madariaga ML, Lee BE, Moffatt-Bruce SD, Kneuertz PJ

OBJECTIVE: Chat-based artificial intelligence (AI) programs like ChatGPT are re-imagining how patients seek information. This study aims to evaluate the quality and accuracy of ChatGPT-generated answers to common patient questions about lung cancer surgery.

METHODS: A survey of 30 common patient questions about lung cancer surgery was posed to ChatGPT in July 2023. The ChatGPT-generated responses were presented to nine thoracic surgeons at four academic institutions, who rated the quality of each answer on a 5-point Likert scale. They also evaluated whether each response contained any inaccuracies and were prompted to submit free-text comments. Responses were analyzed in aggregate.

RESULTS: For ChatGPT-generated answers, the average quality ranged from 3.1 to 4.2 out of 5.0, indicating they were generally "good" or "very good". No answer received a unanimous 1-star (poor quality) or 5-star (excellent quality) score. Minor inaccuracies were found by at least one surgeon in 100% of the answers, and major inaccuracies in 36.6%. Regarding ChatGPT, 66.7% of surgeons felt it was an accurate source of information for patients. However, only 55.6% felt its answers were comparable to those given by experienced thoracic surgeons, and only 44.4% would recommend it to their patients. Common criticisms of ChatGPT-generated answers included lengthiness, lack of specificity regarding surgical care, and lack of references.

CONCLUSIONS: Chat-based AI programs have the potential to become a useful information tool for patients undergoing lung cancer surgery. However, the quality and accuracy of ChatGPT-generated answers need improvement before thoracic surgeons could consider this method a primary education source for patients.