Benefits, Limits, and Risks of GPT-4 as an AI Chatbot for Medicine (NEJM - Correspondence)

TO THE EDITOR (Part 1)

With regard to the special report by Lee et al. (March 30 issue), we have concerns about the reproducibility of the results and the need for proper reporting of the parameters used in the GPT-4 (Generative Pretrained Transformer 4) artificial intelligence (AI) model. We attempted to reproduce the results using ChatGPT (March 23 version) with GPT-4, and we could not reproduce the conversations. Do the authors have experience with the reproducibility of GPT-4 outputs when it is given the same query at different times? Is the corpus of learned knowledge changing over time? What specific parameters should users specify if they want reproducible outputs from ChatGPT?

Reproducibility is a cornerstone of good medical practice; will GPT-4 be able to deliver reproducible results? This will be necessary if we are to have reliable AI-driven solutions in medicine.

Yuki Kataoka, M.D., Dr.P.H.
Scientific Research Works Peer Support Group (SRWS-PSG), Osaka, Japan

Ryuhei So, M.D., M.P.H.
Okayama Psychiatric Medical Center, Okayama, Japan



Reference:

1. Lee P, Bubeck S, Petro J. Benefits, limits, and risks of GPT-4 as an AI chatbot for medicine. N Engl J Med 2023;388:1233-1239.


TO THE EDITOR (Part 2)

In their special report, Lee et al. discuss the potential use of GPT-4 in the medical context, including medical note-taking, its innate medical knowledge, and medical consultation. We think that an important use of GPT-4 and related programs will be in medical education. Adapting medical education to this new tool will require changes from both educators and students — for example, the development of personalized learning methods and collaborative environments, as well as discussions about the ethical implications of AI and its potential biases.1,2 AI will be a powerful ally in the management of medical knowledge, allowing users to apply accumulated knowledge to the processes of diagnostic and therapeutic reasoning in order to support accurate decision making.3 For this to happen, students need training and familiarity with these systems, starting at the beginning of medical training. Do the authors agree that educators must shift from being transmitters of information to being guides in the development of competencies, skills, and critical thinking and that students must be increasingly active in learning to use these new sources of information responsibly?

Alexandre C. Fernandes, M.D.
Edmond and Lily Safra International Institute of Neuroscience, Macaíba, Brazil

Maria E.V.C. Souto
State University of Rio Grande do Norte, Mossoró, Brazil



References:

1. Char DS, Shah NH, Magnus D. Implementing machine learning in health care — addressing ethical challenges. N Engl J Med 2018;378:981-983.

2. Haupt CE, Marks M. AI-generated medical advice-GPT and beyond. JAMA 2023;329:1349-1350.

3. Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med 2019;25:44-56.

Response

The authors reply: In response to Kataoka and So: parts of the GPT-4 AI model are constantly evolving, and therefore its outputs change over time. Although it may be technically feasible to “freeze” the system so that it does not change, GPT-4 also uses some elements of randomness when analyzing its inputs and generating its outputs. This means that even if the system were frozen, it would still tend to give different outputs even when prompted with precisely the same inputs. (In this sense, GPT-4 behaves more like a human being than like a software application.) The changes are mainly in the choice of words, the verbosity of responses, and the format of answers (e.g., in narrative form or as a bulleted list). However, GPT-4 can also produce conceptually different or even contradictory answers, especially in situations in which the input question has no specific or known correct answer.
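[Editor's note] For readers who want to experiment with the reproducibility questions raised above, the sketch below shows one way output variability is commonly reduced when GPT-4 is accessed through the OpenAI API rather than the ChatGPT interface (which exposes no such controls): pinning a dated model snapshot, setting the sampling temperature to zero, and optionally supplying a fixed seed. The model name, seed value, and prompt are illustrative assumptions and are not settings discussed by the correspondents; even with these controls, determinism is best-effort only, consistent with the authors' point above.

```python
# Minimal sketch (not from the correspondence) of reducing, though not eliminating,
# run-to-run variability in GPT-4 outputs via the OpenAI Python SDK.
# Model name, seed, and prompt are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4-0613",   # pin a dated snapshot so the underlying model does not silently change
    temperature=0,        # remove sampling randomness as far as the API allows
    seed=42,              # request best-effort deterministic sampling (support varies by snapshot)
    messages=[
        {"role": "system", "content": "You are a concise medical assistant."},
        {"role": "user", "content": "List first-line treatments for community-acquired pneumonia."},
    ],
)

print(response.choices[0].message.content)
# system_fingerprint (None on some snapshots) changes when the backend configuration changes,
# which helps detect the "constantly evolving" behavior described in the reply.
print(response.system_fingerprint)
```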

Fernandes and Souto raise the issue of the importance of chatbots in medical education. As we noted in our article, the potential for benefit exists; we hope it is realized in the future.

Peter Lee, Ph.D.
Sebastien Bubeck, Ph.D.
Microsoft Research, Redmond, WA

Joseph Petro, M.S., M.Eng.
Nuance Communications, Burlington, MA

Reposted from: https://www.nejm.org/doi/full/10.1056/NEJMc2305286

