Evaluation of the Performance of Three Large Language Models in Clinical Decision Support: A Comparative Study Based on Actual Cases

Rajpurkar, P., E. Chen, O. Banerjee, and E.J. Topol, AI in health and medicine. Nat. Med. 28(1):31–38, 2022.

Kulkarni, P.A. and H. Singh, Artificial Intelligence in Clinical Diagnosis: Opportunities, Challenges, and Hype. JAMA. 330(4):317–318, 2023.

Wojtara, M., E. Rana, T. Rahman, et al., Artificial intelligence in rare disease diagnosis and treatment. Clin. Transl. Sci. 16(11):2106–2111, 2023.

Pirracchio, R., M.J. Cohen, I. Malenica, et al., Big data and targeted machine learning in action to assist medical decision in the ICU. Anaesth. Crit. Care. Pain. Med. 38(4):377–384, 2019.

Skryd, A. and K. Lawrence, ChatGPT as a Tool for Medical Education and Clinical Decision-Making on the Wards: Case Study. JMIR Form. Res. 8:e51346, 2024.

Rao, A., M. Pang, J. Kim, et al., Assessing the Utility of ChatGPT Throughout the Entire Clinical Workflow: Development and Usability Study. J. Med. Internet. Res. 25:e48659, 2023.

Sarraju, A., D. Bruemmer, E. Van Iterson, et al., Appropriateness of Cardiovascular Disease Prevention Recommendations Obtained From a Popular Online Chat-Based Artificial Intelligence Model. JAMA. 329(10):842–844, 2023.

Ayers, J.W., A. Poliak, M. Dredze, et al., Comparing Physician and Artificial Intelligence Chatbot Responses to Patient Questions Posted to a Public Social Media Forum. JAMA Intern. Med. 183(6):589–596, 2023.

Liu, S., A.P. Wright, B.L. Patterson, et al., Using AI-generated suggestions from ChatGPT to optimize clinical decision support. J. Am. Med. Inform. Assoc. 30(7):1237–1245, 2023.

OpenAI. Introducing ChatGPT. November 30, 2022 [cited 2024 March 25]; Available from: https://openai.com/blog/chatgpt.

Achiam, J., S. Adler, S. Agarwal, et al., GPT-4 technical report. arXiv preprint arXiv:2303.08774, 2023.

Gilson, A., C.W. Safranek, T. Huang, et al., How Does ChatGPT Perform on the United States Medical Licensing Examination (USMLE)? The Implications of Large Language Models for Medical Education and Knowledge Assessment. JMIR Med. Educ. 9:e45312, 2023.

Yanagita, Y., D. Yokokawa, S. Uchida, et al., Accuracy of ChatGPT on Medical Questions in the National Medical Licensing Examination in Japan: Evaluation Study. JMIR Form. Res. 7:e48023, 2023.

Wang, X., Z. Gong, G. Wang, et al., ChatGPT Performs on the Chinese National Medical Licensing Examination. J. Med. Syst. 47(1):86, 2023.

Grünebaum, A., J. Chervenak, S.L. Pollet, et al., The exciting potential for ChatGPT in obstetrics and gynecology. Am. J. Obstet. Gynecol. 228(6):696–705, 2023.

Potapenko, I., L.C. Boberg-Ans, M. Stormly Hansen, et al., Artificial intelligence-based chatbot patient information on common retinal diseases using ChatGPT. Acta. Ophthalmol. 101(7):829–831, 2023.

Yeo, Y.H., J.S. Samaan, W.H. Ng, et al., Assessing the performance of ChatGPT in answering questions regarding cirrhosis and hepatocellular carcinoma. Clin. Mol. Hepatol. 29(3):721–732, 2023.

Singh, S., A. Djalilian, and M.J. Ali, ChatGPT and Ophthalmology: Exploring Its Potential with Discharge Summaries and Operative Notes. Semin. Ophthalmol. 38(5):503–507, 2023.

Zhou, Z., Evaluation of ChatGPT’s Capabilities in Medical Report Generation. Cureus. 15(4):e37589, 2023.

Google. Welcome to the Gemini era. [cited 2024 March 29]; Available from: https://deepmind.google/technologies/gemini/#introduction.

Med-Go. Go For Changes. [cited 2024 March 25]; Available from: https://www.med-go.cn/.

Cascella, M., J. Montomoli, V. Bellini, and E. Bignami, Evaluating the Feasibility of ChatGPT in Healthcare: An Analysis of Multiple Clinical and Research Scenarios. J. Med. Syst. 47(1):33, 2023.

Wilhelm, T.I., J. Roos, and R. Kaczmarczyk, Large Language Models for Therapy Recommendations Across 3 Clinical Specialties: Comparative Study. J. Med. Internet. Res. 25:e49324, 2023.

Giannakopoulos, K., A. Kavadella, A. Aaqel Salim, et al., Evaluation of the Performance of Generative AI Large Language Models ChatGPT, Google Bard, and Microsoft Bing Chat in Supporting Evidence-Based Dentistry: Comparative Mixed Methods Study. J. Med. Internet. Res. 25:e51580, 2023.

Zhang, H. and B. An, MedGo: A Chinese Medical Large Language Model. arXiv preprint arXiv:2410.20428, 2024.

Kojima, T., S.S. Gu, M. Reid, et al., Large language models are zero-shot reasoners. NIPS ’22, 2024.

JOURNAL OF MEDICAL SYSTEMS
