Human vs. artificial intelligence: Physicians outperform ChatGPT in real-world pharmacotherapy counselling.
*Aims*: To assess the utility of the artificial intelligence (AI) chatbot ChatGPT (openly available version 3.5) in responding to real-world pharmacotherapeutic queries from healthcare professionals.
*Methods*: Three independent, blinded evaluators with different levels of medical expertise and professional experience (beginner, advanced and expert) compared AI chatbot-generated and physician-generated responses to 70 real-world pharmacotherapeutic queries submitted to the clinical-pharmacological drug information centre of Hannover Medical School between June and October 2023. Responses were rated for quality of information, answer preference, answer correctness and quality of language. Inter-rater reliability was assessed with Krippendorff's alpha. Two separate investigators, not otherwise involved in the conduct or analysis of the study, selected the top three clinically relevant errors in chatbot- and physician-generated responses.
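For readers unfamiliar with the reliability statistic used here, a minimal sketch of Krippendorff's alpha for nominal ratings follows. This is an illustrative implementation of the general coincidence-matrix formulation, not code from the study; the function name and data layout are assumptions.

```python
from collections import Counter

def krippendorff_alpha_nominal(data):
    """Krippendorff's alpha for nominal ratings.

    data: list of units; each unit is the list of ratings given by the
    raters who coded it (missing ratings are simply omitted).
    """
    # Build the coincidence matrix from all units coded by >= 2 raters.
    o = Counter()                       # o[(c, k)]: coincidences of values c and k
    for unit in data:
        m = len(unit)
        if m < 2:
            continue                    # a single rating carries no pairing information
        for i, c in enumerate(unit):
            for j, k in enumerate(unit):
                if i != j:
                    o[(c, k)] += 1.0 / (m - 1)
    n_c = Counter()                     # marginal totals per value
    for (c, _k), v in o.items():
        n_c[c] += v
    n = sum(n_c.values())               # total number of pairable ratings
    agree = sum(v for (c, k), v in o.items() if c == k)
    denom = n * n - sum(v * v for v in n_c.values())
    if denom == 0:
        return 1.0                      # no variation at all: treat as perfect agreement
    # Nominal alpha = 1 - (n - 1) * observed disagreement / expected disagreement
    return 1.0 - (n - 1) * (n - agree) / denom
```

For example, `[[1, 1, 1], [2, 2, 2]]` (two units, three raters, full agreement) yields 1.0, while mixing in a disagreeing unit pulls the value below 1.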
*Results*: All three evaluators rated the quality of information of physician-generated responses higher than that of AI chatbot-generated responses and, accordingly, preferred the physician-generated responses (answer preference). All evaluators detected factually wrong information more frequently in chatbot-generated than in physician-generated responses. The beginner and expert evaluators rated the quality of language of physician-generated responses higher than that of chatbot-generated responses, whereas the advanced evaluator found no significant difference.
*Conclusions*: ChatGPT's responses to real-world pharmacotherapeutic queries were substantially inferior to conventional physician-generated responses with regard to quality of information and factual correctness. Our study suggests that, to date, the use of ChatGPT in pharmacotherapy counselling must be strongly cautioned against.
(© 2025 The Author(s). British Journal of Clinical Pharmacology published by John Wiley & Sons Ltd on behalf of British Pharmacological Society.)