*Result*: Can artificial intelligence chatbots think like dentists? A comparative analysis based on dental specialty examination questions in restorative dentistry.
*Further Information*
*Background: The integration of artificial intelligence (AI) in healthcare and medical education has advanced rapidly, with conversational AI systems gaining attention for their potential in academic assessment and clinical reasoning. This study aimed to evaluate AI chatbots' performance on restorative dentistry questions from the Turkish Dental Specialty Examination (DUS), a high-stakes national exam assessing theoretical and clinical knowledge.
Methods: An in silico, cross-sectional, comparative design was employed. A total of 190 multiple-choice questions (MCQs) from 19 DUS sessions administered between 2012 and 2025 were obtained from the Assessment, Selection, and Placement Center (ÖSYM) website; after excluding annulled items, 188 questions were analyzed. Eight AI chatbots (ChatGPT-3.5, ChatGPT-4o Free, ChatGPT-4o Plus, Claude Sonnet 4, Microsoft Copilot, DeepSeek, Gemini 1.5, and Gemini Advanced) were tested in Turkish using a standardized single-attempt protocol. Performance measures included accuracy, response length, and response time, and questions were categorized by year, content domain, and length for subgroup analyses. Statistical analyses were conducted in Python using standard libraries. Descriptive statistics and Pearson's correlation coefficients were calculated; normality and homogeneity of variances were assessed with the Shapiro-Wilk and Levene's tests, and between-chatbot comparisons used the Kruskal-Wallis test followed by Dunn's post hoc test, with significance set at p < 0.05.
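For illustration, the minimal sketch below shows how such an analysis could be run in Python; it is not the authors' code. The input layout (a table with `chatbot` and `response_time` columns), the hypothetical file name `dus_responses.csv`, the Bonferroni adjustment, and the use of the scikit-posthocs package for Dunn's test are assumptions beyond what the abstract states.

```python
# Minimal sketch of the reported statistical workflow (assumed layout: one row
# per question-chatbot pair, with "chatbot" and "response_time" columns).
import pandas as pd
from scipy import stats
import scikit_posthocs as sp  # assumption: Dunn's post hoc via scikit-posthocs

ALPHA = 0.05

df = pd.read_csv("dus_responses.csv")  # hypothetical input file
groups = [g["response_time"].to_numpy() for _, g in df.groupby("chatbot")]

# Normality (Shapiro-Wilk) and homogeneity of variances (Levene's test).
all_normal = all(stats.shapiro(g).pvalue >= ALPHA for g in groups)
equal_variances = stats.levene(*groups).pvalue >= ALPHA
print(f"all groups normal: {all_normal}, equal variances: {equal_variances}")

# Non-parametric comparison across the eight chatbots (Kruskal-Wallis),
# followed by Dunn's post hoc test if the omnibus test is significant.
h_stat, p_value = stats.kruskal(*groups)
print(f"Kruskal-Wallis H = {h_stat:.2f}, p = {p_value:.4f}")
if p_value < ALPHA:
    dunn = sp.posthoc_dunn(
        df, val_col="response_time", group_col="chatbot", p_adjust="bonferroni"
    )
    print(dunn.round(4))
```

An analogous comparison could be repeated with a word-count column to examine response length.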
Results: No significant difference in overall accuracy was found across the eight chatbots (p = 0.18); however, response time and word count differed significantly (p < 0.001). Gemini Advanced showed the highest accuracy (96.28%), followed by ChatGPT-4o Plus (93.62%). Gemini 1.5 produced the longest yet fastest responses, whereas DeepSeek had the lowest accuracy and the slowest responses. Accuracy remained stable across examination years but varied by topic, with lower performance in complex areas such as cavity preparation. In case-based questions, Gemini Advanced, Gemini 1.5, and ChatGPT-4o Plus achieved 100% accuracy. Performance on image-based questions was inconsistent, underscoring limitations in visual reasoning.
Conclusions: The AI chatbots demonstrated high accuracy in answering restorative dentistry exam questions, with Gemini Advanced, ChatGPT-4o Plus, and Gemini 1.5 performing best. Despite differences in response time and response length, these tools show potential as supplementary resources in dental education, warranting further validation across specialties and contexts.
Trial Registration: Not applicable.
(© 2026. The Author(s).)*
*Declarations. Ethics approval and consent to participate: Not applicable; only publicly available, open-source data were used in this study. Consent for publication: Not applicable, as the study involved no human participants. Competing interests: The authors declare no competing interests.*