*Result*: Generative Artificial Intelligence Methodology Reporting in Otolaryngology: A Scoping Review.
Original Publication: St. Louis, Mo. : [s.n., 1896-
*Further Information*
*Objective: Researchers in otolaryngology-head and neck surgery (OHNS) have sought to explore the potential of large language models (LLMs), but many publications do not include crucial information, such as prompting approach and model parameters. This has substantial implications for reproducibility, since LLMs can generate different output based on differences in "prompt engineering." We aimed to critically review methodological reporting and quality of LLM-focused literature in OHNS.
Data Sources: Databases were searched in October 2024, including PubMed, Embase, Web of Science, ISCA Archive, IEEE Xplore, arXiv, medRxiv, and engRxiv.
Review Methods: Abstract and full-text review, as well as data extraction, were performed by two independent reviewers. All primary studies using LLMs within OHNS were included.
Results: Of 925 abstracts retrieved, 117 studies were included. All studies used ChatGPT, with a minority (16.2%) including additional LLMs. Only 46.2% published direct quotations of all prompts. Although the majority (76.9%) reported the number of prompts, only 6.8% justified this number, and 23.9% reported the number of runs per prompt. Most publications (73.5%) provided some description of prompt development, though only 11.1% explicitly described why specific decisions in prompt design were made, and only 6.0% reported prompt testing. There was no evidence that the quality of methodology reporting improved over time.
Conclusion: LLM-focused literature in OHNS, while exploring many potentially fruitful avenues, demonstrates variable completeness in methodological reporting. This severely limits the generalizability of these studies and suggests that best practices could be further disseminated and enforced by researchers and journals.
(© 2025 The American Laryngological, Rhinological and Otological Society, Inc.)*