*Result*: Natural language processing and LLMs in liver imaging: a practical review of clinical applications.
*Further Information*
*Liver diseases pose a significant global health challenge because of their silent progression and high mortality. Accurate interpretation of radiology reports is essential for evaluating and managing these conditions, but it is limited by variability in reporting styles and the complexity of unstructured medical language. In this context, Natural Language Processing (NLP) techniques and Large Language Models (LLMs) have emerged as promising tools for extracting relevant clinical information from unstructured liver radiology reports. This work reviews, from a practical point of view, the current state of NLP and LLM applications for liver disease classification, clinical feature extraction, diagnostic support, and staging from reports. It also discusses existing limitations, such as the need for high-quality annotated data, the lack of explainability, and challenges in clinical integration. With responsible and validated implementation, these technologies have the potential to transform the clinical management of liver disease by enabling faster and more accurate diagnoses and optimizing radiology workflows, ultimately improving patient care.
(© 2025. The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.)*
*Declarations. Competing interests: The authors declare no competing interests.*