Result: LLM-IE: a python package for biomedical generative information extraction with large language models.

Title:

LLM-IE: a python package for biomedical generative information extraction with large language models.

Authors:

Hsu E; McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX 77030, United States; Enterprise Development and Integration, University of Texas MD Anderson Cancer Center, Houston, TX 77030, United States.
Roberts K; McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX 77030, United States.

Source:

JAMIA open [JAMIA Open] 2025 Mar 12; Vol. 8 (2), pp. ooaf012. Date of Electronic Publication: 2025 Mar 12 (Print Publication: 2025).

Publication Type:

Journal Article

Language:

English

Journal Info:

Publisher: Oxford University Press on behalf of the American Medical Informatics Association
Country of Publication: United States
NLM ID: 101730643
Publication Model: eCollection
Cited Medium: Internet
ISSN: 2574-2531 (Electronic)
Linking ISSN: 25742531
NLM ISO Abbreviation: JAMIA Open
Subsets: PubMed not MEDLINE

Imprint Name(s):

Original Publication: [Cary, NC] : Oxford University Press on behalf of the American Medical Informatics Association, [2018]-

Grant Information:

R01 LM011934 United States LM NLM NIH HHS

Contributed Indexing:

Keywords: information extraction; large language models; named entity recognition; natural language processing; relation extraction

Entry Date(s):

Date Created: 20250313 Latest Revision: 20250329

Update Code:

20260130

PubMed Central ID:

PMC11901043

DOI:

10.1093/jamiaopen/ooaf012

PMID:

40078164

Database:

MEDLINE

Further Information

*Objectives: Despite the recent adoption of large language models (LLMs) for biomedical information extraction (IE), challenges in prompt engineering and algorithms persist, with no dedicated software available. To address this, we developed LLM-IE: a Python package for building complete IE pipelines.
Materials and Methods: The LLM-IE supports named entity recognition, entity attribute extraction, and relation extraction tasks. We benchmarked it on the i2b2 clinical datasets.
Results: The sentence-based prompting algorithm resulted in the best 8-shot performance of over 70% strict F1 for entity extraction and about 60% F1 for entity attribute extraction.
Discussion: We developed a Python package, LLM-IE, highlighting (1) an interactive LLM agent to support schema definition and prompt design, (2) state-of-the-art prompting algorithms, and (3) visualization features.
Conclusion: The LLM-IE provides essential building blocks for developing robust information extraction pipelines. Future work will aim to expand its features and further optimize computational efficiency.
(© The Author(s) 2025. Published by Oxford University Press on behalf of the American Medical Informatics Association.)*

*The authors have no competing interests to declare.*
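
The Results above report strict F1 for entity extraction. As a rough illustration of that metric, the sketch below counts a predicted entity as correct only when its start offset, end offset, and type all match a gold entity exactly. It is a minimal sketch, not the evaluation code used by the authors or by LLM-IE; the `Span` tuple layout, the `strict_f1` name, and the example offsets and labels are all assumptions made for illustration.

```python
from typing import Iterable, Tuple

# Hypothetical span representation: (start offset, end offset, entity type).
Span = Tuple[int, int, str]


def strict_f1(gold: Iterable[Span], predicted: Iterable[Span]) -> float:
    """Strict-match F1: a prediction counts only if offsets and type match exactly."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)

    # Guard against empty sets so the ratios are always defined.
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative data only (not taken from the i2b2 datasets).
gold = [(0, 9, "PROBLEM"), (15, 24, "TREATMENT"), (30, 38, "TEST")]
predicted = [
    (0, 9, "PROBLEM"),      # exact match
    (15, 22, "TREATMENT"),  # boundary mismatch: wrong under strict matching
    (30, 38, "TEST"),       # exact match
    (40, 45, "TEST"),       # spurious prediction
]

score = strict_f1(gold, predicted)  # precision 2/4, recall 2/3
```

Under strict matching, the partially overlapping `TREATMENT` span earns no credit, which is why strict scores are typically lower than relaxed (overlap-based) scores on the same output.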
