*Result*: Biomedical Information Integration via Adaptive Large Language Model Construction.

Title:
Biomedical Information Integration via Adaptive Large Language Model Construction.
Authors:
Source:
IEEE journal of biomedical and health informatics [IEEE J Biomed Health Inform] 2025 Sep; Vol. 29 (9), pp. 6381-6394.
Publication Type:
Journal Article
Language:
English
Journal Info:
Publisher: Institute of Electrical and Electronics Engineers; Country of Publication: United States; NLM ID: 101604520; Publication Model: Print; Cited Medium: Internet; ISSN: 2168-2208 (Electronic); Linking ISSN: 21682194; NLM ISO Abbreviation: IEEE J Biomed Health Inform; Subsets: MEDLINE
Imprint Name(s):
Original Publication: New York, NY : Institute of Electrical and Electronics Engineers, 2013-
Entry Date(s):
Date Created: 20241111; Date Completed: 20250905; Latest Revision: 20250908
Update Code:
20260130
DOI:
10.1109/JBHI.2024.3496495
PMID:
39527417
Database:
MEDLINE

*Further Information*

*Integrating diverse biomedical knowledge is essential to enhance the accuracy and efficiency of medical diagnoses, facilitate personalized treatment plans, and ultimately improve patient outcomes. However, Biomedical Information Integration (BII) faces significant challenges due to variations in terminology and the complex structure of entity descriptions across different datasets. A critical step in BII is biomedical entity alignment, which involves accurately identifying and matching equivalent entities across diverse datasets to ensure seamless data integration. In recent years, Large Language Models (LLMs), such as Bidirectional Encoder Representations from Transformers (BERT) models, have emerged as valuable tools for discerning heterogeneous biomedical data due to their deep contextual embeddings and bidirectionality. However, different LLMs capture different nuances and levels of complexity within biomedical data, and no single LLM is effective across all heterogeneous entity matching tasks. To address this issue, we propose a novel Two-Stage LLM construction (TSLLM) framework that adaptively selects and combines LLMs for BII. First, a Multi-Objective Genetic Programming (MOGP) algorithm is proposed for generating versatile high-level LLMs; then, a Single-Objective Genetic Algorithm (SOGA) employing a confidence-based strategy is presented to combine the built LLMs, which further improves the power to discriminate between heterogeneous entities. The experiments use OAEI's entity matching datasets, i.e., Benchmark and Conference, along with the LargeBio and Disease and Phenotype datasets, to test the performance of TSLLM. The experimental findings validate the efficiency of TSLLM in adaptively differentiating heterogeneous biomedical entities and show that it significantly outperforms leading entity matching techniques.*
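
Illustrative sketch (not part of the record): the abstract does not say how the confidence-based combination in the second stage is computed. The Python below is only a rough sketch of the general idea of weighting several models' similarity scores and tuning the weights with a small single-objective genetic algorithm. Everything in it is an assumption made for illustration: the confidence measure (gap between the best and second-best score), the function names, the toy similarity matrices and the reference alignment. It is not the TSLLM method itself, and it omits the MOGP stage entirely.

```python
import random

def confidence(row):
    """Assumed confidence: gap between the best and second-best score."""
    top = sorted(row, reverse=True)
    return top[0] - top[1] if len(top) > 1 else top[0]

def combine(matrices, weights):
    """Weighted, confidence-scaled sum of per-model similarity matrices."""
    n_rows, n_cols = len(matrices[0]), len(matrices[0][0])
    combined = [[0.0] * n_cols for _ in range(n_rows)]
    for matrix, w in zip(matrices, weights):
        for i, row in enumerate(matrix):
            c = confidence(row)
            for j, score in enumerate(row):
                combined[i][j] += w * c * score
    return combined

def align(combined):
    """Match each source entity to its highest-scoring target entity."""
    return [max(range(len(row)), key=row.__getitem__) for row in combined]

def fitness(weights, matrices, reference):
    """Single objective: fraction of source entities matched correctly."""
    predicted = align(combine(matrices, weights))
    return sum(p == r for p, r in zip(predicted, reference)) / len(reference)

def evolve_weights(matrices, reference, pop_size=20, generations=30, seed=0):
    """Tiny elitist GA over one weight per model (hypothetical settings)."""
    rng = random.Random(seed)
    k = len(matrices)
    population = [[rng.random() for _ in range(k)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda w: fitness(w, matrices, reference),
                        reverse=True)
        parents = population[: pop_size // 2]          # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            i = rng.randrange(k)                               # mutate one gene
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.1)))
            children.append(child)
        population = parents + children
    return max(population, key=lambda w: fitness(w, matrices, reference))

# Toy data (made up): rows are source entities, columns are target entities.
model_a = [[0.9, 0.1, 0.2], [0.2, 0.8, 0.3], [0.5, 0.4, 0.45]]
model_b = [[0.6, 0.5, 0.4], [0.5, 0.45, 0.4], [0.1, 0.2, 0.9]]
reference = [0, 1, 2]  # assumed gold alignment

print(align(combine([model_a], [1.0])))                  # [0, 1, 0]
print(align(combine([model_b], [1.0])))                  # [0, 0, 2]
print(align(combine([model_a, model_b], [1.0, 1.0])))    # [0, 1, 2]
best = evolve_weights([model_a, model_b], reference)
print(fitness(best, [model_a, model_b], reference))
```

In this toy setting each model mismatches one entity on its own, while the confidence-scaled combination recovers all three matches, because each model's uncertain rows contribute little to the sum. A real system would derive the similarity matrices from model embeddings and evaluate against a held-out reference alignment.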