Title:
Pipeline to explore information on genome editing using large language models and genome editing meta-database.
Authors:
Suzuki T (Graduate School of Integrated Sciences for Life, Hiroshima University, 3-10-23 Kagamiyama, Higashi-Hiroshima 739-0046, Japan); Bono H (Graduate School of Integrated Sciences for Life, Hiroshima University, 3-10-23 Kagamiyama, Higashi-Hiroshima 739-0046, Japan; Genome Editing Innovation Center, Hiroshima University, 3-10-23 Kagamiyama, Higashi-Hiroshima 739-0046, Japan)
Source:
Database : the journal of biological databases and curation [Database (Oxford)] 2025 Mar 08; Vol. 2025.
Publication Type:
Journal Article; Research Support, Non-U.S. Gov't
Language:
English
Journal Info:
Publisher: Oxford Journals; Country of Publication: England; NLM ID: 101517697; Publication Model: Print; Cited Medium: Internet; ISSN: 1758-0463 (Electronic); Linking ISSN: 17580463; NLM ISO Abbreviation: Database (Oxford); Subsets: MEDLINE
Imprint Name(s):
Original Publication: Oxford : Oxford Journals, 2009-
Grant Information:
JPMJFS2129 University Fellowship Creation Project for Creating Scientific and Technological Innovation; JPMJPF2010 Program on Open Innovation Platform with Enterprises, Research Institute and Academia
Molecular Sequence:
figshare 10.6084/m9.figshare.c.7497327
Entry Date(s):
Date Created: 20250308; Date Completed: 20250512; Latest Revision: 20260309
Update Code:
20260309
PubMed Central ID:
PMC11890094
DOI:
10.1093/database/baaf022
PMID:
40056431
Database:
MEDLINE

Further Information

Genome editing (GE) is widely recognized as an effective and valuable technology in life sciences research. However, some genes are difficult to edit, depending on factors such as the species, the target sequence, and the GE tool used. Confirming whether GE has already been reported for a gene in previous publications is therefore crucial for designing and establishing GE-based research effectively. Although the Genome Editing Meta-database (GEM: https://bonohu.hiroshima-u.ac.jp/gem/) aims to provide GE information that is as comprehensive as possible, it does not indicate how each registered gene is involved in GE. In this study, we developed a systematic method that uses large language models to extract essential GE information from GEM-based information and GE-related articles. This approach allows a systematic and efficient investigation of GE information that cannot be achieved with the current GEM alone. In addition, by converting the extracted GE information into metrics, we propose a potential application of this method to prioritize genes for future research. The extracted GE information and novel GE-related scores are expected to facilitate the efficient selection of target genes for GE and support the design of research using GE. Database URLs: https://github.com/szktkyk/extract_geinfo, https://github.com/szktkyk/visualize_geinfo.
(© The Author(s) 2025. Published by Oxford University Press.)