*Result*: Join query optimization in distributed database based on multi-source mating selection evolutionary algorithm.

Title:
Join query optimization in distributed database based on multi-source mating selection evolutionary algorithm.
Authors:
Du, Yan1 (AUTHOR) duyan@emails.bjut.edu.cn, Ding, Zhiming2 (AUTHOR) zhiming@iscas.ac.cn, Cai, Zhi1 (AUTHOR) caiz@bjut.edu.cn, Chi, Yuanying1 (AUTHOR) goodcyy@bjut.edu.cn
Source:
Cluster Computing. Oct2025, Vol. 28 Issue 5, p1-23. 23p.
Database:
Academic Search Index

*Further Information*

*In a distributed database system, data is spread across multiple sites in a cluster. For join queries that involve large volumes of data access and complex computation, using each site efficiently for data reading and computation is a key issue in query optimization. With advances in network communication technology, the cost of data transmission is no longer the only factor limiting query efficiency; especially for distributed databases deployed in high-speed local area networks, the CPU cost at local sites and the cost of data I/O also need to be considered. To this end, this paper proposes a multi-source mating selection based differential evolutionary artificial bee colony algorithm to solve the distributed database query optimization problem under high-speed local area network deployment. The population is first initialized with the good node set method so that it is distributed more evenly over the feasible domain, and the genetic algorithm is then combined with the artificial bee colony algorithm to improve performance. In addition, spectral clustering is introduced to mine regularities in the population, and a multi-source mating selection and recombination operator is designed to guide the search using the structural information obtained: recombining similar individuals accelerates convergence, while giving each individual multiple sources of mating selection maintains population diversity. Finally, simulation experiments compare the proposed method with other methods under different query sizes; the results show that it produces lower-cost query execution plans and, to a certain extent, reduces query response time and improves query efficiency. 
[ABSTRACT FROM AUTHOR]*
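
The abstract mentions initializing the population with the good node set method but gives no formula. The sketch below illustrates one common construction from the evolutionary computation literature (a cyclotomic-field good point set), which may differ from the construction used in the paper. The function names `smallest_prime_at_least`, `good_point_set`, and `decode_join_order` are hypothetical, and the random-key decoding of a point into a join order is an assumption added purely for illustration; the abstract does not describe the paper's encoding.

```python
import math


def smallest_prime_at_least(n):
    """Return the smallest prime p with p >= n (trial division)."""
    def is_prime(k):
        if k < 2:
            return False
        for f in range(2, math.isqrt(k) + 1):
            if k % f == 0:
                return False
        return True

    while not is_prime(n):
        n += 1
    return n


def good_point_set(pop_size, dim):
    """Build pop_size points in the dim-dimensional unit cube.

    Assumed construction: choose the smallest prime p >= 2*dim + 3,
    set r_j = 2*cos(2*pi*j/p) for j = 1..dim, and let point i have
    coordinates frac(i * r_j). The result is deterministic.
    """
    p = smallest_prime_at_least(2 * dim + 3)
    r = [2.0 * math.cos(2.0 * math.pi * j / p) for j in range(1, dim + 1)]
    # Python's % with a positive modulus maps negative products
    # back into the unit interval as well.
    return [[(i * rj) % 1.0 for rj in r] for i in range(1, pop_size + 1)]


def decode_join_order(keys):
    """Assumed random-key decoding: relations are joined in ascending key order."""
    return sorted(range(len(keys)), key=lambda idx: keys[idx])


if __name__ == "__main__":
    # Ten candidate plans for a hypothetical query joining five relations.
    population = good_point_set(pop_size=10, dim=5)
    for individual in population[:3]:
        print(decode_join_order(individual))
```

Because the points are generated deterministically, the initial population is reproducible, in contrast to uniform random sampling. Each decoded individual is a permutation of the relation indices and could be scored by whatever plan cost model is in use.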