*Result*: RL-I2IT: Image-to-image translation with deep reinforcement learning.
*Further Information*
*Most existing Image-to-Image Translation (I2IT) methods generate images in a single pass of a deep learning (DL) model. However, a single-step model often requires many parameters and is prone to overfitting. Inspired by the analogy between diffusion models and reinforcement learning, we reformulate I2IT as an iterative decision-making problem via deep reinforcement learning (DRL) and propose a computationally efficient RL-based I2IT (RL-I2IT) framework. The key feature of the RL-I2IT framework is that it decomposes a monolithic learning process into small steps, using a lightweight model to progressively transform the source image into the target image. Because high-dimensional continuous state and action spaces are difficult to handle in the conventional RL framework, we add a meta policy with a new concept, the "plan", to the standard Actor-Critic model. The plan has a lower dimension than the original image, which makes it tractable for the actor to generate a high-dimensional action. In the RL-I2IT framework, we also employ a task-specific auxiliary learning strategy to stabilize training and improve performance on the corresponding task. Experiments on several I2IT tasks demonstrate the effectiveness and robustness of the proposed method on problems with high-dimensional continuous action spaces. Our implementation of the RL-I2IT framework is available at https://github.com/lesley222/RL-I2IT.
(Copyright © 2025. Published by Elsevier Ltd.)*
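The iterative scheme described in the abstract (state, low-dimensional plan, image-sized action, repeated over several small steps) can be illustrated with a toy loop. This is only a minimal sketch under loud assumptions: the functions `plan_from_state`, `action_from_plan`, and `translate`, the block-averaging planner, the nearest-neighbour actor, and the `step_size` and `factor` parameters are all hypothetical stand-ins, not the networks or training procedure of RL-I2IT. No learning happens here; the sketch only shows the control flow of progressively transforming an image through a plan that is smaller than the image.

```python
import numpy as np

def plan_from_state(source, current, factor=4):
    """Hypothetical planner: compress the state (source, current) into a
    low-dimensional plan by block-averaging their difference."""
    diff = source - current
    h, w = diff.shape
    return diff.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def action_from_plan(plan, factor=4):
    """Hypothetical actor: expand the plan into an image-sized action
    by nearest-neighbour upsampling."""
    return np.kron(plan, np.ones((factor, factor)))

def translate(source, steps=5, step_size=0.5, factor=4):
    """Apply several small updates instead of one single-step prediction.
    Returns every intermediate image, starting from a blank canvas."""
    current = np.zeros_like(source)
    history = [current]
    for _ in range(steps):
        plan = plan_from_state(source, current, factor)   # low-dimensional
        action = action_from_plan(plan, factor)           # image-sized
        current = current + step_size * action            # environment update
        history.append(current)
    return history

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    src = rng.random((8, 8))
    images = translate(src)
    print(len(images), images[-1].shape)
```

In this toy setting the image converges toward a block-averaged version of the source; in the framework described above, the planner and actor would instead be learned policies and the update would be driven by a task reward.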
*Declaration of competing interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.*