*Result*: U-Net-like Transformer with variable shifted windows for low-dose CT denoising.
*Further Information*
*Low-dose computed tomography (LDCT) is crucial for reducing radiation exposure in medical imaging, but it often yields noisy images with artifacts that compromise diagnostic accuracy. Recently, Transformer-based models have shown great potential for LDCT denoising by modeling long-range dependencies and global context. However, standard Transformers incur prohibitive computational costs when applied to high-resolution medical images. To address this challenge, we propose a novel pure Transformer architecture for LDCT image restoration, designed within a hierarchical U-Net framework. The core of our innovation is the integration of an agent attention mechanism into a variable shifted-window design. This agent attention module efficiently approximates global self-attention by using a small set of agent tokens to aggregate and broadcast global contextual information, thereby achieving a global receptive field with only linear computational complexity. By embedding this mechanism within a multi-scale U-Net structure, our model effectively captures both fine-grained local details and long-range structural dependencies without sacrificing computational efficiency. Comprehensive experiments on a public LDCT dataset demonstrate that our method achieves state-of-the-art performance, outperforming existing approaches in both quantitative metrics and qualitative visual comparisons.
(© 2026 IOP Publishing Ltd. All rights, including for text and data mining, AI training, and similar technologies, are reserved.)*
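The agent attention described in the abstract, where a small set of agent tokens first aggregates global context from the keys/values and then broadcasts it back to every query, can be sketched as below. This is a minimal single-head NumPy illustration, not the authors' implementation: the strided pooling used to form the agent tokens, the function name, and the `num_agents` parameter are assumptions for illustration. With `n` agent tokens, both attention maps cost O(N·n·d) instead of the O(N²·d) of full self-attention, which is the linear-complexity property the abstract claims.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def agent_attention(Q, K, V, num_agents=8):
    """Sketch of agent attention for one head.

    Agent tokens (here: a strided subsample of the queries, an
    assumption; the paper may derive them differently) aggregate
    global context from K/V, then broadcast it to all N queries.
    Both attention maps are (N x n) or (n x N), so the cost is
    linear in the sequence length N.
    """
    N, d = Q.shape
    idx = np.linspace(0, N - 1, num_agents).astype(int)
    A = Q[idx]                                    # agent tokens, (n, d)
    # Step 1 (aggregate): agents attend to all keys/values.
    Va = softmax(A @ K.T / np.sqrt(d)) @ V        # (n, d)
    # Step 2 (broadcast): every query attends to the agent tokens.
    return softmax(Q @ A.T / np.sqrt(d)) @ Va     # (N, d)
```

In the paper's architecture this module would replace windowed self-attention inside each variable shifted-window block of the hierarchical U-Net, giving each block a global receptive field at linear cost.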