*Result*: Unet-like transformer with variable shifted windows for low dose CT denoising.

Title:
Unet-like transformer with variable shifted windows for low dose CT denoising.
Authors:
Li J; Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, People's Republic of China., Qi F; Institute of High Energy Physics, Chinese Academy of Sciences, Beijing, People's Republic of China., Li Y; Institute of High Energy Physics, Chinese Academy of Sciences, Beijing, People's Republic of China., Chen J; Institute of High Energy Physics, Chinese Academy of Sciences, Beijing, People's Republic of China., Pu Y; Dongguan University of Technology, Guangdong Province, Dongguan, People's Republic of China., Wang S; Institute of High Energy Physics, Chinese Academy of Sciences, Beijing, People's Republic of China.
Source:
Biomedical physics & engineering express [Biomed Phys Eng Express] 2026 Mar 18; Vol. 12 (2). Date of Electronic Publication: 2026 Mar 18.
Publication Type:
Journal Article
Language:
English
Journal Info:
Publisher: IOP Publishing Ltd Country of Publication: England NLM ID: 101675002 Publication Model: Electronic Cited Medium: Internet ISSN: 2057-1976 (Electronic) Linking ISSN: 20571976 NLM ISO Abbreviation: Biomed Phys Eng Express Subsets: MEDLINE
Imprint Name(s):
Original Publication: Bristol : IOP Publishing Ltd., [2015]-
Contributed Indexing:
Keywords: CT image denoising; agent attention; medical image processing; transformer architecture; variable shifted windows
Entry Date(s):
Date Created: 20260204 Date Completed: 20260318 Latest Revision: 20260318
Update Code:
20260319
DOI:
10.1088/2057-1976/ae41c5
PMID:
41637761
Database:
MEDLINE

*Further Information*

*Low-dose computed tomography (LDCT) is crucial for reducing radiation exposure in medical imaging, but it often yields noisy images with artifacts that compromise diagnostic accuracy. Recently, Transformer-based models have shown great potential for LDCT denoising by modeling long-range dependencies and global context. However, standard Transformers incur prohibitive computational costs when applied to high-resolution medical images. To address this challenge, we propose a novel pure Transformer architecture for LDCT image restoration, designed within a hierarchical U-Net framework. The core of our innovation is the integration of an agent attention mechanism into a variable shifted-window design. This agent attention module efficiently approximates global self-attention by using a small set of agent tokens to aggregate and broadcast global contextual information, thereby achieving a global receptive field with only linear computational complexity. By embedding this mechanism within a multi-scale U-Net structure, our model effectively captures both fine-grained local details and long-range structural dependencies without sacrificing computational efficiency. Comprehensive experiments on a public LDCT dataset demonstrate that our method achieves state-of-the-art performance, outperforming existing approaches in both quantitative metrics and qualitative visual comparisons.
(© 2026 IOP Publishing Ltd. All rights, including for text and data mining, AI training, and similar technologies, are reserved.)*
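The abstract describes agent attention: a small set of agent tokens first aggregates global context from the keys/values, then broadcasts it back to every query token, giving a global receptive field at linear cost in sequence length. The sketch below is a minimal, hedged illustration of that two-step idea in PyTorch; it is not the authors' implementation, and the pooling choice for forming agent tokens and the parameter `num_agents` are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def agent_attention(q, k, v, num_agents=16):
    """Simplified sketch of agent attention (illustrative, not the paper's code).

    q, k, v: tensors of shape (batch, seq_len, dim).
    A few agent tokens (pooled from the queries here, as one plausible choice)
    aggregate global context, then broadcast it to all query tokens.
    Cost is O(seq_len * num_agents * dim) rather than O(seq_len**2 * dim).
    """
    b, n, d = q.shape
    scale = d ** -0.5
    # Form agent tokens by adaptive average pooling over the query sequence.
    agents = F.adaptive_avg_pool1d(q.transpose(1, 2), num_agents).transpose(1, 2)  # (b, a, d)
    # Step 1: agents attend to keys/values, aggregating global information.
    agent_ctx = F.softmax(agents @ k.transpose(1, 2) * scale, dim=-1) @ v          # (b, a, d)
    # Step 2: each query attends to the agents, receiving the broadcast context.
    out = F.softmax(q @ agents.transpose(1, 2) * scale, dim=-1) @ agent_ctx        # (b, n, d)
    return out
```

In a windowed, U-Net-style model like the one described, a module of this form would replace full self-attention inside each (shifted) window, keeping per-window cost linear while still mixing global context through the agent tokens.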