Hierarchical Local-Global Attention in a Multi-Scale Transformer Network for Enhanced Image Denoising

Huimin Chang, Qihui Ding

Abstract


Image denoising aims to recover a clean image from a noise-contaminated one. As noise in real-world scenarios grows more complex, existing denoising methods struggle to handle it effectively. This paper proposes a Multi-Scale Transformer Network (MST-Net) for image denoising. First, we introduce a multi-scale patch embedding strategy in which the noisy image is divided into patches of varying sizes to capture multi-scale features. Second, we propose a Hierarchical Local-Global Attention (HLGA) mechanism. HLGA first computes local attention within each scale and then integrates it with global attention to generate the final attention map. As a result, MST-Net captures long-range dependencies at multiple scales, which helps it suppress complex noise. Additionally, we introduce a cross-scale feature fusion module to improve information integration across scales. Extensive experiments on standard benchmarks, including Set12, BSD68, CBSD68, and Urban100, show that MST-Net achieves state-of-the-art performance. Specifically, MST-Net outperforms existing methods by up to 0.17 dB PSNR on Set12 and 0.15 dB on BSD68 at higher noise levels (σ=75). On color image datasets, MST-Net shows consistent improvements, with a PSNR gain of up to 0.13 dB on Urban100. These results indicate that MST-Net handles diverse noise patterns while balancing computational efficiency and denoising performance, making it a practical option for real-world image denoising applications.
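
To put the reported PSNR figures in context, the following is a minimal sketch of the standard PSNR definition, 10·log10(peak² / MSE). It is not the authors' evaluation code. The function names `psnr` and `mse_ratio_for_gain` are hypothetical, and the sketch assumes 8-bit images (peak value 255) with σ expressed on the same 0 to 255 scale, as is conventional for these benchmarks.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Return PSNR in dB between two equally sized sequences of pixel values.

    Assumes pixel intensities on a 0..peak scale (peak=255 for 8-bit images).
    """
    if len(reference) != len(estimate):
        raise ValueError("images must contain the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Return the factor by which MSE shrinks for a PSNR gain of gain_db."""
    return 10.0 ** (-gain_db / 10.0)


if __name__ == "__main__":
    # A flat black image whose every pixel is off by 75 has MSE = 75**2.
    clean = [0.0] * 16
    noisy = [75.0] * 16
    print(f"PSNR at error magnitude 75: {psnr(clean, noisy):.2f} dB")
    print(f"MSE ratio for a 0.17 dB gain: {mse_ratio_for_gain(0.17):.4f}")
```

Under these assumptions, an image corrupted with σ=75 noise (ignoring clipping) starts at roughly 10.6 dB, and a gain of 0.17 dB corresponds to a mean squared error that is about 4% lower.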




DOI: https://doi.org/10.31449/inf.v49i6.6861

This work is licensed under a Creative Commons Attribution 3.0 License.