Hierarchical Local-Global Attention in a Multi-Scale Transformer Network for Enhanced Image Denoising
Abstract
Image denoising aims to remove noise from contaminated images. As noise in real-world scenarios grows more complex, existing denoising methods struggle to handle it effectively. This paper proposes a Multi-Scale Transformer Network (MST-Net) for image denoising. First, we introduce a novel multi-scale patch embedding strategy in which the noisy image is divided into patches of varying scales to capture multi-scale features. Second, we propose a Hierarchical Local-Global Attention (HLGA) mechanism for MST-Net. HLGA first computes local attention within each scale and then integrates it with global attention to generate the final attention map. Consequently, MST-Net can capture long-range dependencies at multiple scales, which helps it reduce complex noise during denoising. Additionally, we introduce a cross-scale feature fusion module to enhance information integration across scales. Extensive experiments on standard benchmarks, including the Set12, BSD68, CBSD68, and Urban100 datasets, demonstrate that MST-Net achieves state-of-the-art performance. Specifically, MST-Net outperforms existing methods by up to 0.17 dB in PSNR on Set12 and 0.15 dB on BSD68 at higher noise levels (σ=75). On color image datasets, MST-Net shows consistent improvements, with a PSNR gain of up to 0.13 dB on Urban100. These results highlight the effectiveness of MST-Net in handling diverse noise patterns while balancing computational efficiency and denoising performance. The proposed approach offers a practical solution for real-world image denoising applications.
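The two ideas named in the abstract, multi-scale patch division and a local attention map combined with a global one, can be illustrated with a minimal NumPy sketch. This is not the MST-Net implementation: the function names, the patch sizes (4 and 8), the window size, the use of raw pixels as tokens (no learned projection), windows taken over the flattened token order rather than 2-D neighbourhoods, and the convex blend controlled by `alpha` are all assumptions made for illustration, since the abstract does not specify how local and global attention are integrated.

```python
import numpy as np

def split_into_patches(image, patch_size):
    """Split an (H, W) image into non-overlapping patches, one flattened patch per row."""
    h, w = image.shape
    assert h % patch_size == 0 and w % patch_size == 0, "image must tile evenly"
    gh, gw = h // patch_size, w // patch_size
    patches = image.reshape(gh, patch_size, gw, patch_size).transpose(0, 2, 1, 3)
    return patches.reshape(gh * gw, patch_size * patch_size)

def softmax(x, axis=-1):
    """Numerically stable softmax; entries equal to -inf receive weight 0."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def global_attention_map(tokens):
    """Scaled dot-product attention weights between all pairs of tokens."""
    d = tokens.shape[-1]
    return softmax(tokens @ tokens.T / np.sqrt(d))

def local_attention_map(tokens, window):
    """Same scores, but each token only attends to tokens in its own window.

    Assumption: windows are consecutive runs of `window` tokens in flattened order.
    """
    n, d = tokens.shape
    scores = tokens @ tokens.T / np.sqrt(d)
    block = np.arange(n) // window
    same_window = block[:, None] == block[None, :]
    return softmax(np.where(same_window, scores, -np.inf))

def hlga_sketch(image, patch_sizes=(4, 8), window=2, alpha=0.5):
    """Return one blended attention map per patch scale.

    Assumption: local and global maps are mixed as alpha*local + (1-alpha)*global.
    """
    maps = {}
    for p in patch_sizes:
        tokens = split_into_patches(image, p)
        local = local_attention_map(tokens, window)
        glob = global_attention_map(tokens)
        maps[p] = alpha * local + (1.0 - alpha) * glob
    return maps

rng = np.random.default_rng(0)
noisy = rng.normal(size=(16, 16))  # stand-in for a noisy grayscale image
maps = hlga_sketch(noisy)
for p, m in maps.items():
    print(f"patch size {p}: attention map shape {m.shape}")
```

Because each of the two maps is row-normalised, their convex blend is row-normalised as well, so every token's weights at every scale still sum to one. A real model would add learned query, key, and value projections and a fusion step across scales.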
DOI: https://doi.org/10.31449/inf.v49i6.6861
This work is licensed under a Creative Commons Attribution 3.0 License.