Hybrid Deep Learning Model for Multi-Source Remote Sensing Data Fusion: Integrating DenseNet and Swin Transformer for Spatial Alignment and Feature Extraction
Abstract
The integration of multi-source remote sensing data, including Synthetic Aperture Radar (SAR), optical, and hyperspectral imagery, is critical for enhancing Earth Observation Systems but is challenged by high-dimensional variability and spatial misalignment. This study proposes a hybrid deep learning model that combines DenseNet-121 for local feature extraction, Swin-Tiny for global context modeling, and a cross-attention matching module for precise data fusion. The methodology comprises preprocessing (normalization, resizing to 224×224, and augmentation), feature extraction, hierarchical refinement, and similarity-based alignment. Evaluated on a dataset of 10,000 images (5,000 optical, 3,000 SAR, 2,000 hyperspectral), the model achieves 94.6% accuracy, 88.7% SSIM, and 15.4% RMSE, outperforming DenseNet-only (89.5% accuracy, 82.3% SSIM, 19.8% RMSE) and Swin Transformer-only (91.0% accuracy, 85.1% SSIM, 17.2% RMSE) baselines. It also surpasses state-of-the-art methods such as SwinV2DNet (92.3%) and STransFuse (90.8%) by 2.3-3.8% in accuracy. With an inference time of 0.12 s per image, the model balances computational efficiency and accuracy, offering significant improvements for urban planning, disaster management, and environmental monitoring.
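For illustration, the sketch below shows one way the pipeline the abstract describes could be assembled in PyTorch. Only the backbone names (DenseNet-121, Swin-Tiny), the 224×224 input size, the preprocessing steps, and the presence of a cross-attention matching module come from the abstract; the projection width, attention head count, augmentation choices, normalization statistics, and classifier head are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
from torchvision import models, transforms

# Preprocessing per the abstract: resize to 224x224, augmentation, and
# normalization. The flip augmentation and ImageNet statistics are assumed;
# SAR/hyperspectral inputs would also need modality-specific channel
# adaptation, which is omitted here.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])


class CrossAttentionFusion(nn.Module):
    """Hypothetical stand-in for the cross-attention matching module:
    one branch's tokens attend to the other's, with a residual add."""

    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, query_feats, key_feats):
        fused, _ = self.attn(query_feats, key_feats, key_feats)
        return self.norm(query_feats + fused)


class HybridFusionNet(nn.Module):
    """DenseNet-121 (local features) and Swin-Tiny (global context),
    fused by cross-attention and pooled into a classification head.
    Projection width, head count, and the classifier are assumptions."""

    def __init__(self, num_classes: int = 10, dim: int = 256):
        super().__init__()
        self.local_backbone = models.densenet121(weights="IMAGENET1K_V1").features
        self.global_backbone = models.swin_t(weights="IMAGENET1K_V1").features
        self.proj_local = nn.Linear(1024, dim)   # DenseNet-121 feature channels
        self.proj_global = nn.Linear(768, dim)   # Swin-Tiny final embed dim
        self.fusion = CrossAttentionFusion(dim)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x):
        # Local CNN features: (B, 1024, 7, 7) -> token sequence (B, 49, dim).
        local = self.local_backbone(x).flatten(2).transpose(1, 2)
        local = self.proj_local(local)
        # Global Swin features (torchvision emits NHWC): (B, 7, 7, 768) -> (B, 49, dim).
        glob = self.global_backbone(x).flatten(1, 2)
        glob = self.proj_global(glob)
        # Local tokens query the global context; mean-pool for the head.
        fused = self.fusion(local, glob)
        return self.head(fused.mean(dim=1))


if __name__ == "__main__":
    model = HybridFusionNet(num_classes=10).eval()
    with torch.no_grad():
        logits = model(torch.randn(1, 3, 224, 224))
    print(logits.shape)  # torch.Size([1, 10])
```

Mean pooling before the head is one design choice among several; the similarity-based alignment step the abstract mentions could instead compare fused token sequences (e.g., via cosine similarity) across modalities.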
References
H. M. Albarakati, M. A. Khan, A. Hamza, F. Khan, N. Kraiem, L. Jamel, L. Almuqren, and R. Alroobaea, “A novel deep learning architecture for agriculture land cover and land use classification from remote sensing images based on network-level fusion of self-attention architecture,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2024.
W. Chen, X. Li, X. Qin, and L. Wang, “Remote sensing lithology intelligent segmentation based on multi-source data,” in Remote Sensing Intelligent Interpretation for Geology: From Perspective of Geological Exploration, pp. 117–163, Springer, 2024.
L. Gao, H. Liu, M. Yang, L. Chen, Y. Wan, Z. Xiao, and Y. Qian, “STransFuse: Fusing swin transformer and convolutional neural network for remote sensing image semantic segmentation,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 14, pp. 10990–11003, 2021.
S. Hao, N. Li, and Y. Ye, “Inductive biased swin-transformer with cyclic regressor for remote sensing scene classification,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2023.
J. He, Q. Yuan, J. Li, Y. Xiao, X. Liu, and Y. Zou, “DsTer: A dense spectral transformer for remote sensing spectral super-resolution,” International Journal of Applied Earth Observation and Geoinformation, vol. 109, p. 102773, 2022.
A. Jha, S. Bose, and B. Banerjee, “GAF-Net: Improving the performance of remote sensing image fusion using novel global self and cross attention learning,” in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 6354–6363, 2023.
S. Liu, Y. Wang, H. Wang, Y. Xiong, Y. Liu, and C. Xie, “Convolution and transformer based hybrid neural network for road extraction in remote sensing images,” in 2024 IEEE International Conference on Mechatronics and Automation (ICMA), pp. 471–476, IEEE, 2024.
R. Luo, Y. Song, L. Ye, and R. Su, “Dense-TNT: Efficient vehicle type classification neural network using satellite imagery,” Sensors, vol. 24, no. 23, p. 7662, 2024.
H. Song, Y. Yuan, Z. Ouyang, Y. Yang, and H. Xiang, “Quantitative regularization in robust vision transformer for remote sensing image classification,” The Photogrammetric Record, vol. 39, no. 186, pp. 340–372, 2024.
B. Sun, G. Liu, and Y. Yuan, “F3-Net: Multiview scene matching for drone-based geo-localization,” IEEE Transactions on Geoscience and Remote Sensing, vol. 61, pp. 1–11, 2023.
A. Thapa, T. Horanont, B. Neupane, and J. Aryal, “Deep learning for remote sensing image scene classification: A review and meta-analysis,” Remote Sensing, vol. 15, no. 19, p. 4804, 2023.
Z. Wang, M. Xia, L. Weng, K. Hu, and H. Lin, “Dual encoder-decoder network for land cover segmentation of remote sensing image,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2023.
R. Wang, M. Cai, Z. Xia, and Z. Zhou, “Remote sensing image road segmentation method integrating CNN-transformer and U-Net,” IEEE Access, 2023.
Z. Wang, L. Zhao, J. Meng, Y. Han, X. Li, R. Jiang, J. Chen, and H. Li, “Deep learning-based cloud detection for optical remote sensing images: A survey,” Remote Sensing, vol. 16, no. 23, p. 4583, 2024.
M. Zeng and N. Xiao, “Effective combination of DenseNet and BiLSTM for keyword spotting,” IEEE Access, vol. 7, pp. 10767–10775, 2019.
Y. Xiao, Q. Yuan, K. Jiang, J. He, X. Jin, and L. Zhang, “EDiffSR: An efficient diffusion probabilistic model for remote sensing image super-resolution,” IEEE Transactions on Geoscience and Remote Sensing, 2023.
L. Zhang, L. Zhang, and B. Du, “Deep learning for remote sensing data: A technical tutorial on the state of the art,” IEEE Geoscience and Remote Sensing Magazine, vol. 4, no. 2, pp. 22–40, 2016.
K. Zhang, Y. Guo, X. Wang, J. Yuan, and Q. Ding, “Multiple feature reweight DenseNet for image classification,” IEEE Access, vol. 7, pp. 9872–9880, 2019.
C. Zhang, L. Wang, S. Cheng, and Y. Li, “SwinSUNet: Pure transformer network for remote sensing image change detection,” IEEE Transactions on Geoscience and Remote Sensing, vol. 60, pp. 1–13, 2022.
J. Zhou, X. Gu, H. Gong, X. Yang, Q. Sun, L. Guo, and Y. Pan, “Intelligent classification of maize straw types from UAV remote sensing images using DenseNet201 deep transfer learning algorithm,” Ecological Indicators, vol. 166, p. 112331, 2024.
DOI: https://doi.org/10.31449/inf.v49i24.8395

This work is licensed under a Creative Commons Attribution 3.0 License.