Deep Learning Architecture with Adaptive Attention and Multi-Scale Fusion for Infrared Spectrum Target Recognition
Abstract
With growing demand for accurate infrared spectrum analysis in industrial, military, and medical applications, traditional methods often fall short because their feature extraction and recognition capabilities are limited. This article proposes a deep learning model that combines an adaptive attention module, a multi-scale feature fusion module, and a classification decision module to improve recognition performance. The model is trained by backpropagation with a cross-entropy loss function and an exponentially decaying learning rate for more than 100 epochs. Experiments are conducted on three datasets: NATO RTO SET-103, Thermal IR Benchmark, and FLIR Thermal. The model achieves an average feature extraction accuracy of 90.8% and a target recognition accuracy of 89.7%, significantly surpassing traditional models such as DenseNet, ResNet, VGGNet, and Basic CNN. Performance remains stable under changing data distributions, indicating good generalizability. The results support the model's ability to extract important infrared features and recognize targets accurately, offering an effective solution to real-world problems in infrared spectrum analysis.
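The training setup named in the abstract (cross-entropy loss with an exponentially decaying learning rate) can be illustrated with a minimal, dependency-free sketch. The initial learning rate, the decay factor, the epoch count of exactly 100, and all function names below are assumptions chosen for illustration; the abstract does not state the values or the implementation used by the authors.

```python
import math

# Hypothetical hyperparameters: the abstract gives no concrete values.
INITIAL_LR = 0.01   # assumed starting learning rate
DECAY_RATE = 0.95   # assumed multiplicative decay per epoch
NUM_EPOCHS = 100    # the abstract only says "more than 100"


def exponential_decay_lr(initial_lr, decay_rate, epoch):
    """Learning rate at a given epoch: initial_lr * decay_rate ** epoch."""
    return initial_lr * decay_rate ** epoch


def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Cross-entropy loss for one sample with an integer class label."""
    probs = softmax(logits)
    return -math.log(probs[target_index])


# Learning rate for each epoch of the assumed schedule.
schedule = [
    exponential_decay_lr(INITIAL_LR, DECAY_RATE, epoch)
    for epoch in range(NUM_EPOCHS)
]

# Loss for one made-up three-class prediction whose true class is index 0.
example_loss = cross_entropy([2.0, 0.5, -1.0], target_index=0)
```

In this sketch the learning rate shrinks by a constant factor each epoch, so early epochs take large steps and later epochs refine the weights. The loss is smallest when the score of the true class dominates the others.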
DOI: https://doi.org/10.31449/inf.v49i8.9389
 
This work is licensed under a Creative Commons Attribution 3.0 License.