
(1) Arinal Haq
(2) * Nanik Suciati
(3) Ngoc Dung Bui
*corresponding author
Abstract: Security inspection is a priority for preventing threats and criminal activities in public places. X-ray imaging assists in the screening of closed luggage; however, interpreting X-ray images is challenging due to the complexity and diversity of prohibited items. This paper proposes ESI-YOLO, an enhanced YOLOv8-based model for prohibited item detection in X-ray security inspection. The model integrates the Efficient Multi-Scale Attention (EMA) module and the Wise-IoU (WIoU) loss function: EMA strengthens multi-scale feature representation, while WIoU improves bounding box regression, particularly in cluttered and overlapping scenes. Comprehensive experiments on the CLCXray and PIDray datasets validate the effectiveness of ESI-YOLO. A systematic exploration of where to place EMA within the YOLOv8 architecture shows that integrating it directly in both the backbone and the neck is the most effective configuration and does not introduce significant computational complexity. Ablation experiments demonstrate that combining EMA and WIoU is synergistic, outperforming the addition of either component alone. ESI-YOLO achieves notable gains over the baseline YOLOv8 model, improving mAP50 by 0.9% on CLCXray and by 3.5% on the challenging hidden subset of PIDray, at a computational cost of 8.4 GFLOPs. Compared with other nano-sized models, ESI-YOLO offers higher accuracy while maintaining computational efficiency, making it a promising solution for practical X-ray security inspection systems.
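For concreteness, the sketch below illustrates the two components named above in PyTorch: an EMA-style attention block of the kind placed in the YOLOv8 backbone and neck, and a Wise-IoU v3 style bounding-box regression loss. This is a minimal sketch assembled from the published EMA and Wise-IoU formulations, not the authors' released code; the class and function names, the grouping factor, and the alpha/delta defaults are illustrative assumptions.

```python
# Illustrative PyTorch sketches of the two components named in the abstract.
# They follow the published EMA and Wise-IoU formulations; names, the grouping
# factor, and alpha/delta defaults are assumptions made for this example.
import torch
import torch.nn as nn


class EMA(nn.Module):
    """EMA-style attention block: grouped channels plus cross-spatial weighting."""

    def __init__(self, channels: int, factor: int = 8):
        super().__init__()
        self.groups = factor
        cg = channels // factor                          # channels per group
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))    # pool along width
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))    # pool along height
        self.agp = nn.AdaptiveAvgPool2d((1, 1))
        self.gn = nn.GroupNorm(cg, cg)
        self.conv1x1 = nn.Conv2d(cg, cg, kernel_size=1)
        self.conv3x3 = nn.Conv2d(cg, cg, kernel_size=3, padding=1)
        self.softmax = nn.Softmax(dim=-1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        g = self.groups
        gx = x.reshape(b * g, c // g, h, w)
        # 1x1 branch: coordinate-attention-like encoding along H and W
        x_h = self.pool_h(gx)                                   # (b*g, cg, h, 1)
        x_w = self.pool_w(gx).permute(0, 1, 3, 2)               # (b*g, cg, w, 1)
        hw = self.conv1x1(torch.cat([x_h, x_w], dim=2))
        x_h, x_w = torch.split(hw, [h, w], dim=2)
        x1 = self.gn(gx * x_h.sigmoid() * x_w.permute(0, 1, 3, 2).sigmoid())
        # 3x3 branch: local context at a second scale
        x2 = self.conv3x3(gx)
        # Cross-spatial learning: each branch re-weights the other
        a1 = self.softmax(self.agp(x1).reshape(b * g, -1, 1).permute(0, 2, 1))
        a2 = self.softmax(self.agp(x2).reshape(b * g, -1, 1).permute(0, 2, 1))
        weights = (a1 @ x2.reshape(b * g, c // g, -1)
                   + a2 @ x1.reshape(b * g, c // g, -1)).reshape(b * g, 1, h, w)
        return (gx * weights.sigmoid()).reshape(b, c, h, w)


def wise_iou_v3(pred, target, iou_loss_mean, alpha=1.9, delta=3.0):
    """Wise-IoU v3 style loss for (x1, y1, x2, y2) boxes of shape (N, 4).

    `iou_loss_mean` is a running mean of the IoU loss kept by the trainer.
    """
    eps = 1e-7
    lt = torch.max(pred[:, :2], target[:, :2])
    rb = torch.min(pred[:, 2:], target[:, 2:])
    wh = (rb - lt).clamp(min=0)
    inter = wh[:, 0] * wh[:, 1]
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)
    loss_iou = 1.0 - iou
    # Smallest enclosing box; detached so it does not receive gradients
    enc_wh = torch.max(pred[:, 2:], target[:, 2:]) - torch.min(pred[:, :2], target[:, :2])
    diag = (enc_wh[:, 0] ** 2 + enc_wh[:, 1] ** 2 + eps).detach()
    # Normalised center distance (WIoU v1 penalty factor)
    ctr_p = (pred[:, :2] + pred[:, 2:]) / 2
    ctr_t = (target[:, :2] + target[:, 2:]) / 2
    r_wiou = torch.exp(((ctr_p - ctr_t) ** 2).sum(dim=1) / diag)
    # Non-monotonic focusing via the outlier degree beta (WIoU v3)
    beta = loss_iou.detach() / iou_loss_mean
    focus = beta / (delta * alpha ** (beta - delta))
    return focus * r_wiou * loss_iou
```

Under these assumptions, EMA blocks would be inserted after selected stages of the YOLOv8 backbone and neck, and wise_iou_v3 would stand in for the default CIoU term in the box-regression loss; the exact insertion points and hyperparameters follow the placement study and training setup reported in the article itself.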
Keywords: YOLOv8; X-Ray Security Inspection; Efficient Multi-Scale Attention (EMA); Wise-IoU; Attention Mechanism
DOI: https://doi.org/10.31763/ijrcs.v5i3.1983
Copyright (c) 2025 Arinal Haq, Nanik Suciati, Ngoc Dung Bui

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
International Journal of Robotics and Control Systems
e-ISSN: 2775-2658
Website: https://pubs2.ascee.org/index.php/IJRCS
Email: ijrcs@ascee.org