Image Detection of Mycobacterium Tuberculosis Using a Combination of Deformable Features and Multi-Scale Attention
Zhou Mengli1, Zhong Mingen1*, Tan Jiawei2, Yuan Bingan2, Deng Zhiying1, Yang Kaibo1
1 (Machine Vision and Artificial Intelligence Laboratory, School of Mechanical and Automotive Engineering, Xiamen University of Technology, Xiamen 361024, Fujian, China)
2 (School of Aerospace Engineering, Xiamen University, Xiamen 361005, Fujian, China)
Abstract: Tuberculosis is a common, frequently occurring, and dangerous infectious disease. At present, it is diagnosed mainly by manual microscopic examination of sputum smears. Because Mycobacterium tuberculosis bacilli in microscopic images are small, adhere to one another, and are irregular in shape, missed and false detections occur easily. To address this, this paper proposes MTDet, a deep-learning-based algorithm for automatically detecting Mycobacterium tuberculosis in sputum microscopic images. First, a lightweight basic feature extraction network is constructed that uses global attention to capture both the spatial relationships among accumulated and adherent bacilli and the local features of individual bacilli. Second, the self-designed deformable feature aggregation module (DC2f) and efficient multi-scale attention (EMA) are used to reconstruct features and adapt to the varied morphology of Mycobacterium tuberculosis. Finally, a high-resolution branch is added to the detection head to improve the model's perception of small targets. Experiments on two publicly available datasets of Mycobacterium tuberculosis microscopic images, Tuberculosis Phonecamera and ZNSM iDB, show that the algorithm achieves average detection accuracies of 90.2% and 87.9% and recall rates of 84.1% and 83.2%, respectively, exceeding existing mainstream algorithms on both datasets. In addition, when 220 clinical samples were assessed against the WHO diagnostic criteria for tuberculosis, the overall accuracy was 96.8%, with a false positive rate of 6.5% and a false negative rate of 0%, suggesting that the method could support the auxiliary diagnosis of tuberculosis.
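The clinical figures quoted in the abstract can be sanity-checked against the usual confusion-matrix definitions. The following is a minimal Python sketch, assuming accuracy = (TP + TN) / total, false positive rate = FP / (FP + TN), and false negative rate = FN / (FN + TP); the abstract does not state which definitions were used. The class name `ConfusionCounts`, the helper `as_percent`, and the per-class counts are hypothetical: the counts were chosen only because they reproduce the reported percentages for 220 samples, and they are not counts reported by the authors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Sample-level outcomes of a binary (TB-positive / TB-negative) test."""
    tp: int  # positive samples called positive
    fn: int  # positive samples called negative
    fp: int  # negative samples called positive
    tn: int  # negative samples called negative

    @property
    def total(self) -> int:
        return self.tp + self.fn + self.fp + self.tn

    def accuracy(self) -> float:
        # Fraction of all samples classified correctly.
        return (self.tp + self.tn) / self.total

    def false_positive_rate(self) -> float:
        # Fraction of truly negative samples that were called positive.
        negatives = self.fp + self.tn
        return self.fp / negatives if negatives else 0.0

    def false_negative_rate(self) -> float:
        # Fraction of truly positive samples that were called negative.
        positives = self.fn + self.tp
        return self.fn / positives if positives else 0.0


def as_percent(rate: float) -> float:
    """Express a rate as a percentage rounded to one decimal place."""
    return round(100.0 * rate, 1)


# HYPOTHETICAL split of the 220 clinical samples (an assumption, not source data):
# 112 positive samples, all detected; 108 negative samples, 7 of them flagged.
example = ConfusionCounts(tp=112, fn=0, fp=7, tn=101)

print(example.total)                              # 220
print(as_percent(example.accuracy()))             # 96.8
print(as_percent(example.false_positive_rate()))  # 6.5
print(as_percent(example.false_negative_rate()))  # 0.0
```

Under these assumed definitions, a false negative rate of 0% means every error is a false positive, so 96.8% accuracy on 220 samples corresponds to 7 misclassified negative samples. The split shown is not unique: 107 negative samples with 7 false positives also rounds to 6.5%.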
Zhou Mengli, Zhong Mingen, Tan Jiawei, Yuan Bingan, Deng Zhiying, Yang Kaibo. Image Detection of Mycobacterium Tuberculosis Using a Combination of Deformable Features and Multi-Scale Attention[J]. Chinese Journal of Biomedical Engineering, 2025, 44(3): 301-311.