Abstract: Automatic segmentation of medical images, especially of lesions and organs, has broad and important clinical value. Segmentation based on conventional image processing can only use shallow features extracted by shallow models to identify regions of interest, and it requires considerable manual intervention. Segmentation methods based on machine learning, in turn, have limitations and lack interpretability in modeling. This paper presented a 3D medical image segmentation method that combines a Transformer and a convolutional neural network (CNN) with morphological structure constraints. In the encoder, the CNN and the Transformer were used to build a U-shaped network that extracts multiple kinds of features; in the decoder, up-sampling was applied and features from different levels were concatenated through skip connections. A morphological structure constraint module was added to enhance the interpretability of the model by extracting the shape information of segmentation targets such as lesions and organs. Max pooling and average pooling were applied to the CNN outputs to extract more representative features as the input of the morphological structure module, which improved the accuracy of the final segmentation results. The Dice similarity coefficient (DSC) and Hausdorff distance (HD) were used to verify the effectiveness of the proposed algorithm on the public datasets Synapse and ACDC. On the Synapse dataset, 18 cases were used as the training set and 12 cases as the test set; on the ACDC dataset, 70 cases were used as the training set, 10 cases as the validation set, and 20 cases as the test set.
The experimental results showed that, on Synapse, the average DSC and HD reached 76.67% and 25.18 mm with the SGD optimizer and 82.80% and 21.07 mm with the Adam optimizer; on ACDC, the average DSC reached 90.65% (SGD) and 91.75% (Adam). Compared with other methods, the proposed method showed certain advantages. These results indicate that the proposed method reduced mis-segmentation in 3D medical image segmentation and improved segmentation performance.
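As an illustration of the two evaluation indexes named above, the following minimal Python sketch computes DSC and HD for a pair of 3D binary masks represented as sets of voxel coordinates. It is not the authors' evaluation code: the function names, the `spacing` parameter (voxel size in mm), the convention for two empty masks, and the brute-force maximum form of HD are all assumptions made for this example, and the abstract does not state which HD variant was used.

```python
import math


def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two sets of voxel coordinates."""
    pred, truth = set(pred), set(truth)
    if not pred and not truth:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * len(pred & truth) / (len(pred) + len(truth))


def hausdorff_distance(pred, truth, spacing=(1.0, 1.0, 1.0)):
    """Symmetric Hausdorff distance (maximum form), brute force.

    `spacing` is a hypothetical voxel size in mm along each axis, so the
    result is expressed in mm.
    """
    pred, truth = list(set(pred)), list(set(truth))
    if not pred or not truth:
        raise ValueError("Hausdorff distance is undefined for an empty mask")

    def dist(a, b):
        # Euclidean distance between two voxels, scaled by the voxel size.
        return math.sqrt(sum(((i - j) * s) ** 2 for i, j, s in zip(a, b, spacing)))

    def directed(src, dst):
        # Largest distance from a voxel in src to its nearest voxel in dst.
        return max(min(dist(a, b) for b in dst) for a in src)

    return max(directed(pred, truth), directed(truth, pred))


# Toy masks: the prediction is the ground truth shifted by one voxel
# along the last axis.
truth = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
pred = {(0, 0, 1), (0, 1, 1), (0, 0, 2), (0, 1, 2)}

print(dice_coefficient(pred, truth))    # 0.5
print(hausdorff_distance(pred, truth))  # 1.0
```

DSC rewards volume overlap, while HD penalizes the worst boundary error, which is why the two are usually reported together. The brute-force search is quadratic in the number of voxels, so a practical evaluation would typically restrict it to surface voxels or use a library routine.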
Li Jun, Ye Xinyi, Yang Changcai, Chen Qiufeng, Xue Lanyan, Wei Lifang. 3D Medical Image Segmentation Based on Joined Depth Network and Morphological Structure Constraints [J]. Chinese Journal of Biomedical Engineering, 2023, 42(1): 30-40.