Abstract: Deep learning has been widely applied to the video analysis of minimally invasive surgery and has achieved notable results in surgical tool detection and tracking, surgical tool presence detection, and workflow recognition. In the long run, detailed analysis of minimally invasive surgery video content can not only automatically identify the surgical task in progress but also alert clinicians to possible complications. In recent years, as the technology has continued to develop, the application of deep learning to this field has advanced considerably. This paper first describes the significance, difficulties, and relevant technical background of minimally invasive surgery video analysis, focusing on the advantages of deep-learning-based algorithms. It then summarizes the research achievements of deep learning in surgical tool detection and tracking, surgical tool presence detection, and workflow recognition, classifies the algorithms by their characteristics within each of these areas, and evaluates their results. Finally, the paper summarizes the field and discusses future directions for minimally invasive surgery video analysis.
史攀, 赵子健. 深度学习在微创手术视频分析中的应用研究综述[J]. 中国生物医学工程学报, 2020, 39(4): 473-484.
Shi Pan, Zhao Zijian. Review on the Application of Deep Learning in Video Analysis of Minimally Invasive Surgery. Chinese Journal of Biomedical Engineering, 2020, 39(4): 473-484.