Classification of Dysphonia in Parkinson's Disease Based on Hierarchical Fractional Spectrogram
Xue Zaifa1,2, Lu Huibin1,2, Lin Liqin3, Zhang Tao1,2*
1(School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, Hebei, China)
2(Hebei Key Laboratory of Information Transmission and Signal Processing, Qinhuangdao 066004, Hebei, China)
3(Hardware Product R&D Center, Hangzhou Hikvision Digital Technology Co., Ltd., Hangzhou 310051, China)
Abstract: Dysphonia is one of the early symptoms of Parkinson's disease. Most existing deep learning-based methods for classifying dysphonia in Parkinson's disease combine a spectrogram with a convolutional neural network. The spectrogram, however, observes the signal from a single angle, and the convolutional neural network has a restricted receptive field, so both extract insufficient information. This paper proposes a classification method for Parkinson's disease based on a hierarchical fractional spectrogram. First, angle rotation factors are added to transform the dysphonia signal into a fractional spectrogram, which enhances the ability to extract energy information from different angles. Next, the parameters of a Swin Transformer network pre-trained on ImageNet are transferred and fine-tuned to address the small data size. Finally, the combination of a hierarchical structure and a shifted window-based self-attention mechanism expands the receptive field and realizes multi-scale information fusion, which effectively improves the classification accuracy. On Database-1 (240 samples collected by the Department of Neurology of Medicine, Istanbul University) and Database-2 (1,404 samples collected by Tangshan Workers' Hospital and Kailuan Mental Health Center), the proposed method showed good stability and achieved accuracies of 97.80% and 98.75%, respectively, outperforming all compared advanced methods. The proposed method provides a new perspective for analyzing articulation disorders in Parkinson's disease.
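The abstract does not give the transform itself, so the following is only a minimal sketch of what a fractional spectrogram could look like: each windowed frame is passed through a directly discretized fractional Fourier transform (FrFT) with rotation angle alpha, and the squared magnitude is kept. The function names (frft_kernel, fractional_spectrogram), the Hann window, the frame length, the hop size, and the sampling grid are all assumptions made for illustration, not the authors' implementation. At alpha = pi/2 the kernel reduces to the ordinary Fourier kernel, so the output coincides with a conventional spectrogram.

```python
import numpy as np


def frft_kernel(n_points, alpha):
    """Discretized FrFT kernel matrix K[k, n] for rotation angle alpha (radians).

    Assumption: centered, dimensionless grids t = n*d and u = k*d with
    d = sqrt(2*pi/N). alpha must not be a multiple of pi, where the
    continuous kernel degenerates to a delta function.
    """
    d = np.sqrt(2.0 * np.pi / n_points)
    grid = (np.arange(n_points) - n_points // 2) * d
    u = grid[:, None]  # output (fractional-domain) axis
    t = grid[None, :]  # input (time) axis
    cot = 1.0 / np.tan(alpha)
    csc = 1.0 / np.sin(alpha)
    amp = np.sqrt((1.0 - 1j * cot) / (2.0 * np.pi))
    # Chirp-modulated kernel; the trailing d is the Riemann-sum step size.
    return amp * np.exp(1j * (0.5 * (u**2 + t**2) * cot - u * t * csc)) * d


def fractional_spectrogram(signal, alpha, frame_len=256, hop=128):
    """Squared-magnitude short-time FrFT, shape (frame_len, n_frames)."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    kernel = frft_kernel(frame_len, alpha)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # Apply the kernel to every frame at once: out[f, k] = sum_n frames[f, n] * K[k, n]
    coeffs = frames @ kernel.T
    return (np.abs(coeffs) ** 2).T
```

Sweeping alpha over several values, for example np.linspace(0.1, 0.5, 5) * np.pi, would yield one image per angle; how such images are selected or combined before being fed to the Swin Transformer is not stated in the abstract and is left open here.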
Received: 11 January 2024
Corresponding Authors:
*E-mail: zhtao@ysu.edu.cn