CNN AND LSTM FOR THE CLASSIFICATION OF PARKINSON'S DISEASE BASED ON THE GTCC AND MFCC

Nouhaila BOUALOULOU

nouhailaboualoulou21@gmail.com
Laboratory Electrical and Industrial Engineering, Information Processing, Informatics, and Logistics (GEITIIL) (Morocco)

Taoufiq BELHOUSSINE DRISSI


Laboratory Electrical and Industrial Engineering, Information Processing, Informatics, and Logistics (Morocco)
https://orcid.org/0000-0003-2958-070X

Benayad NSIRI


Research Center STIS, M2CS, National Higher School of Arts and Craft, Rabat (ENSAM) (Morocco)
https://orcid.org/0000-0003-3885-9534

Abstract

Parkinson's disease is a recognizable clinical syndrome with a variety of causes and clinical presentations; it is a rapidly growing neurodegenerative disorder. Since about 90 percent of people with Parkinson's disease show some form of early speech impairment, recent studies on telediagnosis of the disease have focused on recognizing voice impairments from vowel phonations or the subjects' discourse. In this paper, we present a new approach for detecting Parkinson's disease (PD) from speech sounds. The approach is based on CNN and LSTM classifiers and uses two categories of features, Mel Frequency Cepstral Coefficients (MFCC) and Gammatone Cepstral Coefficients (GTCC), obtained from denoised speech signals, with a comparative analysis of EMD-DWT and DWT-EMD denoising. The proposed model is divided into three stages. In the first stage, noise is removed from the signals using the EMD-DWT and DWT-EMD methods. In the second stage, the GTCC and MFCC are extracted from the enhanced audio signals. In the third stage, classification is carried out by feeding these features into the LSTM and CNN models, which are designed to capture sequential information from the extracted features. The experiments are performed on the PC-GITA and Sakar datasets using 10-fold cross-validation. The highest classification accuracy on the Sakar dataset reached 100% for both EMD-DWT-GTCC-CNN and DWT-EMD-GTCC-CNN; on the PC-GITA dataset, accuracy reached 100% for EMD-DWT-GTCC-CNN and 96.55% for DWT-EMD-GTCC-CNN. The results of this study indicate that GTCC features are more appropriate and accurate for the assessment of PD than MFCC features.
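The two feature families compared in the abstract differ mainly in the frequency scale on which their filter banks are spaced: MFCC filters are conventionally spaced on the mel scale, while gammatone filters are conventionally spaced on the ERB-rate scale. The following is a minimal sketch of that difference only, not the authors' implementation. The frequency range (50 to 8000 Hz), the number of filters (20), and all function names are assumptions chosen for illustration; the abstract does not state the filter bank settings used in the paper.

```python
import math


def hz_to_mel(f):
    """Convert frequency in Hz to the mel scale (common 2595*log10 form)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def hz_to_erb_rate(f):
    """Convert frequency in Hz to the ERB-rate scale (Glasberg and Moore form)."""
    return 21.4 * math.log10(1.0 + 0.00437 * f)


def erb_rate_to_hz(e):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (e / 21.4) - 1.0) / 0.00437


def center_frequencies(f_min, f_max, n_filters, to_scale, from_scale):
    """Return n_filters center frequencies (Hz) equally spaced on a warped scale.

    The centers are the interior points of n_filters + 2 equally spaced
    positions between f_min and f_max on the chosen scale.
    """
    lo, hi = to_scale(f_min), to_scale(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [from_scale(lo + step * (k + 1)) for k in range(n_filters)]


# Assumed illustrative settings, not taken from the paper.
F_MIN, F_MAX, N_FILTERS = 50.0, 8000.0, 20

mel_centers = center_frequencies(F_MIN, F_MAX, N_FILTERS, hz_to_mel, mel_to_hz)
erb_centers = center_frequencies(F_MIN, F_MAX, N_FILTERS, hz_to_erb_rate, erb_rate_to_hz)

if __name__ == "__main__":
    print("filter   mel-spaced (Hz)   ERB-spaced (Hz)")
    for k, (fm, fe) in enumerate(zip(mel_centers, erb_centers), start=1):
        print(f"{k:6d}   {fm:15.1f}   {fe:15.1f}")
```

Running the script prints the two sets of center frequencies side by side, which shows how each scale distributes the same number of filters over the same band. A full MFCC or GTCC extractor would additionally apply the filter shapes (triangular or gammatone), take log energies, and apply a discrete cosine transform; those steps are omitted here.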


Keywords:

Parkinson's disease; voice signal; GTCC; MFCC; DWT; EMD; CNN; LSTM

Published
2023-06-30

BOUALOULOU, N., BELHOUSSINE DRISSI, T., & NSIRI, B. (2023). CNN AND LSTM FOR THE CLASSIFICATION OF PARKINSON’S DISEASE BASED ON THE GTCC AND MFCC. Applied Computer Science, 19(2), 1–24. https://doi.org/10.35784/acs-2023-11

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

All articles published in Applied Computer Science are open-access and distributed under the terms of the Creative Commons Attribution 4.0 International License.

