A COUGH-BASED COVID-19 DETECTION SYSTEM USING PCA AND MACHINE LEARNING CLASSIFIERS
Elmehdi BENMALEK
elmehdi.benmalek@um5s.net.ma
E2SN, ENSAM de Rabat, Mohammed V University in Rabat (Morocco)
Jamal EL MHAMDI
E2SN, ENSAM de Rabat, Mohammed V University in Rabat (Morocco)
Abdelilah JILBAB
E2SN, ENSAM de Rabat, Mohammed V University in Rabat (Morocco)
Atman JBARI
E2SN, ENSAM de Rabat, Mohammed V University in Rabat (Morocco)
Abstract
Since 2019, the world has faced a health emergency caused by the emergence of the coronavirus disease (COVID-19). About 223 countries have been affected. Medical and health services struggle to manage the disease, which consumes a significant amount of health system resources. Several artificial intelligence-based systems have been designed to detect COVID-19 automatically and so limit the spread of the virus. Researchers have found that the virus has a major impact on voice production because it impairs the respiratory system. In this paper, we investigate the effectiveness of cough analysis for detecting COVID-19. To do so, we performed binary classification, distinguishing COVID-19-positive patients from healthy controls. The recordings come from the Coswara dataset, a crowdsourcing project of the Indian Institute of Science (IISc). After data collection, we extracted Mel-frequency cepstral coefficients (MFCC) from the cough recordings. These acoustic features are fed to four classifiers: a decision tree (DT), k-nearest neighbors (kNN) with k = 3, a support vector machine (SVM), and a deep neural network (DNN). The features are used either directly or after dimensionality reduction with principal component analysis (PCA), retaining 95 percent of the variance or 6 principal components. The 3NN classifier with all features produced the best classification results: it detects COVID-19 patients with an accuracy of 97.48 percent, an F1-score of 96.96 percent, and a Matthews correlation coefficient (MCC) of 0.95. These results suggest that the method can accurately distinguish COVID-19 patients from healthy controls.
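The classification stage described in the abstract (features used directly or after PCA at 95 percent variance, a 3-nearest-neighbor classifier, and evaluation by accuracy, F1-score, and MCC) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The data are synthetic Gaussian vectors standing in for MFCC features, and the 13-feature dimension, the class separation, the 150/50 split, and all function names are assumptions made for illustration only.

```python
import numpy as np


def pca_fit(X, variance=0.95):
    """Return (mean, components), keeping enough components to explain `variance`."""
    mean = X.mean(axis=0)
    # Squared singular values of the centered data are proportional to explained variance.
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = min(int(np.searchsorted(explained, variance)) + 1, len(s))
    return mean, vt[:k]


def pca_transform(X, mean, components):
    """Project X onto the retained principal components."""
    return (X - mean) @ components.T


def knn_predict(X_train, y_train, X_test, k=3):
    """Majority vote over the k nearest training points (labels must be 0/1)."""
    diff = X_test[:, None, :] - X_train[None, :, :]
    dist = (diff ** 2).sum(axis=2)            # squared Euclidean distances
    nearest = np.argsort(dist, axis=1)[:, :k]
    votes = y_train[nearest].sum(axis=1)      # number of positive neighbors
    return (2 * votes > k).astype(int)


def binary_metrics(y_true, y_pred):
    """Return (accuracy, F1-score, MCC) for 0/1 labels."""
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    acc = (tp + tn) / len(y_true)
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    denom = float(np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, f1, mcc


def make_synthetic(n_per_class=100, n_features=13, shift=3.0, seed=0):
    """Hypothetical stand-in for MFCC features: two shifted Gaussian classes."""
    rng = np.random.default_rng(seed)
    healthy = rng.normal(0.0, 1.0, (n_per_class, n_features))
    positive = rng.normal(shift, 1.0, (n_per_class, n_features))
    X = np.vstack([healthy, positive])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    order = rng.permutation(len(y))
    return X[order], y[order]


def run_demo():
    """Compare 3NN on all features with 3NN after PCA (95 percent variance)."""
    X, y = make_synthetic()
    X_tr, X_te, y_tr, y_te = X[:150], X[150:], y[:150], y[150:]
    full = binary_metrics(y_te, knn_predict(X_tr, y_tr, X_te))
    mean, comps = pca_fit(X_tr)  # fit PCA on the training split only
    reduced = binary_metrics(
        y_te,
        knn_predict(pca_transform(X_tr, mean, comps), y_tr,
                    pca_transform(X_te, mean, comps)),
    )
    return full, reduced, comps.shape[0]


if __name__ == "__main__":
    full, reduced, k = run_demo()
    print("3NN, all features (acc, F1, MCC):", full)
    print(f"3NN, PCA with {k} components (acc, F1, MCC):", reduced)
```

Fitting PCA on the training split only keeps test information out of the projection. With scikit-learn available, `PCA(n_components=0.95)` and `KNeighborsClassifier(n_neighbors=3)` would likely be the more convenient equivalents of the hand-written functions above.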
Keywords:
COVID-19, cough recordings, machine learning, PCA, classification
License
This work is licensed under a Creative Commons Attribution 4.0 International License.
All articles published in Applied Computer Science are open-access and distributed under the terms of the Creative Commons Attribution 4.0 International License.