AUTOMATIC IDENTIFICATION OF DYSPHONIAS USING MACHINE LEARNING ALGORITHMS
Issue Vol. 19 No. 4 (2023)
Miguel Angel BELLO RIVERA, Carlos Alberto REYES GARCÍA, Tania Cristal TALAVERA ROJAS, Perfecto Malaquías QUINTERO FLORES, Rodolfo Eleazar PÉREZ LOAIZA
Pages 14-25
Abstract
Dysphonia is a prevalent symptom of some respiratory diseases that impairs voice quality, sometimes for prolonged periods. To diagnose it, speech-language pathologists use acoustic parameters to evaluate patients objectively and determine the type of dysphonia involved, such as hyperfunctional or hypofunctional dysphonia; the distinction matters because each type requires a different treatment. In artificial intelligence, the problem has been addressed by using acoustic parameters as input data to train machine learning and deep learning models. However, these models usually aim only to identify whether a patient is ill, performing binary classification between healthy voices and voices with dysphonia rather than distinguishing between types of dysphonia. In this paper, the harmonics-to-noise ratio, smoothed cepstral peak prominence, zero-crossing rate and the means of Mel-frequency cepstral coefficients 2-19 are used for multiclass classification of voices with euphony, hyperfunction and hypofunction by means of six machine learning algorithms: random forest, k-nearest neighbors, logistic regression, decision trees, support vector machines and naive Bayes. The .632 bootstrap was used to evaluate which algorithm best identifies the three voice classes. The best result was obtained by the k-nearest neighbors model, whose confidence interval for accuracy ranges from 87% to 92%. The results can be applied in the development of a complementary application for clinical diagnosis or for monitoring a patient under the supervision of a specialist.
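The evaluation step named in the abstract, the .632 bootstrap, can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' implementation: the data are synthetic 2-D points standing in for the real acoustic features, the k-nearest neighbors classifier is a hand-written toy, and the percentile interval is one possible way to summarize the spread of scores (the abstract does not say how its interval was computed). All function names are hypothetical.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of test points whose predicted label matches the true one."""
    hits = sum(knn_predict(train_X, train_y, x, k) == t
               for x, t in zip(test_X, test_y))
    return hits / len(test_y)


def bootstrap_632(X, y, n_rounds=100, k=3, seed=0):
    """Return (mean .632 accuracy, sorted per-round .632 scores)."""
    rng = random.Random(seed)
    n = len(X)
    # Resubstitution accuracy: fit on all data, score on the same data.
    resub = accuracy(X, y, X, y, k)
    scores = []
    for _ in range(n_rounds):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        chosen = set(idx)
        oob = [i for i in range(n) if i not in chosen]  # out-of-bag points
        if not oob:
            continue
        oob_acc = accuracy([X[i] for i in idx], [y[i] for i in idx],
                           [X[i] for i in oob], [y[i] for i in oob], k)
        # Equivalent to 0.368 * resub + 0.632 * oob_acc.
        scores.append(resub + 0.632 * (oob_acc - resub))
    scores.sort()
    return sum(scores) / len(scores), scores


# Synthetic stand-in data (assumption): three well-separated Gaussian clusters.
data_rng = random.Random(42)
centers = {"euphony": (0.0, 0.0),
           "hyperfunction": (5.0, 5.0),
           "hypofunction": (10.0, 0.0)}
X, y = [], []
for label, (cx, cy) in centers.items():
    for _ in range(30):
        X.append((data_rng.gauss(cx, 1.0), data_rng.gauss(cy, 1.0)))
        y.append(label)

estimate, scores = bootstrap_632(X, y)
# Percentile interval over the per-round scores (an assumed summary).
low = scores[int(0.025 * len(scores))]
high = scores[min(len(scores) - 1, int(0.975 * len(scores)))]
print(f".632 accuracy: {estimate:.3f}, 95% interval: [{low:.3f}, {high:.3f}]")
```

The weighting reflects the fact that a bootstrap sample contains about 63.2% of the distinct original observations on average. Out-of-bag accuracy alone tends to be pessimistic and resubstitution accuracy optimistic, so the estimator blends the two.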
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
All articles published in Applied Computer Science are open-access and distributed under the terms of the Creative Commons Attribution 4.0 International License.
