COMPARISON OF OPTIMIZATION ALGORITHMS OF A TEMPORAL CLASSIFIER FOR A SPEECH RECOGNITION SYSTEM

Yedilkhan Amirgaliyev

amir_ed@mail.ru
1 Institute of Information and Computational Technologies CS MES RK, Almaty, Kazakhstan; 2 Suleyman Demirel University, Almaty, Kazakhstan
http://orcid.org/0000-0002-6528-0619

Kuanyshbay Kuanyshbay


1 Institute of Information and Computational Technologies CS MES RK, Almaty, Kazakhstan; 2 Suleyman Demirel University, Almaty, Kazakhstan
http://orcid.org/0000-0001-5952-8609

Aisultan Shoiynbek


1 Institute of Information and Computational Technologies CS MES RK, Almaty, Kazakhstan; 2 Suleyman Demirel University, Almaty, Kazakhstan
http://orcid.org/0000-0002-9328-8300

Abstract

This article evaluates and compares the performance of three well-known optimization algorithms (Adagrad, Adam, Momentum) used to accelerate the training of the neural network in the CTC (Connectionist Temporal Classification) algorithm for speech recognition. For the CTC algorithm, a recurrent neural network was used, specifically an LSTM, which is an effective and widely used model. The data were taken from the VCTK corpus of the University of Edinburgh. The results of the optimization algorithms were evaluated using the label error and CTC loss metrics.
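
The training setup described in the abstract can be sketched as follows. This is a minimal, hypothetical PyTorch illustration, not the authors' implementation: a bidirectional LSTM acoustic model trained with CTC loss, with the three optimizers compared in the article (Adagrad, Adam, Momentum). Layer sizes, learning rates, the number of output classes and the train_step helper are assumptions made for illustration only; feature extraction from the VCTK recordings and the label error computation are omitted.

import torch
import torch.nn as nn

class LstmCtcModel(nn.Module):
    """Bidirectional LSTM over acoustic feature frames, projected to
    character classes (including the CTC blank symbol)."""
    def __init__(self, n_features=13, n_hidden=128, n_classes=29):
        super().__init__()
        self.lstm = nn.LSTM(n_features, n_hidden, batch_first=True,
                            bidirectional=True)
        self.fc = nn.Linear(2 * n_hidden, n_classes)

    def forward(self, x):                       # x: (batch, time, n_features)
        out, _ = self.lstm(x)
        return self.fc(out).log_softmax(-1)     # (batch, time, n_classes)

model = LstmCtcModel()
ctc_loss = nn.CTCLoss(blank=0)

# The three optimizers compared in the article; in a real comparison each one
# would train its own fresh copy of the model. Hyperparameters are illustrative.
optimizers = {
    "Adagrad":  torch.optim.Adagrad(model.parameters(), lr=1e-2),
    "Adam":     torch.optim.Adam(model.parameters(), lr=1e-3),
    "Momentum": torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9),
}

def train_step(optimizer, features, targets, input_lengths, target_lengths):
    # One CTC training step; returns the CTC loss value for monitoring.
    optimizer.zero_grad()
    log_probs = model(features).transpose(0, 1)  # CTCLoss expects (time, batch, classes)
    loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
    loss.backward()
    optimizer.step()
    return loss.item()

In such a comparison, each optimizer's training curve would then be judged by the label error and CTC loss values, as in the article.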


Keywords:

recurrent neural network, search methods, acoustics, system modeling language



Published
2019-09-26

Citation

Amirgaliyev, Y., Kuanyshbay, K., & Shoiynbek, A. (2019). Comparison of optimization algorithms of a temporal classifier for a speech recognition system. Informatyka, Automatyka, Pomiary w Gospodarce i Ochronie Środowiska, 9(3), 54–57. https://doi.org/10.35784/iapgos.234

