COMPARISON OF OPTIMIZATION ALGORITHMS OF CONNECTIONIST TEMPORAL CLASSIFIER FOR SPEECH RECOGNITION SYSTEM

Yedilkhan Amirgaliyev; Kuanyshbay Kuanyshbay; Aisultan Shoiynbek

doi:10.35784/iapgos.234

Yedilkhan Amirgaliyev

amir_ed@mail.ru
1 Institute Information and Computational Technologies CS MES RK, Almaty, Kazakhstan, 2 Suleyman Demirel University, Almaty, Kazakhstan (Kazakhstan)
http://orcid.org/0000-0002-6528-0619

Kuanyshbay Kuanyshbay

1 Institute Information and Computational Technologies CS MES RK, Almaty, Kazakhstan, 2 Suleyman Demirel University, Almaty, Kazakhstan (Kazakhstan)
http://orcid.org/0000-0001-5952-8609

Aisultan Shoiynbek

1 Institute Information and Computational Technologies CS MES RK, Almaty, Kazakhstan, 2 Suleyman Demirel University, Almaty, Kazakhstan (Kazakhstan)
http://orcid.org/0000-0002-9328-8300

Abstract

This paper evaluates and compares the performance of three well-known optimization algorithms (Adagrad, Adam, and Momentum) for speeding up the training of the neural network used with the connectionist temporal classification (CTC) algorithm for speech recognition. The network used with CTC is a recurrent neural network, specifically Long Short-Term Memory (LSTM), an effective and widely used model. The data were taken from the VCTK corpus of the University of Edinburgh. The optimization algorithms are evaluated by label error rate and CTC loss.
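As a rough illustration of what is being compared, the sketch below writes out the textbook update rules of the three optimizers and one common definition of label error rate (mean edit distance normalized by target length). It is not the authors' implementation: the function names, the hyperparameter defaults, and the toy objective f(w) = w² are assumptions made for this example only, and the abstract does not state which settings the paper used.

```python
import math

# Illustrative sketch only: names and default hyperparameters are assumptions.

def momentum_step(w, g, state, lr=0.01, beta=0.9):
    """Momentum: accumulate a velocity from past gradients, then move along it."""
    v = [beta * vi + gi for vi, gi in zip(state.get("v", [0.0] * len(w)), g)]
    state["v"] = v
    return [wi - lr * vi for wi, vi in zip(w, v)]

def adagrad_step(w, g, state, lr=0.1, eps=1e-8):
    """Adagrad: scale each parameter's step by its accumulated squared gradients."""
    s = [si + gi * gi for si, gi in zip(state.get("s", [0.0] * len(w)), g)]
    state["s"] = s
    return [wi - lr * gi / (math.sqrt(si) + eps) for wi, gi, si in zip(w, g, s)]

def adam_step(w, g, state, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """Adam: bias-corrected running estimates of the gradient mean and square."""
    t = state.get("t", 0) + 1
    m = [b1 * mi + (1 - b1) * gi for mi, gi in zip(state.get("m", [0.0] * len(w)), g)]
    v = [b2 * vi + (1 - b2) * gi * gi for vi, gi in zip(state.get("v", [0.0] * len(w)), g)]
    state.update(t=t, m=m, v=v)
    m_hat = [mi / (1 - b1 ** t) for mi in m]
    v_hat = [vi / (1 - b2 ** t) for vi in v]
    return [wi - lr * mh / (math.sqrt(vh) + eps) for wi, mh, vh in zip(w, m_hat, v_hat)]

def edit_distance(a, b):
    """Levenshtein distance between two label sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]

def label_error_rate(predictions, targets):
    """Mean edit distance per utterance, normalized by the target length."""
    rates = [edit_distance(p, t) / len(t) for p, t in zip(predictions, targets)]
    return sum(rates) / len(rates)

def minimize(step_fn, steps=200, w0=1.0):
    """Run one optimizer on the toy objective f(w) = w**2 (gradient 2w)."""
    w, state = [w0], {}
    for _ in range(steps):
        w = step_fn(w, [2.0 * w[0]], state)
    return w[0]

if __name__ == "__main__":
    for fn in (momentum_step, adagrad_step, adam_step):
        print(fn.__name__, minimize(fn))
    print("LER:", label_error_rate(["cat", "dog"], ["cut", "dog"]))
```

In a real CTC setup the gradient would come from the CTC loss of the LSTM outputs rather than from a toy quadratic; the update rules themselves stay the same.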

Keywords:

recurrent neural network, search methods, acoustic, systems modeling language

Published

2019-09-26

How to cite

Amirgaliyev, Y., Kuanyshbay, K., & Shoiynbek, A. (2019). COMPARISON OF OPTIMIZATION ALGORITHMS OF CONNECTIONIST TEMPORAL CLASSIFIER FOR SPEECH RECOGNITION SYSTEM. Informatyka, Automatyka, Pomiary W Gospodarce I Ochronie Środowiska, 9(3), 54–57. https://doi.org/10.35784/iapgos.234

License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Lublin University of Technology Publishing House
