Comparison of LeNet-5, AlexNet and GoogLeNet models in handwriting recognition
Bartosz Michalski
Department of Computer Science, Lublin University of Technology (Poland)
Małgorzata Plechawska-Wójcik
m.plechawska@pollub.pl
Department of Computer Science, Lublin University of Technology (Poland)
Abstract
The aim of the study was to compare the accuracy of handwriting recognition and the time needed to classify data from the test sets. Three convolutional neural network architectures were used: LeNet-5, AlexNet and GoogLeNet. The research was carried out on two image databases: MNIST (handwritten digits) and EMNIST (handwritten letters). The tests showed that GoogLeNet achieved the highest accuracy and LeNet-5 the lowest. However, LeNet-5 needed the least time to complete the task and GoogLeNet the most. The results indicate that increasing the complexity of the model improves classification accuracy but significantly increases the demand for computing resources.
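The two quantities compared in the study, test-set accuracy and classification time, can be measured with a small harness like the one below. This is only an illustrative sketch and not the authors' code: the `evaluate` function, the `toy_classifier` stand-in and the toy data are all assumptions introduced here, and a real experiment would pass a trained LeNet-5, AlexNet or GoogLeNet model and the MNIST or EMNIST test split instead.

```python
import time
from typing import Callable, Sequence, Tuple


def evaluate(classify: Callable[[Sequence[float]], int],
             samples: Sequence[Sequence[float]],
             labels: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, elapsed_seconds) for one pass over a test set."""
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    if not samples:
        raise ValueError("test set is empty")

    # Time only the classification itself, not the scoring.
    start = time.perf_counter()
    predictions = [classify(x) for x in samples]
    elapsed = time.perf_counter() - start

    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels), elapsed


def toy_classifier(x: Sequence[float]) -> int:
    """Hypothetical stand-in for a trained network: index of the largest value."""
    return max(range(len(x)), key=lambda i: x[i])


# Toy data (assumption): four two-class samples, one of them mislabelled on purpose.
toy_samples = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
toy_labels = [1, 0, 0, 0]

accuracy, seconds = evaluate(toy_classifier, toy_samples, toy_labels)
print(f"accuracy={accuracy:.2f}, time={seconds:.6f}s")
```

Calling `evaluate` once per model on the same test set yields comparable accuracy and time figures; timing only the prediction loop keeps the bookkeeping cost out of the measurement.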
Keywords:
convolutional neural networks; handwriting classification
License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.