Comparison of an effectiveness of artificial neural networks for various activation functions
Vol. 26 (2023), pp. 7-12
Daniel Florek, Marek Andrzej Miłosz
Abstract
Activation functions play an important role in artificial neural networks (ANNs) because they introduce non-linearity into the transformations performed by the models. Thanks to the recent surge of interest in ANNs, new improvements to activation functions keep emerging. The paper presents the results of research on the effectiveness of ANNs with the ReLU, Leaky ReLU, ELU, and Swish activation functions. Four different datasets and three different network architectures were used. The results show that the Leaky ReLU, ELU, and Swish functions, which are designed to alleviate the vanishing-gradient and dead-neuron problems, perform better in deep and more complex architectures, but at the cost of prediction speed. The Swish function appears to speed up the training process considerably, but none of the three aforementioned functions comes out ahead in accuracy on all of the datasets used.
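The four activation functions compared in the paper have simple closed forms; as a minimal illustrative sketch (using NumPy rather than the authors' TensorFlow/Keras setup, and with default slope/scale parameters alpha and beta chosen here for illustration), they can be written as:

```python
import numpy as np

def relu(x):
    # ReLU: zero for negative inputs, identity for positive ones
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Leaky ReLU: a small slope alpha for negative inputs
    # keeps gradients non-zero ("dead neuron" mitigation)
    return np.where(x > 0, x, alpha * x)

def elu(x, alpha=1.0):
    # ELU: smooth exponential saturation toward -alpha for negative inputs
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def swish(x, beta=1.0):
    # Swish: x * sigmoid(beta * x), a smooth, non-monotonic function
    return x / (1.0 + np.exp(-beta * x))
```

Unlike ReLU, the latter three functions pass a (small or smoothly saturating) signal for negative inputs, which is what the paper links to their better behaviour in deeper architectures.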
Article Details
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Daniel Florek, Lublin University of Technology
inż. Daniel Florek
Marek Andrzej Miłosz, Lublin University of Technology
dr inż. Marek Andrzej Miłosz
