Analysis of the increase in model forecasting accuracy after data normalization
Issue Vol. 16 No. 2 (2026)
Vladyslav Pylypenko, Vladyslava Skidan, Antonina Volivach, pp. 102-106
Abstract
In machine learning, data preprocessing is a crucial step that significantly affects the performance of classification models and the accuracy of their predictions. This study investigates the impact of data normalization on the prediction accuracy of two common machine learning algorithms: logistic regression and the support vector machine (SVM). Normalization methods, including logarithmic transformation and feature scaling (such as RobustScaler and MinMaxScaler), were applied, and their effect on accuracy, balanced accuracy, F1 score, and area under the receiver operating characteristic curve (AUC) was evaluated. Normalization improved the performance of both models. Logistic regression showed a moderate increase in all key metrics: accuracy improved by 14.6%, balanced accuracy by 12.3%, F1 score by 11.5%, and AUC by 3.9%. SVM improved far more: accuracy increased by 42.5%, balanced accuracy by 81.3%, F1 score by 21.3%, and AUC by 117.9%. These results highlight the importance of preprocessing for models that are sensitive to feature scaling, such as SVM. The ROC curves confirmed the improvement in classification performance: the AUC of SVM rose from 0.407 to 0.887 after normalization, indicating a shift from a low-performing model to one that discriminates between the positive and negative classes far more accurately. This improvement demonstrates the effectiveness of normalization in improving the generalization and robustness of the model, especially on imbalanced datasets. Data normalization is therefore an important step before model training: it improves accuracy, provides better adaptability and stability in real-world applications, and supports reliable predictions.
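The percentage gains quoted above can be read as relative changes. The sketch below checks this for the one case where the abstract gives both endpoints, the SVM AUC (0.407 before, 0.887 after). The helper name `relative_change_percent` is an illustrative assumption, and the abstract does not state whether the other percentages were computed the same way.

```python
def relative_change_percent(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    if before == 0:
        raise ValueError("relative change is undefined when the baseline is 0")
    return (after - before) / before * 100.0


# AUC values for SVM as reported in the abstract.
svm_auc_before = 0.407
svm_auc_after = 0.887

svm_auc_gain = relative_change_percent(svm_auc_before, svm_auc_after)
print(f"Relative AUC gain for SVM: {svm_auc_gain:.1f}%")
print(f"Absolute AUC gain for SVM: {svm_auc_after - svm_auc_before:.3f}")
```

The relative gain comes out at about 117.9%, matching the reported figure, while the absolute gain is 0.48 AUC points. An AUC below 0.5, as before normalization, means the ranking was worse than chance.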
The results of the study highlight the importance of data preprocessing in machine learning workflows, especially for algorithms that depend on feature scale or rely on geometric methods, such as SVM and logistic regression.
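To make the three transformations named in the abstract concrete, here is a minimal plain-Python sketch of a log(1 + x) transform, min-max scaling, and robust scaling (median and interquartile range). It is not the authors' pipeline: the function names and the sample feature are hypothetical, and the functions only approximate what scikit-learn's MinMaxScaler and RobustScaler compute with default settings. In a real experiment the scalers would normally be fitted on the training split only.

```python
import math
import statistics


def log_transform(values):
    """log(1 + x) transform; assumes every value is greater than -1."""
    return [math.log1p(v) for v in values]


def min_max_scale(values):
    """Map values linearly onto [0, 1] (a constant column maps to 0.0)."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]


def robust_scale(values):
    """Centre on the median and divide by the interquartile range."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return [(v - median) / iqr if iqr else 0.0 for v in values]


# Hypothetical skewed feature with one extreme value (not data from the study).
feature = [1.0, 2.0, 3.0, 4.0, 100.0]

print("min-max:      ", min_max_scale(feature))
print("robust:       ", robust_scale(feature))
print("log + min-max:", min_max_scale(log_transform(feature)))
```

With this sample, min-max scaling squeezes the four ordinary values into the bottom 3% of the range because of the single extreme value. Robust scaling keeps them within one unit of zero, and the log transform reduces the skew before scaling. Distance- and margin-based models such as SVM are sensitive to exactly this kind of difference in feature spread.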