Analysis of the increase in model forecasting accuracy after data normalization

Vladyslav Pylypenko

software.proger@gmail.com

https://orcid.org/0000-0002-2761-4817

Vladyslava Skidan

skidan.vv@knutd.edu.ua

Antonina Volivach

volivach.ap@knutd.edu.ua

https://orcid.org/0000-0002-7119-7774

Abstract

In modern machine learning, data preprocessing is a crucial step that significantly affects the performance of classification models and the accuracy of their predictions. This research investigates the impact of data normalization on the prediction accuracy of two common machine learning algorithms: logistic regression and the support vector machine (SVM). Data normalization methods, including logarithmic transformation and scaling methods (such as RobustScaler and MinMaxScaler), were applied, and their effect was evaluated on accuracy, balanced accuracy, F1 score, and area under the receiver operating characteristic curve (AUC). Normalization improved the performance of both models. Logistic regression showed a moderate increase in all key metrics: accuracy improved by 14.6%, balanced accuracy by 12.3%, F1 score by 11.5%, and AUC by 3.9%. SVM showed much larger improvements: accuracy increased by 42.5%, balanced accuracy by 81.3%, F1 score by 21.3%, and AUC by 117.9%. These results highlight the importance of preprocessing for models that are sensitive to feature scaling, such as SVM. The ROC curves confirmed the improvement in classification performance: the AUC of SVM rose from 0.407 to 0.887 after normalization, a shift from a low-performing model to one that discriminates between the positive and negative classes far more accurately. This improvement demonstrates the effectiveness of normalization in improving the generalization and robustness of the model, especially on imbalanced datasets. Data normalization is therefore an important step before model training: it improves the accuracy of the model, provides better adaptability and stability in real-world applications, and supports reliable and accurate predictions.
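To make the reported figures easier to check, the following minimal Python sketch shows how accuracy, balanced accuracy, and F1 score can be computed from binary predictions, and how a relative improvement is obtained from a before/after pair. The function names `binary_metrics` and `relative_change_pct` and the toy labels are illustrative assumptions, not the authors' code or data. The only figures taken from the abstract are the SVM AUC values 0.407 and 0.887, whose relative change is consistent with the stated 117.9%.

```python
def relative_change_pct(before: float, after: float) -> float:
    """Relative change from `before` to `after`, expressed in percent."""
    return (after - before) / before * 100.0


def binary_metrics(y_true, y_pred):
    """Accuracy, balanced accuracy and F1 for 0/1 labels (positive class = 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0        # true positive rate
    specificity = tn / (tn + fp) if tn + fp else 0.0   # true negative rate
    precision = tp / (tp + fp) if tp + fp else 0.0

    # Balanced accuracy averages the per-class recalls, so the majority
    # class cannot dominate the score on an imbalanced dataset.
    balanced_accuracy = (recall + specificity) / 2
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "f1": f1,
    }


# Hypothetical imbalanced toy labels (2 positives, 8 negatives).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(binary_metrics(y_true, y_pred))

# AUC values reported in the abstract for SVM before and after normalization.
print(round(relative_change_pct(0.407, 0.887), 1))  # 117.9
```

On the toy labels, plain accuracy is 0.9 while balanced accuracy is only 0.75, which illustrates why several metrics are reported side by side when the classes are imbalanced.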
The results of the study underline the importance of data preprocessing in machine learning pipelines, especially for algorithms whose behaviour depends on feature scale or on geometric distances, such as SVM and logistic regression.
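The dependence on feature scale can be illustrated with a small, self-contained Python sketch. It implements the three transformations named in the abstract from their textbook definitions (logarithm, min-max scaling, and median/IQR scaling in the spirit of RobustScaler) and measures how much one feature contributes to a Euclidean distance before and after scaling. All function names and the toy `income` and `ratio` columns are hypothetical; the study's actual pipeline and data are not reproduced here.

```python
import math
import statistics


def log_transform(values):
    """log(1 + x) transform; assumes non-negative inputs."""
    return [math.log1p(v) for v in values]


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]


def robust_scale(values):
    """Centre on the median and divide by the interquartile range."""
    median = statistics.median(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return [(v - median) / iqr if iqr else 0.0 for v in values]


def second_feature_share(a, b):
    """Fraction of the squared Euclidean distance between the 2-D points
    a and b that comes from the second coordinate."""
    d1 = (a[0] - b[0]) ** 2
    d2 = (a[1] - b[1]) ** 2
    return d2 / (d1 + d2)


# Hypothetical toy data: one large-scale and one small-scale feature.
income = [20000.0, 40000.0, 60000.0, 80000.0, 100000.0]
ratio = [0.1, 0.9, 0.5, 0.3, 0.7]

raw = list(zip(income, ratio))
scaled = list(zip(min_max_scale(income), min_max_scale(ratio)))

raw_share = second_feature_share(raw[0], raw[1])
scaled_share = second_feature_share(scaled[0], scaled[1])
print(f"raw share: {raw_share:.2e}, scaled share: {scaled_share:.3f}")

# Robust scaling keeps an outlier from compressing the remaining values.
print(robust_scale([1, 2, 3, 4, 100]))
```

Before scaling, the small-scale feature contributes almost nothing to the distance between the first two samples; after min-max scaling it accounts for most of it. A distance-based model such as an SVM with a kernel built on Euclidean distance would therefore effectively ignore that feature on the raw data.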

Keywords:

AI models, ML algorithms, machine learning, data normalization, C#, Python

Article Details

Pylypenko, V., Skidan, V., & Volivach, A. (2026). Analysis of the increase in model forecasting accuracy after data normalization. Informatyka, Automatyka, Pomiary W Gospodarce I Ochronie Środowiska, 16(2), 102–106. https://doi.org/10.35784/iapgos.8053