A COMPARATIVE STUDY ON PERFORMANCE OF BASIC AND ENSEMBLE CLASSIFIERS WITH VARIOUS DATASETS

Archana Gunakala; Afzal Hussain Shahid

doi:10.35784/acs-2023-08

Published: Mar 31, 2023

DOI: https://doi.org/10.35784/acs-2023-08

Authors

Archana Gunakala

archu.gunakala@gmail.com

Research Scholar

https://orcid.org/0000-0002-3375-1893

Afzal Hussain Shahid

syedahshahid@gmail.com

Assistant Professor, Senior (Grade-I)

https://orcid.org/0009-0001-9815-108X

Abstract

Classification plays a critical role in machine learning (ML) systems for processing images, text and high-dimensional data. Predicting class labels from training data is the primary goal of classification. An optimal model for a particular classification problem is chosen on the basis of the model's performance and execution time. This paper compares and analyses the performance of basic and ensemble classifiers using 10-fold cross-validation and discusses their essential concepts, advantages, and disadvantages. Five basic classifiers, namely Naïve Bayes (NB), Multi-layer Perceptron (MLP), Support Vector Machine (SVM), Decision Tree (DT), and Random Forest (RF), together with the ensemble of all five classifiers and a few further combinations, are compared on five University of California Irvine (UCI) ML Repository datasets and a Diabetes Health Indicators dataset from the Kaggle repository. The classifiers are evaluated with Accuracy, Recall, Precision, Area Under Curve (AUC) and F-Score. Experimental results showed that SVM performs best on two of the six datasets (Diabetes Health Indicators and Waveform), RF performs best on the Arrhythmia, Sonar, and Tic-tac-toe datasets, and the best ensemble combination, DT+SVM+RF, performs best on the Ionosphere dataset, with respective accuracies of 72.58%, 90.38%, 81.63%, 73.59%, 94.78% and 94.01%. The proposed ensemble combinations outperformed the conventional models on a few datasets.
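As an illustration of the evaluation setup the abstract describes, the following is a minimal sketch in plain Python of three building blocks: splitting sample indices into 10 folds, combining the predictions of several classifiers, and computing Accuracy, Precision, Recall and F-Score. The function names are hypothetical, and hard (majority) voting is an assumption made for the example; the abstract does not state how the ensemble combines its members.

```python
from collections import Counter


def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k contiguous, near-equal folds."""
    # The first (n_samples % k) folds get one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, test_idx
        start = stop


def majority_vote(predictions):
    """Combine per-classifier label lists by hard voting (assumed scheme).

    Ties are resolved in favour of the label seen first among the votes.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F-score for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f_score": 2 * precision * recall / denom if denom else 0.0,
    }
```

In a full experiment, each fold's training indices would be used to fit every classifier, the test indices to collect predictions, and the metrics would be averaged over the 10 folds. In practice the samples are usually shuffled (and often stratified by class) before splitting, which this sketch omits.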

Keywords:

Classification, Naïve Bayes, Neural Network, Support Vector Machine, Decision Tree, Ensemble Learning, Random Forest

References

Gunakala, A., & Shahid, A. H. (2023). A COMPARATIVE STUDY ON PERFORMANCE OF BASIC AND ENSEMBLE CLASSIFIERS WITH VARIOUS DATASETS. Applied Computer Science, 19(1), 107–132. https://doi.org/10.35784/acs-2023-08
