UNBALANCED MULTICLASS CLASSIFICATION WITH ADAPTIVE SYNTHETIC MULTINOMIAL NAIVE BAYES APPROACH

Fatkhurokhman Fauzi

fatkhurokhmanf@unimus.ac.id
Universitas Muhammadiyah Semarang, Department of Statistics (Indonesia)
https://orcid.org/0000-0002-8277-8638

Ismatullah


Universitas Muhammadiyah Semarang, Department of Statistics (Indonesia)
http://orcid.org/0009-0005-7472-1761

Indah Manfaati Nur


Universitas Muhammadiyah Semarang, Department of Statistics (Indonesia)
http://orcid.org/0000-0002-1017-7323

Abstract

Opinions on rising fuel prices need to be examined and analysed, because public opinion is closely tied to Indonesia's future public policy. Twitter is one of the media people use to express their opinions. This study uses sentiment analysis to examine this phenomenon. Opinions are divided into three categories: positive, neutral, and negative. The methods used in this study are Adaptive Synthetic Multinomial Naive Bayes, Adaptive Synthetic k-nearest neighbours, and Adaptive Synthetic Random Forest. The Adaptive Synthetic method is used to handle imbalanced data. The data used in this study are public opinions grouped by province in Indonesia. The results show that negative sentiment dominates in all provinces of Indonesia. There is a relationship between negative sentiment and education level, internet use, and the human development index. Adaptive Synthetic Multinomial Naive Bayes performed better than the other methods, with an accuracy of 0.882. The highest accuracy of the Adaptive Synthetic Multinomial Naive Bayes method is 0.990, in Papua Barat province.
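The combination named in the abstract, adaptive synthetic oversampling followed by a multinomial naive Bayes classifier, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the five-word vocabulary, the toy count vectors, and the function names `adasyn`, `fit_mnb`, and `predict_mnb` are all hypothetical. The oversampler is a simplified, one-vs-rest variant of the adaptive synthetic idea: it draws seed points in proportion to how many of their nearest neighbours belong to other classes, and it interpolates towards any other minority point rather than only towards the k nearest minority neighbours.

```python
import math
import random
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def adasyn(X, y, minority, k=3, seed=0):
    """Simplified adaptive synthetic oversampling of one class (one-vs-rest)."""
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    G = max(Counter(y).values()) - len(min_idx)  # rows needed to match the largest class
    if G <= 0 or len(min_idx) < 2:
        return [], []
    # r_i: share of other-class points among the k nearest neighbours of x_i
    ratios = []
    for i in min_idx:
        nbrs = sorted((j for j in range(len(X)) if j != i),
                      key=lambda j: euclidean(X[i], X[j]))[:k]
        ratios.append(sum(y[j] != minority for j in nbrs) / k)
    total = sum(ratios)
    # Fall back to uniform weights if no point has other-class neighbours
    weights = ([r / total for r in ratios] if total > 0
               else [1 / len(min_idx)] * len(min_idx))
    new_X = []
    for _ in range(G):
        i = rng.choices(min_idx, weights=weights)[0]     # harder points are drawn more often
        j = rng.choice([m for m in min_idx if m != i])   # another point of the same class
        lam = rng.random()
        new_X.append([a + lam * (b - a) for a, b in zip(X[i], X[j])])
    return new_X, [minority] * G

def fit_mnb(X, y, alpha=1.0):
    """Multinomial naive Bayes with Laplace smoothing; returns log parameters per class."""
    n_features = len(X[0])
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        counts = [sum(col) for col in zip(*rows)]
        total = sum(counts)
        log_prior = math.log(len(rows) / len(X))
        log_lik = [math.log((n + alpha) / (total + alpha * n_features)) for n in counts]
        model[c] = (log_prior, log_lik)
    return model

def predict_mnb(model, x):
    return max(model, key=lambda c: model[c][0]
               + sum(v * w for v, w in zip(x, model[c][1])))

# Hypothetical vocabulary: ["good", "cheap", "okay", "expensive", "bad"]
X = [[0, 0, 0, 2, 1], [0, 0, 0, 1, 2], [0, 0, 1, 1, 1],
     [0, 0, 0, 3, 0], [0, 0, 0, 1, 1], [0, 0, 0, 0, 2],
     [0, 0, 2, 0, 0], [0, 1, 2, 0, 0],
     [2, 1, 0, 0, 0], [1, 2, 0, 0, 0]]
y = ["negative"] * 6 + ["neutral"] * 2 + ["positive"] * 2

# Oversample every class that is smaller than the largest one
X_bal, y_bal = list(X), list(y)
for label in sorted(set(y)):
    new_X, new_y = adasyn(X, y, label)
    X_bal += new_X
    y_bal += new_y

model = fit_mnb(X_bal, y_bal)
print(Counter(y_bal))
print(predict_mnb(model, [2, 0, 0, 0, 0]))
```

Balancing the classes before fitting keeps the class priors from being dominated by the negative class, which is the motivation the abstract gives for the Adaptive Synthetic step. In practice the feature vectors would come from a text vectoriser rather than hand-written counts.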


Keywords:

adaptive synthetic, classification, imbalanced data, accuracy


Published
2023-09-30


Fauzi, F., Ismatullah, & Manfaati Nur, I. (2023). UNBALANCED MULTICLASS CLASSIFICATION WITH ADAPTIVE SYNTHETIC MULTINOMIAL NAIVE BAYES APPROACH. Informatyka, Automatyka, Pomiary W Gospodarce I Ochronie Środowiska, 13(3), 64–70. https://doi.org/10.35784/iapgos.3740

