HOW MACHINE LEARNING ALGORITHMS ARE USED IN METEOROLOGICAL DATA CLASSIFICATION: A COMPARATIVE APPROACH BETWEEN DT, LMT, M5-MT, GRADIENT BOOSTING AND GWLM-NARX MODELS
Sheikh Amir FAYAZ
skh.amir88@gmail.com
Department of Computer Sciences, University of Kashmir, J&K (India)
Majid ZAMAN
Directorate of IT & SS, University of Kashmir, J&K (India)
Muheet Ahmed BUTT
Department of Computer Sciences, University of Kashmir, J&K (India)
Sameer KAUL
Department of Computer Sciences, University of Kashmir, J&K (India)
Abstract
Rainfall prediction is one of the most challenging tasks researchers have faced over the years. Many machine learning and AI-based algorithms have been applied to different datasets to improve prediction, but no single solution predicts rainfall perfectly, and accurate prediction remains an open question. We present a machine learning-based comparative evaluation of rainfall models for Kashmir province. Both local geographic features and the time horizon influence weather forecasting. The predictive models investigated include decision trees (DT), Logistic Model Trees (LMT), M5 model trees (M5-MT), Gradient Boosting, and GWLM-NARX, among other techniques. The models used weather predictors measured at three major meteorological stations in the Kashmir area of the UT of J&K, India. We compared the models on accuracy, kappa, interpretability, and other statistics, as well as on the significance of the predictors used. On the original dataset, the DT model delivers an accuracy of 80.12 percent, while the LMT and Gradient Boosting models reach 87.23 percent and 87.51 percent, respectively. Furthermore, when continuous data was used in the M5-MT and GWLM-NARX models, the NARX model performed better, with mean squared error (MSE) and regression value (R) of 3.12 percent and 0.9899 in training, 0.144 percent and 0.9936 in validation, and 0.311 percent and 0.9988 in testing.
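The evaluation statistics named in the abstract (accuracy and kappa for the classifiers, MSE and R for the continuous models) can be illustrated with a minimal Python sketch. The toy labels and rainfall values below are made up for illustration and are not taken from the paper's dataset. The function names are hypothetical, and R is assumed here to be the Pearson correlation between observed and predicted values, which the abstract does not state explicitly.

```python
from math import sqrt


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true class labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(y_true)
    labels = set(y_true) | set(y_pred)
    p_observed = accuracy(y_true, y_pred)
    # Chance agreement from the marginal class frequencies.
    p_expected = sum(
        (y_true.count(c) / n) * (y_pred.count(c) / n) for c in labels
    )
    return (p_observed - p_expected) / (1 - p_expected)


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient (assumed meaning of R)."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    std_t = sqrt(sum((t - mean_t) ** 2 for t in y_true))
    std_p = sqrt(sum((p - mean_p) ** 2 for p in y_pred))
    return cov / (std_t * std_p)


# Toy classification labels (illustrative only, not from the paper).
labels_true = ["rain", "rain", "no_rain", "no_rain",
               "rain", "no_rain", "no_rain", "rain"]
labels_pred = ["rain", "no_rain", "no_rain", "no_rain",
               "rain", "rain", "no_rain", "rain"]

# Toy continuous rainfall values (illustrative only, not from the paper).
rain_true = [0.0, 2.0, 4.0, 6.0]
rain_pred = [0.5, 1.5, 4.5, 5.5]

print("accuracy:", accuracy(labels_true, labels_pred))
print("kappa:   ", cohen_kappa(labels_true, labels_pred))
print("MSE:     ", mse(rain_true, rain_pred))
print("R:       ", pearson_r(rain_true, rain_pred))
```

Kappa is lower than accuracy on the toy labels because it subtracts the agreement a classifier would reach by chance given the class frequencies, which is why the two are commonly reported together for rain/no-rain classification.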
Keywords:
Meteorological data, M5 model tree, Linear model functions, Gradient boosting, Logistic Model trees
License
This work is licensed under a Creative Commons Attribution 4.0 International License.
All articles published in Applied Computer Science are open-access and distributed under the terms of the Creative Commons Attribution 4.0 International License.