HOW MACHINE LEARNING ALGORITHMS ARE USED IN METEOROLOGICAL DATA CLASSIFICATION: A COMPARATIVE APPROACH BETWEEN DT, LMT, M5-MT, GRADIENT BOOSTING AND GWLM-NARX MODELS

Sheikh Amir FAYAZ

skh.amir88@gmail.com
Department of Computer Sciences, University of Kashmir, J&K (India)

Majid ZAMAN


Directorate of IT & SS, University of Kashmir, J&K (India)

Muheet Ahmed BUTT


Department of Computer Sciences, University of Kashmir, J&K (India)

Sameer KAUL


Department of Computer Sciences, University of Kashmir, J&K (India)

Abstract

Rainfall prediction has been one of the most challenging tasks facing researchers for years. Many machine learning and AI-based algorithms have been applied to different datasets to improve prediction, but no single solution predicts rainfall perfectly, and accurate prediction remains an open question. We present a machine learning-based comparative evaluation of rainfall models for Kashmir province. Both local geographic features and the time horizon influence weather forecasting. The predictive models investigated include decision trees (DT), logistic model trees (LMT), M5 model trees (M5-MT), gradient boosting, and GWLM-NARX, among other techniques. The models used weather predictors measured at three major meteorological stations in the Kashmir area of the UT of J&K, India. We compared the models on accuracy, kappa, interpretability, and other statistics, as well as on the significance of the predictors used. On the original dataset, the DT model delivers an accuracy of 80.12 percent, while the LMT and gradient boosting models achieve 87.23 percent and 87.51 percent, respectively. When continuous data was used in the M5-MT and GWLM-NARX models, the NARX model performed better, with mean squared error (MSE) and regression value (R) of 3.12 percent and 0.9899 in training, 0.144 percent and 0.9936 in validation, and 0.311 percent and 0.9988 in testing.
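The comparison criteria named in the abstract (accuracy and kappa for the classifiers, MSE and R for the continuous models) can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names and the toy data are hypothetical, and R is assumed here to be the Pearson correlation between observed and predicted values, which the abstract does not state.

```python
import math
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of class labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement.

    Undefined (division by zero) when chance agreement equals 1,
    e.g. when both sequences contain a single identical class.
    """
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    # Chance agreement: product of the marginal class frequencies.
    p_chance = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_observed - p_chance) / (1 - p_chance)


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(y_true, y_pred):
    """Pearson correlation (assumed meaning of the 'regression value' R)."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Hypothetical toy data, not taken from the Kashmir dataset.
labels_true = ["rain", "rain", "dry", "dry", "rain", "dry", "dry", "dry"]
labels_pred = ["rain", "dry", "dry", "dry", "rain", "dry", "rain", "dry"]
rain_observed = [0.0, 2.0, 4.0, 6.0]
rain_predicted = [1.0, 2.0, 5.0, 6.0]

print("accuracy:", accuracy(labels_true, labels_pred))
print("kappa:", cohen_kappa(labels_true, labels_pred))
print("MSE:", mse(rain_observed, rain_predicted))
print("R:", pearson_r(rain_observed, rain_predicted))
```

Kappa is reported alongside accuracy because it discounts the agreement a classifier would reach by chance, which matters when one class (for example, dry days) dominates the data.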


Keywords:

Meteorological data, M5 model tree, Linear model functions, Gradient boosting, Logistic Model trees


Published
2022-10-01

Cited by

FAYAZ, S. A., ZAMAN, M., BUTT, M. A., & KAUL, S. (2022). HOW MACHINE LEARNING ALGORITHMS ARE USED IN METEOROLOGICAL DATA CLASSIFICATION: A COMPARATIVE APPROACH BETWEEN DT, LMT, M5-MT, GRADIENT BOOSTING AND GWLM-NARX MODELS. Applied Computer Science, 18(4), 16–27. https://doi.org/10.35784/acs-2022-26



License

This work is licensed under a Creative Commons Attribution 4.0 International License.

All articles published in Applied Computer Science are open-access and distributed under the terms of the Creative Commons Attribution 4.0 International License.

