ARDP: SIMPLIFIED MACHINE LEARNING PREDICTOR FOR MISSING UNIDIMENSIONAL ACADEMIC RESULTS DATASET

Olufemi Folorunso

olufemifolorunso@gmail.com
Elizade University, Ilara Mokin (Nigeria)
https://orcid.org/0000-0002-0242-9316

Olufemi Akinyede


Federal University of Technology, Akure (Nigeria)

Kehinde Agbele


Elizade University, Ilara Mokin (Nigeria)

Abstract

We present a machine learning predictor for academic results datasets (PARD) that estimates missing academic results using chi-squared expected calculation, positional clustering, progressive approximation of relative residuals, and positional averages of the data in a sampled population. Academic results datasets are data originating from the results repositories of academic institutions, and PARD is designed specifically for predicting missing entries in them. Since the essence of data mining is to elicit useful information and gain knowledge-driven insights into datasets, PARD places the data explorer in an advantageous position to do so. PARD promises to solve missing academic results problems more quickly than the approaches currently reported in the literature. The predictor was implemented in Python, and the results show an average prediction accuracy of up to 93.6 percent across the sampled cases. These results suggest that PARD tends toward greater precision in predicting missing academic results in university datasets.
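The abstract names a chi-squared expected calculation as one ingredient but does not spell out how it is applied, so the following is only a minimal sketch under stated assumptions, not the authors' PARD implementation. It assumes a small students-by-courses score table with exactly one missing cell and fills that cell with the value that equals its own chi-squared expected value (row total times column total divided by grand total). The function name `expected_missing_score` and the sample table are hypothetical.

```python
from typing import List, Optional

Table = List[List[Optional[float]]]


def expected_missing_score(scores: Table, row: int, col: int) -> float:
    """Estimate one missing cell from the chi-squared expected-value formula.

    Assumption (not from the paper): exactly one cell, scores[row][col],
    is missing. If x is the missing value, the expected value of that cell
    is (r + x) * (c + x) / (n + x), where r and c are the observed sums of
    its row and column and n is the sum of all observed cells. Requiring x
    to equal its own expected value gives x = r * c / (n - r - c), and
    n - r - c is the sum of the cells outside that row and column.
    """
    if scores[row][col] is not None:
        raise ValueError("the target cell is not missing")
    for i, line in enumerate(scores):
        for j, value in enumerate(line):
            if value is None and (i, j) != (row, col):
                raise ValueError("this sketch handles a single missing cell")

    row_sum = sum(v for j, v in enumerate(scores[row]) if j != col)
    col_sum = sum(line[col] for i, line in enumerate(scores) if i != row)
    rest = sum(
        v
        for i, line in enumerate(scores) if i != row
        for j, v in enumerate(line) if j != col
    )
    if rest == 0:
        raise ValueError("not enough observed data to form an estimate")
    return row_sum * col_sum / rest


# Hypothetical scores: rows are students, columns are courses.
table: Table = [
    [60, 70, 80],
    [30, 35, None],
    [90, 105, 120],
]
print(expected_missing_score(table, 1, 2))  # 40.0
```

In this illustrative table every row is proportional to the others, so the estimate recovers the value that keeps the proportions intact. Real results data would not be this regular, which is presumably where the other components named in the abstract come in.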


Keywords:

missingness, predictor variable, training dataset, heuristics, unidimensionality

Published
2023-03-31

Folorunso, O., Akinyede, O., & Agbele, K. (2023). ARDP: SIMPLIFIED MACHINE LEARNING PREDICTOR FOR MISSING UNIDIMENSIONAL ACADEMIC RESULTS DATASET. Applied Computer Science, 19(1), 47–63. https://doi.org/10.35784/acs-2023-04

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

All articles published in Applied Computer Science are open-access and distributed under the terms of the Creative Commons Attribution 4.0 International License.