ARDP: SIMPLIFIED MACHINE LEARNING PREDICTOR FOR MISSING UNIDIMENSIONAL ACADEMIC RESULTS DATASET
Olufemi Folorunso
olufemifolorunso@gmail.com
Elizade University, Ilara Mokin (Nigeria)
https://orcid.org/0000-0002-0242-9316
Olufemi Akinyede
Federal University of Technology, Akure (Nigeria)
Kehinde Agbele
Elizade University, Ilara Mokin (Nigeria)
Abstract
We present PARD, a machine learning predictor for academic results datasets that estimates missing academic results using chi-squared expected calculation, positional clustering, progressive approximation of relative residuals, and positional averages of the data in a sampled population. Academic results datasets are data originating from the results repositories of academic institutions, and PARD is designed specifically for predicting missing values in them. Since the essence of data mining is to elicit useful information and gain knowledge-driven insights from datasets, PARD gives the data explorer this advantage. PARD promises to resolve missing academic results more quickly than approaches currently reported in the literature. The predictor was implemented in Python, and the results show an average prediction accuracy of up to 93.6 percent on the sampled cases. These results suggest that PARD tends toward greater precision in predicting missing academic results in university datasets.
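The abstract names a chi-squared expected calculation as one ingredient of the predictor but does not give its formula. As a rough illustration only, and not the authors' implementation, the sketch below estimates one missing score in a student-by-course table from the usual expected-value form (row total times column total, divided by the grand total). It assumes the table layout, the function name `expected_score`, and the choice to compute each total only over cells outside the missing cell's row or column as appropriate; all of these are hypothetical.

```python
from typing import List, Optional

# Rows are students, columns are courses; None marks a missing score.
Table = List[List[Optional[float]]]


def expected_score(table: Table, i: int, j: int) -> float:
    """Estimate the missing cell (i, j) as R * C / N, where
    R = observed scores of student i in the other courses,
    C = observed scores of the other students in course j,
    N = observed scores of the other students in the other courses.
    Assumption: (i, j) is the only missing cell; other gaps are skipped.
    """
    r = sum(v for k, v in enumerate(table[i]) if k != j and v is not None)
    c = sum(row[j] for s, row in enumerate(table) if s != i and row[j] is not None)
    n = sum(
        v
        for s, row in enumerate(table) if s != i
        for k, v in enumerate(row) if k != j and v is not None
    )
    if n == 0:
        raise ValueError("not enough observed scores to form an estimate")
    return r * c / n


# Hypothetical scores: student 1 has no recorded result for course 2.
scores: Table = [
    [60.0, 70.0, 50.0],
    [48.0, 56.0, None],
    [72.0, 84.0, 60.0],
]
print(expected_score(scores, 1, 2))  # 40.0
```

In this toy table every row is a constant multiple of the first, so the estimate recovers the proportional value exactly; real results data would not be this regular, which is presumably where the clustering and residual-approximation steps named in the abstract come in.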
Keywords:
missingness, predictor variable, training dataset, heuristics, unidimensionality
License
This work is licensed under a Creative Commons Attribution 4.0 International License.
All articles published in Applied Computer Science are open-access and distributed under the terms of the Creative Commons Attribution 4.0 International License.