DEVELOPING A PREDICTIVE MODEL FOR CLASSIFYING STUDENTS’ ACADEMIC PERFORMANCE

SOURCE:

Faculty: Physical Sciences
Department: Computer Science

CONTRIBUTORS:

Ngene C. Chidimma
Inyiama H. C.

ABSTRACT:

Educational Data Mining is an active research area that mines data sets to answer educational research questions and shed light on the learning process. It is a new trend in the data mining and Knowledge Discovery in Databases (KDD) field that focuses on mining useful patterns and discovering useful knowledge from educational information systems. One application area of educational data mining is the analysis and prediction of students’ academic performance. The vision of any higher educational institution is to improve the quality of managerial decisions and to impart quality education. Accurate prediction of students’ success in higher learning institutions is one way to reach the highest level of quality in higher education systems. Identifying low-performing students at the beginning of the learning process and offering them academic advice will enhance their academic performance and further improve overall educational quality. Measuring students’ academic performance is challenging because it hinges on diverse factors, both personal and academic. This research explores multiple factors theoretically assumed to affect students’ performance in higher education and seeks a qualitative model that best classifies and predicts postgraduate students’ performance based on those factors. Two existing techniques, K-Nearest Neighbour (KNN) and Naïve Bayes, were used to predict students’ performance but were limited in prediction accuracy, so a hybrid model, K-Bay, which combines KNN and Naïve Bayes, was proposed. The dataset used for the analysis includes student attributes such as academic grades, demographic attributes, work-related attributes, social attributes and school-related attributes. Questionnaires were used to collect data from the students.
Educational data mining techniques, the Object Oriented Analysis and Design Methodology (OOADM) and the Knowledge Discovery in Databases (KDD) process were adopted. In building the classification model, the algorithms were applied to a student data set of 499 instances with 33 attributes. The factors (attributes) affecting students’ performance were analysed and ranked using Correlation-based Feature Selection (CFS), from which five highly influential factors were selected. The results were evaluated and compared for accuracy of prediction. Using all the attributes, the system achieved an accuracy of 95.92%, against 69.39% and 71.43% for the single classifiers Naïve Bayes and KNN respectively. Execution time for the new model was 0.134 seconds, while KNN and Naïve Bayes took 0.357 and 0.18 seconds respectively. Using only the highly influential attributes, the system achieved an accuracy of 99%, against 75.51% and 59.18% for Naïve Bayes and KNN respectively. Hence the K-Bay prediction model produced a more reliable and accurate system for predicting students’ academic performance.
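
The abstract does not state how K-Bay combines the two classifiers. As an illustration only, the sketch below shows one common way to hybridise KNN and Naïve Bayes: select the k nearest training records to a query, then fit a Naïve Bayes classifier on that neighbourhood alone. The combination scheme, the Hamming distance, the Laplace smoothing, the function names and the toy attributes are all assumptions made for this example and are not taken from the study.

```python
from collections import Counter
import math


def hamming(a, b):
    """Number of attribute positions where two records differ."""
    return sum(x != y for x, y in zip(a, b))


def k_nearest(train, query, k):
    """Return the k training rows (features, label) closest to the query."""
    return sorted(train, key=lambda row: hamming(row[0], query))[:k]


def naive_bayes_predict(rows, query, alpha=1.0):
    """Categorical Naive Bayes with Laplace smoothing, fitted on `rows` only."""
    labels = Counter(label for _, label in rows)
    n = len(rows)
    best_label, best_score = None, -math.inf
    for label, count in labels.items():
        # Log prior of the class within the given rows.
        score = math.log(count / n)
        members = [feats for feats, lab in rows if lab == label]
        for i, value in enumerate(query):
            # Distinct values of attribute i, including the query's value.
            values_seen = {feats[i] for feats, _ in rows} | {value}
            matches = sum(1 for feats in members if feats[i] == value)
            score += math.log((matches + alpha) / (count + alpha * len(values_seen)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def hybrid_predict(train, query, k=5):
    """Assumed hybrid: KNN picks a neighbourhood, Naive Bayes classifies within it."""
    neighbours = k_nearest(train, query, k)
    return naive_bayes_predict(neighbours, query)


# Hypothetical toy records: (prior grade, employed, social activity) -> outcome.
train = [
    (("high", "yes", "low"), "pass"),
    (("high", "yes", "high"), "pass"),
    (("high", "no", "low"), "pass"),
    (("low", "no", "high"), "fail"),
    (("low", "no", "low"), "fail"),
    (("low", "yes", "high"), "fail"),
]

if __name__ == "__main__":
    print(hybrid_predict(train, ("high", "yes", "low"), k=3))
    print(hybrid_predict(train, ("low", "no", "high"), k=3))
```

Restricting Naïve Bayes to a local neighbourhood is one way to soften its attribute-independence assumption, which may be part of why a hybrid can outperform either classifier alone; whether K-Bay works this way cannot be determined from the abstract.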