Educ Inf Technol (Dordr). 2023;28(2):1427-1453.
doi: 10.1007/s10639-022-11259-2. Epub 2022 Jul 29.

Predicting students' performance in English and Mathematics using data mining techniques

Muhammad Haziq Bin Roslan et al. Educ Inf Technol (Dordr). 2023.

Abstract

This study attempts to predict secondary school students' performance in English and Mathematics using data mining (DM) techniques. It aims to provide insights into the predictors of students' performance in these subjects, the characteristics of students at different performance levels, the most effective DM technique for performance prediction, and the relationship between the two subjects. The study used archival data of students who were 16 years old in 2019 and sat for the Malaysian Certificate of Education (MCE) in 2021. The learning of English and Mathematics is a concern in many countries. Three main factors, namely students' past academic performance, demographics, and psychological attributes, were scrutinized to identify their impact on the prediction. The DM process was carried out in the Orange software. Decision Tree (DT) rules were used to determine the characteristics of students with low, moderate, and high performance in English and Mathematics. The DT and Naïve Bayes (NB) techniques showed the best predictive performance for English and Mathematics, respectively. Such characteristics and predictions may cue appropriate interventions to improve students' performance in these subjects. The study identified students' past academic performance as the most critical predictor, alongside a few demographic and psychological attributes. By examining the top predictors derived from four different classifier types, the study found that students' past Mathematics performance predicts their MCE English performance and that their past English performance predicts their MCE Mathematics performance. This finding shows that students' performances in the two subjects are interrelated.

Keywords: Data mining techniques; Educational data mining; English; Mathematics; Performance prediction; Secondary education.
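
As a rough illustration of the kind of classifier the abstract refers to, the following is a minimal sketch of a categorical Naïve Bayes model in plain Python. It is not the authors' Orange workflow. The attribute names, the three performance levels used as feature values, and the nine toy records are all hypothetical, invented only to show how past performance in one subject can feed a prediction for the other.

```python
import math
from collections import Counter, defaultdict


def train_nb(rows, labels):
    """Count class and per-feature value frequencies for a categorical Naive Bayes model."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (feature index, label) -> value frequencies
    feature_values = defaultdict(set)     # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            feature_values[i].add(value)
    return class_counts, value_counts, feature_values


def predict_nb(model, row):
    """Return the label with the highest log-posterior, using Laplace (add-one) smoothing."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for i, value in enumerate(row):
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(feature_values[i])
            score += math.log(numerator / denominator)  # log likelihood of this value
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy records: (past Mathematics level, past English level).
rows = [
    ("high", "high"), ("high", "high"), ("high", "moderate"),
    ("moderate", "moderate"), ("moderate", "high"), ("moderate", "moderate"),
    ("low", "low"), ("low", "moderate"), ("low", "low"),
]
# Hypothetical target: later English performance level for each record.
labels = ["high", "high", "high",
          "moderate", "moderate", "moderate",
          "low", "low", "low"]

model = train_nb(rows, labels)
print(predict_nb(model, ("high", "high")))
print(predict_nb(model, ("low", "low")))
```

In this toy data the past Mathematics level alone separates the three classes, which loosely mirrors the cross-subject finding reported in the abstract; real data would need a held-out evaluation before any such conclusion.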

Conflict of interest statement

There is no potential conflict of interest in this study.

Figures

Fig. 1. Research framework
Fig. 2. DM model development process
Fig. 3. Predicting students' performance process using DT
Fig. 4. Tree diagram of students with low English performance
Fig. 5. Tree diagram of students with moderate English performance
Fig. 6. Tree diagram of students with high English performance
Fig. 7. Tree diagram of students with low Mathematics performance
Fig. 8. Tree diagram of students with moderate Mathematics performance
Fig. 9. Tree diagram of students with high Mathematics performance
Fig. 10. Process of identifying main predictors using DT in Orange
Fig. 11. Process of identifying main predictors using NN in Orange
Fig. 12. Process of identifying main predictors using SVM in Orange
Fig. 13. Process of identifying main predictors using NB in Orange
