Mining students’ data to identify at-risk students in an academic English course: A comparison of two classification techniques from a language teacher and statistical perspective

Abstract

In this study, mixed methods were used to explore the effectiveness of data-mining techniques from statistical and language teacher perspectives. This study is important because data-mining techniques have seldom been compared in the higher education language-learning context. In addition, few previous comparison studies have considered the perspective of language teachers, the ultimate users of the results. This study used a data set of more than 5,000 students from two academic courses offered at a university in Hong Kong, and adopted two commonly used data-mining techniques: classification tree and logistic regression analysis. This quantitative analysis explored the suitability of the data-mining techniques. To understand the language teacher perspective on these techniques, the results were presented to a group of 16 professional English teachers to check whether they thought the results were useful. The findings showed that, despite satisfactory results from both data-mining techniques, the teachers were very hesitant to use them. The teachers’ resistance stemmed from their doubts about the techniques and about the applicability of these techniques in the language education context. Further research should be conducted to promote these techniques to language teachers.

Copyright of articles is retained by authors and CALL-EJ. As CALL-EJ is an open-access journal, articles are free to use, with proper attribution, in educational and other non-commercial settings. Sources must be acknowledged appropriately.