Abstract / Description of output
The widespread use of online discussion forums in educational settings of all kinds provides a rich source of data for researchers interested in how collaboration and interaction can foster effective learning. Online behaviour can be understood through the Community of Inquiry framework, and the cognitive presence construct in particular can be used to characterise the depth of a student’s critical engagement with the course material. Automated methods have been developed to support this task, but many studies used very small data sets, and there have been few replication studies. Furthermore, some of the classification features that were used in prior work depended on an external knowledge base, limiting their applicability in other domains and language contexts.
In this work, we present findings related to the robustness and generalisability of automated classification methods for detecting cognitive presence in discussion forum transcripts. We closely examined one published state-of-the-art model, comparing different approaches to managing unbalanced classes. We derived new classification features using natural language processing techniques. The explanatory power of the individual features was analysed and compared with prior work.
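One commonly cited way that preprocessing of unbalanced data can inflate results is resampling the whole data set before splitting it into training and test sets, which leaks duplicated minority-class messages into the evaluation fold. As a generic illustration only (the function and toy data below are invented, not taken from the study), here is a minimal random-oversampling helper; in practice it would be applied to the training fold alone, after the split:

```python
from collections import Counter

import numpy as np


def random_oversample(X, y, seed=0):
    """Duplicate minority-class rows until every class matches the largest one.

    Hypothetical helper for illustration; apply it only to the training
    fold, never before the train/test split, to avoid leakage.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    counts = Counter(y.tolist())
    target = max(counts.values())
    keep = []
    for label, n in counts.items():
        label_idx = np.flatnonzero(y == label)
        # Sample extra copies (with replacement) to reach the majority count.
        extra = rng.choice(label_idx, size=target - n, replace=True)
        keep.extend(label_idx.tolist() + extra.tolist())
    keep = np.asarray(keep)
    return np.asarray(X)[keep], y[keep]


# Toy data: 6 posts of one class vs 2 of another (invented labels).
X = np.arange(8).reshape(8, 1)
y = np.array(["other"] * 6 + ["integration"] * 2)

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal.tolist()))  # both classes now have 6 examples
```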
By demonstrating how commonly-used data preprocessing practices can lead to over-optimistic results, we contribute to the development of the field so that the results of automated content analysis can be used with confidence. We also show that topic modelling can be used to generate new features with explanatory power.
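The idea of deriving classification features from topic modelling can be sketched generically. The snippet below (not the study's actual pipeline; the corpus, topic count, and parameters are invented for illustration) uses scikit-learn's LatentDirichletAllocation to turn each forum message into a topic-proportion vector that could be appended to other per-message features:

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Toy forum messages (invented; real input would be discussion transcripts).
posts = [
    "I think the algorithm fails on unbalanced classes",
    "Has anyone tried resampling the training data?",
    "The deadline for the assignment is next week",
    "Class imbalance makes the classifier biased",
    "When is the next tutorial session scheduled?",
]

vectoriser = CountVectorizer(stop_words="english")
counts = vectoriser.fit_transform(posts)  # bag-of-words term counts

lda = LatentDirichletAllocation(n_components=2, random_state=0)
topic_features = lda.fit_transform(counts)  # shape: (n_posts, n_topics)

# Each row is a probability distribution over topics; these per-message
# topic proportions can serve as additional classification features.
print(topic_features.shape)  # (5, 2)
```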
Original language | English |
---|---|
Publication status | Published - 2018 |