Edinburgh Research Explorer

Analysing discussion forum data: a replication study avoiding data contamination

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Original language: English
Title of host publication: Proceedings of the 9th International Learning Analytics & Knowledge Conference (LAK-19)
Place of Publication: Tempe, Arizona, USA
Number of pages: 10
ISBN (Electronic): 978-1-4503-6256-6
Publication status: E-pub ahead of print - 4 Mar 2019
Event: 9th International Learning Analytics & Knowledge Conference - Tempe, United States
Duration: 4 Mar 2019 to 8 Mar 2019


Conference: 9th International Learning Analytics & Knowledge Conference
Abbreviated title: LAK19
Country: United States


The widespread use of online discussion forums in educational settings provides a rich source of data for researchers interested in how collaboration and interaction can foster effective learning. Such online behaviour can be understood through the Community of Inquiry framework, and the cognitive presence construct in particular can be used to characterise the depth of a student’s critical engagement with course material. Automated methods have been developed to support this task, but many studies used small data sets, and there have been few replication studies.
In this work, we present findings related to the robustness and generalisability of automated classification methods for detecting cognitive presence in discussion forum transcripts. We closely examined one published state-of-the-art model, comparing different approaches to managing unbalanced classes in the data. By demonstrating how commonly used data preprocessing practices can lead to over-optimistic results, we contribute to the development of the field so that the results of automated content analysis can be used with confidence.
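
The abstract does not name the specific preprocessing practice involved. As one illustration of how class rebalancing might contaminate an evaluation, the sketch below assumes the practice in question is oversampling the minority class before the train/test split, and compares it with oversampling the training portion only. The toy data, the simple duplication-based oversampler, and all function names are hypothetical and are not taken from the paper.

```python
import random

def oversample(rows, rng):
    """Duplicate randomly chosen rows so every class matches the largest class."""
    by_label = {}
    for row in rows:
        by_label.setdefault(row[1], []).append(row)
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced

def split(rows, rng, test_fraction=0.25):
    """Shuffle a copy of the rows and cut it into (train, test)."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def leaked(train, test):
    """Count test rows whose message id also occurs in the training set."""
    train_ids = {row[0] for row in train}
    return sum(1 for row in test if row[0] in train_ids)

rng = random.Random(0)

# Hypothetical toy corpus of (message_id, label): 90 majority, 10 minority.
data = [(i, "majority") for i in range(90)]
data += [(i, "minority") for i in range(90, 100)]

# Assumed contaminated order: rebalance first, then split.
train_a, test_a = split(oversample(data, rng), rng)

# Assumed clean order: split first, then rebalance the training set only.
train_b, test_b = split(data, rng)
train_b = oversample(train_b, rng)

leak_before = leaked(train_a, test_a)
leak_after = leaked(train_b, test_b)
print("test rows also seen in training (balance then split):", leak_before)
print("test rows also seen in training (split then balance):", leak_after)
```

In the first ordering, copies of the same minority-class message can land on both sides of the split, so a classifier is partly scored on items it was trained on. In the second ordering the test set contains only messages that were never used for training, so the leakage count is zero by construction.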




ID: 77902653