Abstract
Misinformation detection models degrade in performance over time, but the precise causes of this degradation remain under-researched, particularly for multimodal models. We present experiments investigating the impact of temporal shift on the performance of multimodal automatic misinformation detection classifiers. Working with the r/Fakeddit dataset, we found that evaluating models on temporally out-of-domain data (i.e. data from time periods unseen in training) results in a non-linear 7-8% drop in macro F1 compared to traditional evaluation strategies (which do not control for the effect of content change over time). Focusing on two factors that make temporal generalizability in misinformation detection difficult, content shift and class distribution shift, we found that content shift has the stronger effect on recall. Within the context of coarse-grained vs. fine-grained misinformation detection with r/Fakeddit, we found that certain misinformation classes appear more stable under content shift (e.g. Manipulated and Misleading Content). Our results indicate that future research efforts need to explicitly account for the temporal nature of misinformation to ensure that experiments reflect expected real-world performance.
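As an illustration of the temporal out-of-domain evaluation described above, the sketch below splits a dataset at a cutoff date (train on earlier posts, test on later ones) and computes macro F1 from scratch. The `timestamp` field, the cutoff date, and the label set are illustrative assumptions for the sketch, not the paper's exact experimental setup.

```python
from datetime import datetime

def macro_f1(y_true, y_pred, labels):
    """Macro F1: the unweighted mean of per-class F1 scores."""
    f1s = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

def temporal_split(examples, cutoff):
    """Train on posts before the cutoff date, test on posts from after it,
    so the test set comes from a time period unseen during training."""
    train = [e for e in examples if e["timestamp"] < cutoff]
    test = [e for e in examples if e["timestamp"] >= cutoff]
    return train, test
```

A traditional (random) split would mix time periods across train and test; comparing macro F1 under `temporal_split` against such a random split is what surfaces the temporal performance drop.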
Original language | English |
---|---|
Title of host publication | Proceedings - The first workshop on (benchmarking) generalisation in NLP |
Publisher | Association for Computational Linguistics |
Pages | 76–88 |
Number of pages | 13 |
ISBN (Electronic) | 979-8-89176-042-4 |
DOIs | |
Publication status | Published - 6 Dec 2023 |
Event | The first workshop on (benchmarking) generalisation in NLP - Singapore, Singapore; Duration: 6 Dec 2023 → …; Conference number: 1; https://genbench.org/workshop/ |
Workshop
Workshop | The first workshop on (benchmarking) generalisation in NLP |
---|---|
Abbreviated title | GenBench 2023 |
Country/Territory | Singapore |
City | Singapore |
Period | 6/12/23 → … |
Internet address | https://genbench.org/workshop/ |