Abstract
Referential ambiguities arise in dialogue when a referring expression does not uniquely identify the intended referent for the addressee. Addressees usually detect such ambiguities immediately and work with the speaker to repair them using meta-communicative Clarificational Exchanges (CEs): a Clarification Request (CR) and a response. Here, we argue that the ability to generate and respond to CRs imposes specific constraints on the architecture and objective functions of multi-modal, visually grounded dialogue models. We use the SIMMC 2.0 dataset to evaluate the ability of different state-of-the-art model architectures to process CEs, with a metric that probes the contextual updates that CEs produce in the model. We find that language-based models are able to encode simple multi-modal semantic information and process some CEs, excelling with those related to the dialogue history, whilst multi-modal models can use additional learning objectives to obtain disentangled object representations, which are crucial for handling complex referential ambiguities across modalities.
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue |
| Publisher | Association for Computational Linguistics |
| Pages | 175–182 |
| Number of pages | 8 |
| ISBN (Electronic) | 9798891760288 |
| Publication status | Published - 11 Sept 2023 |
| Event | The 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue - OREA Hotel Pyramida, Prague, Czech Republic. Duration: 11 Sept 2023 → 15 Sept 2023. Conference number: 24. https://sigdialinlg2023.github.io/index.html |
Conference
| Conference | The 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue |
|---|---|
| Abbreviated title | SIGDial 2023 |
| Country/Territory | Czech Republic |
| City | Prague |
| Period | 11/09/23 → 15/09/23 |
| Internet address | |
Prizes
Best Short Paper
Garcia, J. C. (Recipient), Suglia, A. (Recipient), Eshghi, A. (Recipient) & Hastie, H. (Recipient), 3 Oct 2023
Prize: Prize (including medals and awards)