Abstract / Description of output
Referential ambiguities arise in dialogue when a referring expression does not uniquely identify the intended referent for the addressee. Addressees usually detect such ambiguities immediately and work with the speaker to repair them through meta-communicative Clarificational Exchanges (CEs): a Clarification Request (CR) followed by a response. Here, we argue that the ability to generate and respond to CRs imposes specific constraints on the architecture and objective functions of multi-modal, visually grounded dialogue models. We use the SIMMC 2.0 dataset to evaluate the ability of different state-of-the-art model architectures to process CEs, with a metric that probes the contextual updates that arise from them in the model. We find that language-based models are able to encode simple multi-modal semantic information and process some CEs, excelling with those related to the dialogue history, whilst multi-modal models can use additional learning objectives to obtain disentangled object representations, which become crucial for handling complex referential ambiguities across modalities.
Original language | English |
---|---|
Title of host publication | Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue |
Publisher | Association for Computational Linguistics |
Pages | 175–182 |
Number of pages | 8 |
ISBN (Electronic) | 9798891760288 |
Publication status | Published - 11 Sept 2023 |
Event | The 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue (Conference number: 24), OREA Hotel Pyramida, Prague, Czech Republic. Duration: 11 Sept 2023 → 15 Sept 2023. https://sigdialinlg2023.github.io/index.html |
Conference

Conference | The 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue |
---|---|
Abbreviated title | SIGDial 2023 |
Country/Territory | Czech Republic |
City | Prague |
Period | 11/09/23 → 15/09/23 |
Internet address | https://sigdialinlg2023.github.io/index.html |
Fingerprint

Dive into the research topics of "'What are you referring to?' Evaluating the ability of multi-modal dialogue models to process clarificational exchanges". Together they form a unique fingerprint.

Prizes
- Best Short Paper
  Garcia, Javier Chiyah (Recipient), Suglia, Alessandro (Recipient), Eshghi, Arash (Recipient) & Hastie, Helen (Recipient), 3 Oct 2023
  Prize: Prize (including medals and awards)