Abstract
Precisely naming the action depicted in a video can be a challenging and oftentimes ambiguous task. In contrast to object instances represented as nouns (e.g. dog, cat, chair, etc.), in the case of actions, human annotators typically lack a consensus as to what constitutes a specific action (e.g. jogging versus running). In practice, a given video can contain multiple valid positive annotations for the same action. As a result, video datasets often contain significant levels of label noise and overlap between the atomic action classes. In this work, we address the challenge of training multi-label action recognition models from only single positive training labels. We propose two approaches that are based on generating pseudo training examples sampled from similar instances within the train set. Unlike other approaches that use model-derived pseudo-labels, our pseudo-labels come from human annotations and are selected based on feature similarity. To validate our approaches, we create a new evaluation benchmark by manually annotating a subset of EPIC-Kitchens-100's validation set with multiple verb labels. We present results on this new test set along with additional results on a new version of HMDB-51, called Confusing-HMDB-102, where we outperform existing methods in both cases.
Data and code are available at https://github.com/kiyoon/verb_ambiguity
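The abstract describes pseudo-labels that come from human annotations of similar training instances, selected by feature similarity. The snippet below is a minimal sketch of that general idea only, not the authors' two approaches, which the abstract does not detail. It assumes precomputed per-clip feature vectors, cosine similarity as the similarity measure, and a fixed neighbour count `k`; the function names, the toy features, and the verb classes are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def neighbour_pseudo_labels(features, labels, num_classes, k=2):
    """Build multi-hot targets from each clip's own single label plus the
    human-annotated labels of its k most similar training clips.

    Assumption: this union-of-neighbour-labels rule is illustrative and is
    not taken from the paper.
    """
    targets = []
    for i, feat in enumerate(features):
        # Rank every other training clip by similarity to clip i.
        ranked = sorted(
            (j for j in range(len(features)) if j != i),
            key=lambda j: cosine(feat, features[j]),
            reverse=True,
        )
        target = [0] * num_classes
        target[labels[i]] = 1  # the single observed positive label
        for j in ranked[:k]:
            target[labels[j]] = 1  # pseudo-positive borrowed from a neighbour
        targets.append(target)
    return targets


# Toy data (hypothetical): 2-D features, verbs 0 = "jog", 1 = "run", 2 = "open".
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [0, 1, 2, 2]
targets = neighbour_pseudo_labels(features, labels, num_classes=3, k=1)
print(targets)
```

In this toy case the "jog" and "run" clips sit close together in feature space, so each picks up the other's verb as an extra positive, while the two "open" clips keep a single label. The resulting multi-hot targets could then feed a standard multi-label loss.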
Original language | English |
---|---|
Title of host publication | Proceedings of The 33rd British Machine Vision Conference (BMVC 2022) |
Publisher | BMVA Press |
Number of pages | 18 |
Publication status | Published - 25 Nov 2022 |
Event | The 33rd British Machine Vision Conference, 2022 - London, United Kingdom; Duration: 21 Nov 2022 → 24 Nov 2022; Conference number: 33; https://www.bmvc2022.org/ |
Conference
Conference | The 33rd British Machine Vision Conference, 2022 |
---|---|
Abbreviated title | BMVC 2022 |
Country/Territory | United Kingdom |
City | London |
Period | 21/11/22 → 24/11/22 |
Internet address | https://www.bmvc2022.org/ |
Fingerprint
Dive into the research topics of 'An Action Is Worth Multiple Words: Handling Ambiguity in Action Recognition'. Together they form a unique fingerprint.
Projects
- 1 Finished
- Enabling advanced autonomy through human-AI collaboration
Fisher, B., Bilen, H., Keller, F., Lascarides, A., Mac Aodha, O., Mollica, F., N, S., Ramamoorthy, R., Rovatsos, M. & Sevilla-Lara, L.
1/10/21 → 30/06/22
Project: Research