Abstract
Weakly supervised object localization (WSOL) aims to learn representations that encode object location using only image-level category labels. However, many objects can be labeled at different levels of granularity. Is it an animal, a bird, or a great horned owl? Which image-level labels should we use? In this paper we study the role of label granularity in WSOL. To facilitate this investigation we introduce iNatLoc500, a new large-scale fine-grained benchmark dataset for WSOL. Surprisingly, we find that choosing the right training label granularity provides a much larger performance boost than choosing the best WSOL algorithm. We also show that changing the label granularity can significantly improve data efficiency.
Original language | English |
---|---|
Title of host publication | Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part X |
Editors | Shai Avidan, Gabriel Brostow, Moustapha Cissé, Giovanni Maria Farinella, Tal Hassner |
Publisher | Springer |
Pages | 604-620 |
Number of pages | 33 |
ISBN (Electronic) | 978-3-031-20080-9 |
ISBN (Print) | 978-3-031-20079-3 |
Publication status | Published - 3 Nov 2022 |
Event | European Conference on Computer Vision 2022, Tel Aviv, Israel. Duration: 23 Oct 2022 → 27 Oct 2022. https://eccv2022.ecva.net/ |
Publication series
Name | Lecture Notes in Computer Science |
---|---|
Publisher | Springer Cham |
Volume | 13670 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | European Conference on Computer Vision 2022 |
---|---|
Abbreviated title | ECCV 2022 |
Country/Territory | Israel |
City | Tel Aviv |
Period | 23/10/22 → 27/10/22 |