Abstract
Deep learning has achieved impressive results in camera localization, but current single-image techniques typically lack robustness, leading to large outliers. To some extent this has been addressed by sequential (multi-image) or geometry-constrained approaches, which can learn to reject dynamic objects and illumination changes to achieve better performance. In this work, we show that attention can be used to force the network to focus on more geometrically robust objects and features, achieving state-of-the-art performance on common benchmarks even when using only a single image as input. Extensive experimental evidence is provided on public indoor and outdoor datasets. Through visualization of the saliency maps, we demonstrate how the network learns to reject dynamic objects, yielding superior global camera-pose regression performance. The source code is available at https://github.com/BingCS/AtLoc.
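The abstract's core idea, self-attention reweighting image features before regressing a camera pose, can be sketched as follows. This is a minimal NumPy illustration, not the paper's architecture: the feature sizes, weight matrices, and the 7-dimensional pose head (3-D translation plus 4-D quaternion, a common pose-regression convention) are all illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(feats, Wq, Wk, Wv):
    """Reweight each spatial feature by its similarity to all others,
    so salient (geometrically stable) regions can dominate the
    pooled descriptor while dynamic regions are down-weighted."""
    Q, K, V = feats @ Wq, feats @ Wk, feats @ Wv
    attn = softmax(Q @ K.T / np.sqrt(K.shape[-1]))  # (N, N) attention map
    return attn @ V, attn

rng = np.random.default_rng(0)
N, D = 16, 8  # N spatial locations, D-dim features (illustrative sizes)
feats = rng.standard_normal((N, D))
Wq, Wk, Wv = (rng.standard_normal((D, D)) * 0.1 for _ in range(3))
attended, attn = self_attention(feats, Wq, Wk, Wv)

# Pose head (illustrative): global average pool, then a linear map
# to a 7-dim pose (translation + quaternion).
Wpose = rng.standard_normal((D, 7)) * 0.1
pose = attended.mean(axis=0) @ Wpose
print(pose.shape)                           # (7,)
print(np.allclose(attn.sum(axis=1), 1.0))   # True: attention rows sum to 1
```

In training, the attention map would be learned end-to-end from a pose-regression loss; the saliency visualizations in the paper show that this pushes weight away from dynamic objects.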
Original language | English |
---|---|
Title of host publication | Proceedings of the AAAI Conference on Artificial Intelligence 2020 |
Publisher | AAAI Press |
Pages | 10393-10401 |
Number of pages | 9 |
ISBN (Electronic) | 978-1-57735-835-0 |
DOIs | |
Publication status | Published - 3 Apr 2020 |
Event | 34th AAAI Conference on Artificial Intelligence - New York, United States |
Duration | 7 Feb 2020 → 12 Feb 2020 |
Conference number | 34 |
Internet address | https://aaai.org/Conferences/AAAI-19/ |
Publication series
Name | Proceedings of the AAAI Conference on Artificial Intelligence |
---|---|
Publisher | AAAI Press |
Number | 6 |
Volume | 34 |
ISSN (Print) | 2159-5399 |
ISSN (Electronic) | 2374-3468 |
Conference
Conference | 34th AAAI Conference on Artificial Intelligence |
---|---|
Abbreviated title | AAAI 2020 |
Country/Territory | United States |
City | New York |
Period | 7/02/20 → 12/02/20 |
Internet address | |