VL-Fields: Towards Language-Grounded Neural Implicit Spatial Representations

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract / Description of output

We present Visual-Language Fields (VL-Fields), a neural implicit spatial representation that enables open-vocabulary semantic queries. Our model encodes and fuses the geometry of a scene with vision-language trained latent features by distilling information from a language-driven segmentation model. VL-Fields is trained without requiring any prior knowledge of the scene object classes, which makes it a promising representation for the field of robotics. Our model outperformed the similar CLIP-Fields model in the task of semantic segmentation by almost 10%.
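As an illustration of what an open-vocabulary semantic query over such a representation might look like, the sketch below labels each queried 3D point with the free-form text query whose embedding is most similar to the point's feature vector. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `open_vocab_labels`, the use of cosine similarity, and the toy arrays standing in for field outputs and text embeddings are all hypothetical and do not come from the abstract.

```python
import numpy as np


def open_vocab_labels(point_features, text_embeddings):
    """Return, for each point, the index of the most similar text query.

    point_features:  (N, D) array, one feature vector per queried 3D point
                     (assumed here to come from an implicit field).
    text_embeddings: (K, D) array, one embedding per text query
                     (assumed here to come from a vision-language text encoder).
    """
    # Normalise rows so that the dot product equals cosine similarity.
    p = point_features / np.linalg.norm(point_features, axis=1, keepdims=True)
    t = text_embeddings / np.linalg.norm(text_embeddings, axis=1, keepdims=True)
    similarity = p @ t.T  # shape (N, K)
    return similarity.argmax(axis=1)


# Toy stand-ins (hypothetical values): two text queries, three points.
text_embeddings = np.array([[1.0, 0.0, 0.0],
                            [0.0, 1.0, 0.0]])
point_features = np.array([[0.9, 0.1, 0.0],
                           [0.2, 2.0, 0.1],
                           [3.0, 0.5, 0.5]])
labels = open_vocab_labels(point_features, text_embeddings)
print(labels)
```

Because the text queries are embedded at query time, the label set in a scheme like this is not fixed in advance, which is consistent with the abstract's statement that no prior knowledge of the scene object classes is required.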
Original language: English
Title of host publication: Workshop on Representations, Abstractions, and Priors for Robot Learning Workshop at International Conference on Robotics and Automation 2023
Pages: 1-6
Number of pages: 6
Publication status: Published - 29 May 2023
Event: 2023 IEEE International Conference on Robotics and Automation - London, United Kingdom
Duration: 29 May 2023 to 2 Jun 2023
https://www.icra2023.org

Conference

Conference: 2023 IEEE International Conference on Robotics and Automation
Abbreviated title: ICRA 2023
Country/Territory: United Kingdom
City: London
Period: 29/05/23 to 2/06/23
