Abstract / Description of output
Entity-oriented search systems often learn vector representations of entities from the introductory paragraph of each entity's Wikipedia page. Because such representations are the same for every query, we hypothesize that they are not ideal for information retrieval (IR) tasks. In this work, we present BERT Entity Representations (BERT-ER), which are query-specific vector representations of entities obtained from text that describes how an entity is relevant to a query. Using BERT-ER in a downstream entity ranking system, we achieve a 13-42% improvement in Mean Average Precision on two large-scale test collections over a system that uses the BERT embedding of the introductory paragraph from Wikipedia. Our approach also outperforms entity ranking systems that use entity embeddings from Wikipedia2Vec, ERNIE, and E-BERT. We show that our entity ranking system using BERT-ER can increase precision at the top of the ranking by promoting relevant entities to the top. With this work, we release our BERT models and query-specific entity embeddings fine-tuned for the entity ranking task.
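The improvement reported above is stated in Mean Average Precision (MAP). As a rough illustration of how that metric and a relative improvement over a baseline could be computed, here is a minimal sketch. It is not the authors' evaluation code: the function names, the query and entity IDs, and the toy rankings are all hypothetical, and the numbers it produces have no connection to the paper's results.

```python
from typing import Dict, List, Set


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked entity list against its relevant set."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, entity in enumerate(ranking, start=1):
        if entity in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Mean of per-query average precision over all judged queries."""
    if not qrels:
        return 0.0
    total = sum(average_precision(runs.get(q, []), rel) for q, rel in qrels.items())
    return total / len(qrels)


def relative_improvement(new: float, baseline: float) -> float:
    """Relative improvement of `new` over `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline


# Hypothetical toy data: relevance judgments and two rankings per query.
qrels = {"q1": {"e1", "e3"}, "q2": {"e2"}}
baseline_run = {"q1": ["e2", "e1", "e3"], "q2": ["e1", "e2"]}
reranked_run = {"q1": ["e1", "e3", "e2"], "q2": ["e2", "e1"]}

baseline_map = mean_average_precision(baseline_run, qrels)
reranked_map = mean_average_precision(reranked_run, qrels)
gain = relative_improvement(reranked_map, baseline_map)
print(f"baseline MAP={baseline_map:.4f}, reranked MAP={reranked_map:.4f}, gain={gain:.1f}%")
```

In this toy setup the second run places every relevant entity ahead of the non-relevant ones, so its average precision is higher for both queries. This is the same effect the abstract describes as promoting relevant entities to the top of the ranking.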
Original language | English |
---|---|
Title of host publication | SIGIR 2022 - Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval |
Publisher | Association for Computing Machinery, Inc |
Pages | 1466-1477 |
Number of pages | 12 |
ISBN (Electronic) | 9781450387323 |
DOIs | |
Publication status | Published - 6 Jul 2022 |
Event | 45th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2022 - Madrid, Spain Duration: 11 Jul 2022 → 15 Jul 2022 |
Conference
Conference | 45th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2022 |
---|---|
Country/Territory | Spain |
City | Madrid |
Period | 11/07/22 → 15/07/22 |
Keywords / Materials (for Non-textual outputs)
- BERT
- entity ranking
- query-specific entity representations