TY - GEN
T1 - Prompting as Probing: Using Language Models for Knowledge Base Construction
AU - Alivanistos, Dimitrios
AU - Báez Santamaría, Selene
AU - Cochez, Michael
AU - Kalo, Jan Christoph
AU - van Krieken, Emile
AU - Thanapalasingam, Thiviyan
N1 - Funding Information: We thank Frank van Harmelen for his insightful comments. This research was funded by the Vrije Universiteit Amsterdam and the Netherlands Organisation for Scientific Research (NWO) via the Spinoza grant (SPI 63-260) awarded to Piek Vossen, the Hybrid Intelligence Centre via the Zwaartekracht grant (024.004.022), Elsevier’s Discovery Lab, and Huawei’s DReaMS Lab. Publisher Copyright: © 2022 Copyright for this paper by its authors.; 2022 Semantic Web Challenge on Knowledge Base Construction from Pre-Trained Language Models, LM-KBC 2022 ; Conference date: 01-10-2022
PY - 2022/11/16
Y1 - 2022/11/16
N2 - Language Models (LMs) have proven to be useful in various downstream applications, such as summarisation, translation, question answering and text classification. LMs are becoming increasingly important tools in Artificial Intelligence, because of the vast quantity of information they can store. In this work, we present ProP (Prompting as Probing), which utilizes GPT-3, a large Language Model originally proposed by OpenAI in 2020, to perform the task of Knowledge Base Construction (KBC). ProP implements a multi-step approach that combines a variety of prompting techniques to achieve this. Our results show that manual prompt curation is essential, that the LM must be encouraged to give answer sets of variable lengths, in particular including empty answer sets, that true/false questions are a useful device to increase precision on suggestions generated by the LM, that the size of the LM is a crucial factor, and that a dictionary of entity aliases improves the LM score. Our evaluation study indicates that these proposed techniques can substantially enhance the quality of the final predictions: ProP won track 2 of the LM-KBC competition, outperforming the baseline by 36.4 percentage points. Our implementation is available on https://github.com/HEmile/iswc-challenge.
AB - Language Models (LMs) have proven to be useful in various downstream applications, such as summarisation, translation, question answering and text classification. LMs are becoming increasingly important tools in Artificial Intelligence, because of the vast quantity of information they can store. In this work, we present ProP (Prompting as Probing), which utilizes GPT-3, a large Language Model originally proposed by OpenAI in 2020, to perform the task of Knowledge Base Construction (KBC). ProP implements a multi-step approach that combines a variety of prompting techniques to achieve this. Our results show that manual prompt curation is essential, that the LM must be encouraged to give answer sets of variable lengths, in particular including empty answer sets, that true/false questions are a useful device to increase precision on suggestions generated by the LM, that the size of the LM is a crucial factor, and that a dictionary of entity aliases improves the LM score. Our evaluation study indicates that these proposed techniques can substantially enhance the quality of the final predictions: ProP won track 2 of the LM-KBC competition, outperforming the baseline by 36.4 percentage points. Our implementation is available on https://github.com/HEmile/iswc-challenge.
UR - https://ceur-ws.org/Vol-3274/
M3 - Conference contribution
VL - 3274
T3 - CEUR Workshop Proceedings
SP - 11
EP - 34
BT - LM-KBC 2022 Knowledge Base Construction from Pre-trained Language Models 2022
A2 - Singhania, Sneha
A2 - Nguyen, Tuan-Phong
A2 - Razniewski, Simon
PB - CEUR-WS.org
T2 - Knowledge Base Construction from Pre-trained Language Models 2022
Y2 - 1 October 2022
ER -