Weakly supervised domain detection

Research output: Contribution to journal › Article › peer-review

Abstract

In this paper we introduce domain detection as a new natural language processing task. We argue that the ability to detect textual segments that are domain-heavy (i.e., sentences or phrases that are representative of and provide evidence for a given domain) could enhance the robustness and portability of various text classification applications. We propose an encoder-detector framework for domain detection and bootstrap classifiers with multiple instance learning. The model is hierarchically organized and suited to multilabel classification. We demonstrate that despite learning with minimal supervision, our model can be applied to text spans of different granularities, languages, and genres. We also showcase the potential of domain detection for text summarization.
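As an illustration of the multiple instance learning idea mentioned in the abstract, the sketch below treats a document as a bag of sentences: only document-level domain labels are available, yet sentence-level scores fall out of the aggregation. It is a minimal sketch under stated assumptions, not the paper's encoder-detector model. The max-pooling aggregator, the 0.5 threshold, and all function names are hypothetical choices made for this example, and the per-sentence logits are assumed to come from some encoder that is not shown.

```python
import math


def sigmoid(x):
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def document_scores(sentence_logits):
    """Aggregate per-sentence domain logits into per-document probabilities.

    Assumption: max-pooling, i.e. a document belongs to a domain if at
    least one of its sentences does. Each domain is scored independently,
    which makes the output multilabel.
    """
    num_domains = len(sentence_logits[0])
    return [
        max(sigmoid(sentence[d]) for sentence in sentence_logits)
        for d in range(num_domains)
    ]


def domain_heavy_sentences(sentence_logits, domain, threshold=0.5):
    """Return indices of sentences whose probability for `domain` meets the threshold."""
    return [
        i
        for i, sentence in enumerate(sentence_logits)
        if sigmoid(sentence[domain]) >= threshold
    ]


def multilabel_bce(doc_probs, doc_labels):
    """Binary cross-entropy averaged over domains, using document labels only."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for p, y in zip(doc_probs, doc_labels):
        total += y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return -total / len(doc_labels)
```

Given logits for three sentences and two domains, `document_scores` yields one probability per domain, `multilabel_bce` compares those against document-level labels during training, and `domain_heavy_sentences` reads off the sentence-level detections at inference time.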
Original language: English
Pages (from-to): 581-596
Number of pages: 16
Journal: Transactions of the Association for Computational Linguistics
Volume: 7
Publication status: Published - 1 Sept 2019
