Detecting statements in text: A domain-agnostic few-shot solution

Sandrine Chausson, Björn Ross

Research output: Contribution to conference › Paper › peer-review

Abstract / Description of output

Many tasks related to Computational Social Science and Web Content Analysis involve classifying pieces of text based on the claims they contain. State-of-the-art approaches usually involve fine-tuning models on large annotated datasets, which are costly to produce. In light of this, we propose and release a qualitative and versatile few-shot learning methodology as a common paradigm for any claim-based textual classification task. This methodology involves defining the classes as arbitrarily sophisticated taxonomies of claims, and using Natural Language Inference models to obtain the textual entailment between these and a corpus of interest. The performance of these models is then boosted by annotating a minimal sample of data points, dynamically sampled using the well-established statistical heuristic of Probabilistic Bisection. We illustrate this methodology in the context of three tasks: climate change contrarianism detection, topic/stance classification, and depression-related symptom detection. This approach rivals traditional pre-train/fine-tune approaches while drastically reducing the need for data annotation.
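
The abstract does not spell out how Probabilistic Bisection is applied, so the following is only a minimal sketch of the general heuristic, not the authors' implementation. It assumes each text already has an entailment score in [0, 1] from some NLI model, that the goal is a single decision threshold on those scores, and that an annotator's label is correct with a fixed probability `p`. The function names, the grid resolution, and the `annotate` callback are all hypothetical.

```python
def _median(grid, posterior):
    """Return the grid point at which the cumulative posterior mass reaches 0.5."""
    cumulative = 0.0
    for t, w in zip(grid, posterior):
        cumulative += w
        if cumulative >= 0.5:
            return t
    return grid[-1]


def probabilistic_bisection(scores, annotate, p=0.8, budget=20, grid_size=1001):
    """Estimate a decision threshold on [0, 1] from a few (possibly noisy) labels.

    scores   -- entailment scores, one per text (assumed precomputed)
    annotate -- hypothetical callback: index -> True if the text is a positive
    p        -- assumed probability that a single annotation is correct (> 0.5)
    budget   -- maximum number of texts to annotate
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    posterior = [1.0 / grid_size] * grid_size  # uniform prior over the threshold
    unlabeled = list(range(len(scores)))

    for _ in range(min(budget, len(scores))):
        median = _median(grid, posterior)
        # Ask for a label on the unlabeled text whose score is closest to the median.
        idx = min(unlabeled, key=lambda i: abs(scores[i] - median))
        unlabeled.remove(idx)
        is_positive = annotate(idx)
        s = scores[idx]
        # A positive label suggests the threshold is at or below s;
        # a negative label suggests it is above s.
        for k, t in enumerate(grid):
            agrees = (t <= s) if is_positive else (t > s)
            posterior[k] *= p if agrees else (1.0 - p)
        total = sum(posterior)
        posterior = [w / total for w in posterior]

    return _median(grid, posterior)
```

Each annotation multiplies the posterior by `p` on the side of the queried score that the label supports and by `1 - p` on the other side, so the median moves toward the region where labels flip from negative to positive. In this sketch the annotation budget, rather than a convergence test, decides when to stop.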
Original language: English
Publication status: Accepted/In press - 26 Apr 2024
Event: NOCAPS - Networks and Opinions on Climate Action in the Public Sphere - Buffalo, United States
Duration: 3 Jun 2024 → 3 Jun 2024

Workshop

Workshop: NOCAPS - Networks and Opinions on Climate Action in the Public Sphere
Abbreviated title: NOCAPS 2024
Country/Territory: United States
City: Buffalo
Period: 3/06/24 → 3/06/24
