SICK-BR: A Portuguese Corpus for Inference

Livy Real, Ana Rodrigues, Andressa Vieira e Silva, Beatriz Albiero, Bruna Thalenberg, Bruno Guide, Cindy Silva, Guilherme de Oliveira Lima, Igor C. S. Câmara, Milos Stanojevic, Rodrigo Souza, Valeria de Paiva

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


We describe SICK-BR, a Brazilian Portuguese corpus annotated with inference relations and semantic relatedness between pairs of sentences. SICK-BR is a translation and adaptation of the original SICK, a corpus of English sentences used in several semantic evaluations. SICK-BR consists of around 10k sentence pairs annotated for inference relations (neutral, contradiction, or entailment) and for semantic relatedness on a 5-point scale. Here we describe the strategies used for the adaptation of SICK, which preserve its original inference and relatedness labels in the Portuguese version. We also discuss some issues with the original corpus and how we might deal with them.
Original language: English
Title of host publication: Proceedings of the 13th International Conference of Computational Processing of the Portuguese Language (PROPOR 2018)
Place of Publication: Canela, Brazil
Publisher: Springer, Cham
Number of pages: 10
ISBN (Electronic): 978-3-319-99722-3
ISBN (Print): 978-3-319-99721-6
Publication status: E-pub ahead of print - 26 Aug 2018
Event: 13th International Conference on the Computational Processing of Portuguese - Canela, Brazil
Duration: 24 Sep 2018 to 26 Sep 2018

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer, Cham
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349
Name: Lecture Notes in Artificial Intelligence


Conference: 13th International Conference on the Computational Processing of Portuguese
Abbreviated title: PROPOR 2018

