A Multi-strategy Learning Approach to Competitor Identification

Tong Ruan, Yeli Lin, Haofen Wang, Jeff Z. Pan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Competitor identification aims to find the competitors of an entity in a given field and is key to successful market intelligence. Manually collecting competitors is labor-intensive and time-consuming, so automatic approaches have been proposed for this purpose. However, these approaches face two main challenges. First, competitor information is not only contained in semi-structured sources such as lists or tables but is also mentioned in free text. The diversity of these sources makes competitor identification quite difficult. Second, competitors do not always occur in the form of their full names. The occurrence of name variants further increases the diversity and makes the task more challenging. In this paper, we propose a novel unsupervised approach to identify competitors from prospectuses based on a multi-strategy learning algorithm. More precisely, we first extract competitors from lists using predefined heuristic rules. By leveraging redundancies among competitor information in lists, tables, and texts, these competitors are fed as seeds to distantly supervise the learning process, which finds table columns and text patterns containing competitors. The whole process is performed iteratively. In each iteration, the newly discovered high-confidence competitors from the various sources are treated as new seeds for bootstrapping. The experimental results show the effectiveness of our approach without human intervention or external knowledge bases. Moreover, the approach significantly outperforms traditional named entity recognition approaches.
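The iterative process described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `bootstrap`, the `min_overlap` threshold, the two-word left-context patterns, the capitalized-name regular expression, and the toy company names are all assumptions made for illustration. The sketch only shows the general shape of seed-driven bootstrapping across lists, tables, and free text.

```python
import re

def bootstrap(list_items, tables, sentences, min_overlap=0.5, max_iters=5):
    """Toy bootstrapping loop; every heuristic and threshold here is an assumption."""
    # Step 1: seeds from lists (assumed rule: each non-empty list item is a competitor).
    known = {item.strip() for item in list_items if item.strip()}
    for _ in range(max_iters):
        found = set()
        # Step 2: accept a table column when enough of its cells are known competitors.
        for table in tables:
            for column in zip(*table):  # rows -> columns
                cells = [c for c in column if c]
                if cells and sum(c in known for c in cells) / len(cells) >= min_overlap:
                    found.update(cells)
        # Step 3: induce text patterns from the two words preceding a known name.
        patterns = set()
        for s in sentences:
            for name in known:
                if name in s:
                    left = s.split(name)[0].split()[-2:]
                    if len(left) == 2:
                        patterns.add(" ".join(left))
        # Apply each pattern to extract the capitalized name that follows it.
        for s in sentences:
            for p in patterns:
                m = re.search(re.escape(p) + r"\s+([A-Z]\w+(?: [A-Z]\w+)*)", s)
                if m:
                    found.add(m.group(1))
        # Step 4: newly found names become seeds for the next iteration.
        new = found - known
        if not new:
            break
        known |= new
    return known

# Hypothetical toy data, invented for this example.
result = bootstrap(
    list_items=["Acme Corp", "Globex"],
    tables=[[("Acme Corp", "2010"), ("Initech", "2012")]],
    sentences=[
        "We compete with Globex in the retail market.",
        "We also compete with Umbrella in logistics.",
    ],
)
print(sorted(result))  # ['Acme Corp', 'Globex', 'Initech', 'Umbrella']
```

In this toy run, "Initech" is picked up because it shares a table column with a seed, and "Umbrella" because it follows the same "compete with" context as a seed. A realistic system would also need a confidence score for each candidate and handling of name variants, both of which the sketch leaves out.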
Original language: English
Title of host publication: Semantic Technology
Subtitle of host publication: 4th Joint International Conference, JIST 2014, Chiang Mai, Thailand, November 9-11, 2014. Revised Selected Papers
Editors: Thepchai Supnithi, Takahira Yamaguchi, Jeff Z. Pan, Vilas Wuwongse, Marut Buranarach
Place of Publication: Cham
Publisher: Springer
Pages: 197-212
Number of pages: 16
ISBN (Electronic): 978-3-319-15615-6
ISBN (Print): 978-3-319-15614-9
Publication status: Published - 21 Feb 2015
Event: 4th Joint International Conference on Semantic Technology, JIST 2014 - Chiang Mai, Thailand
Duration: 9 Nov 2014 - 11 Nov 2014

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer, Cham
Volume: 8943
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 4th Joint International Conference on Semantic Technology, JIST 2014
Country/Territory: Thailand
City: Chiang Mai
Period: 9/11/14 - 11/11/14

Keywords / Materials (for Non-textual outputs)

  • Competitor mining
  • Unsupervised learning
  • Distant supervision
  • Wrapper induction
