Feature-Based Selection of Dependency Paths in Ad Hoc Information Retrieval

K. Tamsin Maxwell, Jon Oberlander, W. Bruce Croft

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Techniques that compare short text segments using dependency paths (or simply, paths) appear in a wide range of automated language processing applications including question answering (QA). However, few models in ad hoc information retrieval (IR) use paths for document ranking due to the prohibitive cost of parsing a retrieval collection. In this paper, we introduce a flexible notion of paths that describe chains of words on a dependency path. These chains, or catenae, are readily applied in standard IR models. Informative catenae are selected using supervised machine learning with linguistically informed features and compared to both non-linguistic terms and catenae selected heuristically with filters derived from work on paths. Automatically selected catenae of 1-2 words deliver significant performance gains on three TREC collections.
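As a rough illustration of what "catenae of 1-2 words" could look like as candidate terms, the sketch below enumerates them from a toy dependency parse. It assumes a two-word catena is a head word together with one of its direct dependents; the function name `catenae_up_to_two`, the example query, and the hand-written head indices are all hypothetical and not taken from the paper. The supervised, feature-based selection step described in the abstract is not reproduced here.

```python
from typing import List, Optional, Tuple


def catenae_up_to_two(
    words: List[str], heads: List[Optional[int]]
) -> List[Tuple[str, ...]]:
    """Enumerate candidate catenae of one or two words.

    heads[i] is the index of the head of words[i], or None for the root.
    Assumption: a two-word catena is a (head, dependent) pair joined by
    a single dependency edge.
    """
    if len(words) != len(heads):
        raise ValueError("words and heads must have the same length")
    # Every single word is a one-word catena.
    singles = [(w,) for w in words]
    # Every dependency edge yields one two-word catena.
    pairs = [(words[h], words[i]) for i, h in enumerate(heads) if h is not None]
    return singles + pairs


# Hypothetical query with a hand-written parse (not parser output):
# "flights" is the root; "cheap" and "to" depend on "flights";
# "Paris" depends on "to".
query_words = ["cheap", "flights", "to", "Paris"]
query_heads = [1, None, 1, 2]
candidates = catenae_up_to_two(query_words, query_heads)
```

Under this assumption, "cheap" and "Paris" never form a pair because no single dependency edge links them, even though they are neighbours of the same subtree. In a retrieval setting, a selection step would then decide which of these candidates to keep as query terms.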
Original language: English
Title of host publication: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Place of Publication: Sofia, Bulgaria
Publisher: Association for Computational Linguistics
Pages: 507-516
Number of pages: 10
Publication status: Published - 1 Aug 2013
