Named Entity Recognition without Gazetteers

Andrei Mikheev, Marc Moens, Claire Grover

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

It is often claimed that Named Entity recognition systems need extensive gazetteers (lists of names of people, organisations, locations, and other named entities). Indeed, the compilation of such gazetteers is sometimes mentioned as a bottleneck in the design of Named Entity recognition systems. We report on a Named Entity recognition system which combines rule-based grammars with statistical (maximum entropy) models. We report on the system's performance with gazetteers of different types and different sizes, using test material from the MUC-7 competition. We show that, for the text type and task of this competition, it is sufficient to use relatively small gazetteers of well-known names, rather than large gazetteers of low-frequency names. We conclude with observations about the domain independence of the competition and of our experiments.
Original language: English
Title of host publication: EACL 1999, 9th Conference of the European Chapter of the Association for Computational Linguistics, June 8-12, 1999, University of Bergen, Bergen, Norway
Pages: 1-8
Number of pages: 8
Publication status: Published - 1999
