Nested Named Entity Recognition in Historical Archive Text

Kate Byrne

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

This paper describes work on Named Entity Recognition (NER), in preparation for Relation Extraction (RE), on data from a historical archive organisation. As is often the case in the cultural heritage domain, the source text includes a high percentage of specialist terminology, and is of very variable quality in terms of grammaticality and completeness. The NER and RE tasks were carried out using a specially annotated corpus, and are themselves preliminary steps in a larger project whose aim is to transform discovered relations into a graph structure that can be queried using standard tools. Experimental results from the NER task are described, with emphasis on dealing with nested entities using a multi-word token method. The overall objective is to improve access by non-specialist users to a valuable cultural resource.
Original language: English
Title of host publication: Semantic Computing, 2007. ICSC 2007. International Conference on
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 589-596
Number of pages: 8
ISBN (Print): 978-0-7695-2997-4
Publication status: Published - 2007
