Learning the Species of Biomedical Named Entities from Annotated Corpora

Xinglong Wang, Claire Grover

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In biomedical articles, terms with the same surface form are often used to refer to different entities across a number of model organisms. In such cases, determining the species becomes crucial for term identification systems that ground terms to specific database identifiers. This paper describes a rule-based system that extracts “species indicating words”, such as human or murine, which can be used to decide the species of nearby entity terms, and a machine-learning species disambiguation system that was developed on manually species-annotated corpora. The performance of both systems was evaluated on gold-standard datasets, where the machine-learning system yielded better overall results.
Original language: English
Title of host publication: Proceedings of the International Conference on Language Resources and Evaluation, LREC 2008, 26 May - 1 June 2008, Marrakech, Morocco
Pages: 1808-1813
Number of pages: 6
Publication status: Published - 2008
