Abstract / Description of output
Place name mentions in text may have more than one potential referent (e.g. Peru, the country vs. Peru, the city in Indiana). The Edinburgh Language Technology Group (LTG) has developed the Edinburgh Geoparser, a system that can automatically recognise place name mentions in text and disambiguate them with respect to a gazetteer. The recognition step is required to identify location mentions in a given piece of text. The subsequent disambiguation step, generally referred to as georesolution, grounds location mentions to their corresponding gazetteer entries with latitude and longitude values, for example, to visualise them on a map. Geoparsing is not only useful for mapping purposes but also for making document collections more accessible as it can provide additional metadata about the geographical content of documents. Combined with other information mined from text such as person names and date expressions, complex relations between such pieces of information can be identified. The Edinburgh Geoparser can be used with several gazetteers including Unlock and GeoNames to process a variety of input texts. The original version of the Geoparser was a demonstrator configured for modern text. Since then, it has been adapted to georeference historic and ancient text collections as well as modern-day newspaper text. Currently, the LTG is involved in three research projects applying the Geoparser to historical text collections of very different types and for a variety of end-user applications. This paper discusses the ways in which we have customised the Geoparser for specific datasets and applications relevant to each project.
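The two steps described above, recognition followed by georesolution, can be illustrated with a minimal sketch. This is not the Edinburgh Geoparser's implementation or API: the toy gazetteer, the `Entry` type, the function names, and the "prefer the most populous candidate" heuristic are all assumptions made for illustration, and the coordinates and population figures are rough values only.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One hypothetical gazetteer record (not a real Unlock or GeoNames schema)."""
    name: str
    description: str
    lat: float
    lon: float
    population: int


# Toy gazetteer: each name maps to every candidate referent.
# All values are approximate and purely illustrative.
GAZETTEER = {
    "Peru": [
        Entry("Peru", "country in South America", -9.19, -75.02, 30_000_000),
        Entry("Peru", "city in Indiana, USA", 40.75, -86.07, 11_000),
    ],
    "Edinburgh": [
        Entry("Edinburgh", "city in Scotland, UK", 55.95, -3.19, 500_000),
    ],
}


def recognise(text):
    """Recognition step: return (name, character offset) for each known place name."""
    pattern = r"\b(" + "|".join(map(re.escape, GAZETTEER)) + r")\b"
    return [(m.group(1), m.start()) for m in re.finditer(pattern, text)]


def resolve(name):
    """Georesolution step: choose one candidate entry for the name.

    Assumed heuristic: pick the most populous candidate. A real system
    would also weigh the context of the surrounding document.
    """
    return max(GAZETTEER[name], key=lambda entry: entry.population)


def geoparse(text):
    """Run both steps and return (name, offset, resolved entry) triples."""
    return [(name, offset, resolve(name)) for name, offset in recognise(text)]


if __name__ == "__main__":
    for name, offset, entry in geoparse("She flew from Edinburgh to Peru."):
        print(name, offset, entry.description, entry.lat, entry.lon)
```

The population heuristic alone would ground "Peru" to the country even in a text about Indiana, which is why disambiguation generally needs contextual evidence as well as gazetteer attributes.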
Journal: International Journal of Humanities and Arts Computing
Published: 1 Mar 2015