Abstract
We describe work on large-scale automatic annotation of the full texts of books with social tags. Our task consisted of assigning tags to the full texts of works of fiction and evaluating them against tags assigned by humans. We compared Boosting and Relevance Model (RM) methods to explore how they differ, primarily in terms of scalability and also in annotation quality. We extended the set of 50 tags used in earlier work to sets ranging up to 10,000 tags. We show that an RM-based algorithm scales significantly better than a Boosting-based algorithm when dealing with large sets of tags.
Original language | English |
---|---|
Title of host publication | Proceedings of the Third International Conference on Weblogs and Social Media, ICWSM 2009, San Jose, California, USA, May 17-20, 2009 |
Publisher | The AAAI Press |
Pages | 210-213 |
Number of pages | 4 |
Publication status | Published - Mar 2009 |