With massive book digitization efforts underway, effective retrieval of books and of pages within books has become an important problem. This paper describes our submissions to the INEX 2007 Book Search track. We explored book-specific features, such as table-of-contents pages, index pages, and headers, alongside non-book-specific features. Our results show that indexing the entire contents of books and headers was the most effective retrieval strategy.
|Title of host publication||Focused Access to XML Documents|
|Subtitle of host publication||6th International Workshop of the Initiative for the Evaluation of XML Retrieval, INEX 2007, Dagstuhl Castle, Germany, December 17-19, 2007. Selected Papers|
|Publisher||Springer Berlin Heidelberg|
|Number of pages||8|
|Publication status||Published - 2007|
|Name||Lecture Notes in Computer Science (LNCS)|
|Publisher||Springer Berlin Heidelberg|