Abstract
Standard test collections form the very basis of Information Retrieval (IR) research and evaluation, and important datasets have been created to promote empirical research and experimentation. In this paper, we describe our endeavour to create a test collection from old, archived writings of IR stalwarts. The documents are produced in text format from scanned and OCRed versions of the originals. The test collection consists of a set of documents in TREC format, along with a set of expert queries and their relevance assessments. Though small, this dataset should be of great interest to researchers and students of IR, since it contains valuable discourses on the discipline from its very inception. To the best of our knowledge, no standard IR dataset comprising old research articles has been built so far. Furthermore, no original, error-free digital text version exists for this dataset. Researchers must therefore run retrieval experiments on the erroneous collection with no scope for error modeling, which should invite new research ideas.
Original language | English |
---|---|
Title of host publication | Proceedings of the Forum for Information Retrieval Evaluation |
Editors | Prasenjit Majumder, Mandar Mitra, Sukomal Pal, Madhulika Agrawal, Parth Mehta |
Place of Publication | New York, NY, USA |
Publisher | Association for Computing Machinery (ACM) |
Pages | 121–125 |
Number of pages | 5 |
ISBN (Print) | 9781450337557 |
Publication status | Published - 5 Dec 2014 |
Event | 6th workshop of the Forum for Information Retrieval Evaluation, Bangalore, India; Duration: 5 Dec 2014 → 7 Dec 2014; Conference number: 6 |
Publication series
Name | FIRE '14 |
---|---|
Publisher | Association for Computing Machinery |
Workshop
Workshop | 6th workshop of the Forum for Information Retrieval Evaluation |
---|---|
Abbreviated title | FIRE 2014 |
Country/Territory | India |
City | Bangalore |
Period | 5/12/14 → 7/12/14 |
Keywords
- Test Collection
- Old Literature
- OCR Errors