Domain-specific Web site identification: the CROSSMARC focused Web crawler

Konstantinos Stamatakis, Vangelis Karkaletsis, Georgios Paliouras, James Horlock, Claire Grover, James Curran, Shipra Dingare

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract / Description of output

This paper presents techniques for identifying domain-specific web sites, implemented as part of the EC-funded R&D project CROSSMARC. The project aims to develop technology for extracting interesting information from domain-specific web pages. It is therefore important for CROSSMARC to identify the web sites in which interesting domain-specific pages reside (focused web crawling). This is the role of the CROSSMARC web crawler.
Original language: English
Title of host publication: Proceedings of the 2nd International Workshop on Web Document Analysis (WDA2003)
Pages: 75-78
Number of pages: 4
Publication status: Published - 2003
