Abstract
This paper presents techniques for identifying domain-specific web sites, implemented as part of the EC-funded R&D project CROSSMARC. The project aims to develop technology for extracting interesting information from domain-specific web pages. It is therefore important for CROSSMARC to identify the web sites in which interesting domain-specific pages reside (focused web crawling). This is the role of the CROSSMARC web crawler.
Original language | English |
---|---|
Title of host publication | Proceedings of the 2nd International Workshop on Web Document Analysis (WDA2003) |
Pages | 75-78 |
Number of pages | 4 |
Publication status | Published - 2003 |