Fifth International World Wide Web Conference
Harvesting the Web
- Harvest: Distributed Gatherer and Broker Architecture
- Gatherer:
  - Retrieves resources via URLs (HTTP, FTP, Gopher, etc.)
  - Translates between file formats (HTML, TeX, ASCII, etc.)
  - Generates indexing data for Broker
- Broker:
  - Retrieves indexing data from Gatherer
  - Provides search interface for end-users (or other Brokers)
- Growing adoption (e.g., the comp.infosystems.harvest newsgroup, Netscape Catalog Server)
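
The Gatherer/Broker split outlined above can be sketched in a few lines of Python. This is only an illustration of the division of labor, not Harvest's actual code or record format: the class names, the `summaries()` and `collect()` methods, the dictionary-based record, and the example.org URLs are all assumptions made for the example. Resource retrieval is replaced by an in-memory table so the sketch runs without network access.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def to_plain_text(content, fmt):
    """Translate a resource to plain text (only 'html' is converted here)."""
    if fmt == "html":
        parser = _TextExtractor()
        parser.feed(content)
        parser.close()
        return " ".join(parser.chunks)
    return content


class Gatherer:
    """Hypothetical gatherer: turns resources into indexing records."""

    def __init__(self, resources):
        # {url: (format, content)}; a stand-in for fetching via HTTP, FTP, etc.
        self.resources = resources

    def summaries(self):
        for url, (fmt, content) in self.resources.items():
            text = to_plain_text(content, fmt)
            keywords = sorted(set(re.findall(r"[a-z0-9]+", text.lower())))
            yield {"url": url, "keywords": keywords}


class Broker:
    """Hypothetical broker: collects records and answers keyword queries."""

    def __init__(self):
        self.index = defaultdict(set)  # keyword -> set of URLs

    def collect(self, source):
        # `source` may be a Gatherer or another Broker; both offer summaries().
        for record in source.summaries():
            for word in record["keywords"]:
                self.index[word].add(record["url"])

    def summaries(self):
        # Re-export the indexed records so another Broker can collect them.
        by_url = defaultdict(set)
        for word, urls in self.index.items():
            for url in urls:
                by_url[url].add(word)
        for url, words in by_url.items():
            yield {"url": url, "keywords": sorted(words)}

    def search(self, query):
        # Return the URLs whose records contain every word of the query.
        words = re.findall(r"[a-z0-9]+", query.lower())
        if not words:
            return []
        hits = set.intersection(*(self.index.get(w, set()) for w in words))
        return sorted(hits)


gatherer = Gatherer({
    "http://example.org/a.html": ("html", "<h1>Harvest</h1><p>Distributed indexing</p>"),
    "ftp://example.org/b.txt": ("ascii", "Gopher and FTP indexing notes"),
})
broker = Broker()
broker.collect(gatherer)
print(broker.search("harvest indexing"))  # ['http://example.org/a.html']
```

Because the Broker in this sketch exposes the same `summaries()` method as the Gatherer, a second Broker can collect from the first, which mirrors the "or other Brokers" case in the outline.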
Internet Indexing
(Darren Hardy)