This work was performed when the authors were employees of AT&T
Labs. Current contact information for both authors: Google, Inc., 2400
Bayshore Pkwy., Mountain View, CA 94043, USA,
{singhal, martink}@google.com
A Case Study in Web Search using TREC Algorithms
Amit Singhal
AT&T Labs -- Research
180 Park Avenue
Florham Park, NJ 07932, USA
Marcin Kaszkiel
AT&T Labs -- Research
180 Park Avenue
Florham Park, NJ 07932, USA
Copyright is held by the author/owner(s).
WWW10, May 1-5, 2001, Hong Kong.
ACM 1-58113-348-0/01/0005.
Abstract:
Web search engines rank potentially relevant pages/sites for a user
query. Ranking documents for user queries has also been at the heart
of the Text REtrieval Conference (TREC for short) under the label
ad-hoc retrieval. The TREC community has developed document
ranking algorithms that are known to be the best for searching the
document collections used in TREC, which consist mainly of
newswire text. However, the web search community has developed its own
methods to rank web pages/sites, many of which use link structure on
the web, and are quite different from the algorithms developed at
TREC. This study evaluates the performance of a state-of-the-art
keyword-based document ranking algorithm (coming out of TREC) on a
popular web search task: finding the web page/site of an entity, such
as a company, university, organization, or individual. This
form of querying is quite prevalent on the web. The results from the
TREC algorithms are compared to those from four commercial web search
engines. Results show that for finding the web page/site of an entity,
commercial web search engines are notably better than a
state-of-the-art TREC algorithm. These results are in sharp contrast
to results from several previous studies.
Keywords: Search engines, TREC ad-hoc, keyword-based
ranking, link-based ranking
Amit Singhal
2001-02-18