(Big) Usage Data in Web Search

Presenters: Ricardo Baeza-Yates, Yahoo! Labs, Spain
Yoelle Maarek, Yahoo! Labs, Israel

Web Search has evolved tremendously over the last 20 years. The field encountered its first revolution when it started to deal with huge numbers of Web pages. Then, a major step was accomplished when the structure of the Web graph was taken into consideration and link analysis methods were invented to improve both crawling and ranking. More recently, search engines started to monitor and mine the huge amounts of signals provided by users while searching, such as query strings, clicks, or even mouse cursor movements. In this tutorial, we focus on this last step of exploiting usage data at a large scale. We will first consider the various forms that big usage data takes, whether raw or augmented, with special attention to query/click logs. Then we will review the numerous key Web Search applications (some now retired) that usage data made possible. Finally, we will discuss its limitations, and more specifically three factors that often pull in opposite directions: the size of the data, personalization needs, and privacy concerns. We will conclude by offering some possible ways to circumvent some of these limitations through different types of aggregation. This half-day tutorial will be conducted in the form of an advanced graduate class (minus the assignments). Active participation from the audience is strongly encouraged.

Slides: http://www.baeza.cl/ftp/BigUsageDataInWebSearch-Tutorial-WWW2013-handout.pdf