
Combining the HITS-based Algorithms with Relevance Scoring Methods

The hub and authority values are adjusted in the way proposed in [1]. If $s_i$ is the relevance score of a Web page $i$ and $h_i$ is its hub value, then $s_i \cdot h_i$ is used instead of $h_i$ to compute the authority values of the Web pages it points to. Similarly, if $a_i$ is its authority value, $s_i \cdot a_i$ is used instead of $a_i$ to compute the hub values of the Web pages that point to it.

We combine each of four relevance scoring methods (VSM, TLS, Okapi, and CDR) with a HITS-based algorithm. In VSM, cosine normalization is used to normalize the relevance scores to the range [0, 1]. For a popular topic, the relevance score of a single-term query, such as blues, will be 1 for almost all the Web pages under both VSM and TLS. In this case, there is almost no difference between the HITS-based algorithm and its combination with VSM or TLS. Next, we present the four relevance scoring methods and how they are applied to Web-based content analysis.
