CS-TR-05-2: Universal Evaluation Method for Web Clustering Results

CS-TR-05-3

Authors: Daniel Crabtree, Xiaoying Gao, Peter Andreae
Source: GZipped PostScript (90kb); Adobe PDF (114kb)

Finding a set of web pages relevant to a user's information goal is difficult due to the enormous size of the internet. Search engines are able to find a set of pages that match the user's query, but refining the results of the search is still difficult and time-consuming. Web clustering addresses this problem by presenting the user with clusters of related pages as refinement options. Many clustering algorithms have been developed, and researchers need to be able to compare their effectiveness. The lack of a fair universal evaluation method has led to incomparable research and results. This paper identifies the requirements for evaluating the clusters produced by a web clustering algorithm and proposes a new method for fair universal evaluation of clusterings that meets those requirements. The paper also shows how the new method can evaluate clusterings with diverse characteristics that are not directly comparable by previous methods.

Keywords: web clustering, evaluation
