Now that we have
the features, we need to decide on the classifiers to use. We picked two
popular classifiers to experiment with. The decision tree was selected because,
as noted earlier, our feature set is highly heterogeneous: some features are
integers, others are floating-point numbers. … Decision trees handle such mixed
feature types quite well. Second, decision trees have been widely used … and
have proved to perform very well in that context. We also wanted to
experiment with Support Vector Machines because they have a strong
theoretical foundation based on structural risk minimization and were shown to
be the best performer in a recent text categorization test.
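To make the comparison concrete, the following is a minimal sketch of how the two classifiers could be trained on a feature set that mixes integer-valued and floating-point features. It assumes scikit-learn, and the feature names and toy data are hypothetical placeholders rather than the features used in this work.

```python
# Sketch: decision tree vs. SVM on a heterogeneous (mixed-type) feature set.
# scikit-learn is assumed for illustration; features and labels are synthetic.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 200

# Mixed-type features: an integer count and a floating-point ratio.
word_count = rng.integers(10, 500, size=n)        # integer-valued feature
avg_word_length = rng.uniform(3.0, 8.0, size=n)   # floating-point feature
X = np.column_stack([word_count, avg_word_length]).astype(float)
y = (word_count > 250).astype(int)                # toy binary labels

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Decision tree: splits on raw values, so mixed scales need no preprocessing.
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# SVM: scale features first, since SVMs are sensitive to feature magnitudes.
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X_train, y_train)

print("decision tree accuracy:", tree.score(X_test, y_test))
print("SVM accuracy:          ", svm.score(X_test, y_test))
```

The sketch also highlights a practical difference between the two choices: the tree consumes the raw mixed-type features directly, while the SVM benefits from a scaling step in its pipeline.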