– Continuous score for incorporation with other features
(1)
Since most tables contain a lot of text, we decided to explore the possibility
of deriving a feature by treating it as a text classification problem. Text
classification is a well-studied problem in information retrieval (IR), and
many algorithms have been proposed. However, our particular application has
many special characteristics.
…. Finally,
since this is only one of many features, we need a continuous score, as
opposed to a binary decision, in order to incorporate it with the other
features. Considering all of these, we designed a feature based on the vector
space model, which we call the word group feature.
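
To illustrate why a vector space model naturally yields a continuous score rather than a binary decision, here is a minimal sketch. It is not the word group feature itself, whose definition is not given in this passage; it only shows the general idea of scoring a table's text by cosine similarity against a reference vector. The function names, the whitespace tokenization, and the raw term-frequency weighting are all assumptions made for illustration.

```python
import math
from collections import Counter


def term_vector(text):
    """Map text to a bag-of-words term-frequency vector (hypothetical helper)."""
    # Assumption: lowercase, whitespace-split tokens; no stemming or stop words.
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector has no direction; treat as no similarity
    return dot / (norm_a * norm_b)


def continuous_score(table_text, reference_texts):
    """Score a table's text against the summed vector of reference texts.

    Assumption: the reference texts stand in for one class of tables.
    The result is a real number (0 for no shared terms, about 1 for identical
    term distributions), not a yes/no decision.
    """
    reference = Counter()
    for text in reference_texts:
        reference.update(term_vector(text))
    return cosine_similarity(term_vector(table_text), reference)
```

Because the result is a real number rather than a class label, it can be passed along as one input among several to a downstream classifier, which is the property the passage asks for.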