|
Here is a summary of our feature set, which consists of three groups of
features. The first group is derived entirely from the physical structure
of the table. The second group looks at the content type of the cells in
the table, which is the simplest kind of content analysis. Finally, the
third group, the text content feature, is designed to exploit the textual
information within a table and treat it from a text classification angle.
Next, I explain each group of features in detail.
|