Motivation
§Tables are common in HTML documents
§Table understanding is important
–Knowledge management
–Web mining
–Summarization and mobile access
§<table></table> tags cannot be trusted
§Table detection
–Is a given <table> element a genuine table?
(1.5)
First, I will briefly explain the motivation behind this work. Most web publications are written in HTML, and tables are commonly used in such documents to present relational data. Because tables are inherently information rich, table understanding has many important applications on the web. Although most tables in HTML documents are marked up with table tags, the converse is not true: not every table element represents a true relational table. In fact, most of the time the table tag is used simply to achieve a multi-column layout effect. The first step in table understanding on the web is therefore table detection. In this study, we concentrate on tables that are coded using the table tag, so by table detection we mean the process of determining whether a given table element represents a genuine table.