Abstract
This tutorial provides an overview of extraction methods developed in the area of Web-based open-domain information
extraction, whose purpose is the acquisition of open-domain
classes, instances and relations from Web text. The extraction methods operate over unstructured or semi-structured
text. They take advantage of weak supervision provided
in the form of seed examples or small amounts of annotated data, or draw upon knowledge already encoded within
resources created strictly by experts or collaboratively by
users. The tutorial teaches the audience about existing
resources that include instances and relations; details of
methods for extracting such data from structured and semi-structured text available on the Web; and strengths and
limitations of the resources that recent literature has extracted from text,
with applications in knowledge discovery and information retrieval.