International World Wide Web Conference, 28th March - 1st April 2011, Hyderabad, India

Tutorial: Web-Based Open-Domain Information Extraction

Tutorial id	tr34
Tutorial name	Web-Based Open-Domain Information Extraction
Presenters	Marius Paşca, Google Inc., 1600 Amphitheatre Parkway, Mountain View, California 94043, mars@google.com

Abstract: This tutorial provides an overview of extraction methods developed in the area of Web-based open-domain information extraction, whose purpose is the acquisition of open-domain classes, instances and relations from Web text. The extraction methods operate over unstructured or semi-structured text. They take advantage of weak supervision provided in the form of seed examples or small amounts of annotated data, or draw upon knowledge already encoded within resources created strictly by experts or collaboratively by users. The tutorial teaches the audience about existing resources that include instances and relations; details of methods for extracting such data from structured and semi-structured text available on the Web; and the strengths and limitations of resources extracted from text, as reported in recent literature, with applications in knowledge discovery and information retrieval.
