Accepted Tutorials |www2014


Online Learning and Linked Data – Lessons Learned and Best Practices: Organizers: John Domingue, Stefan Dietze and Alexander Mikroyannidis
MOOCs (Massive Open Online Courses) offer large numbers of students the opportunity to study high quality courses with prestigious universities. These initiatives have led to widespread publicity and also strategic dialogue in the higher education sector. Linked Data has established itself as the de facto means for the publication of structured data over the Web, enjoying amazing growth in terms of the number of organizations committing to use its core principles for exposing and interlinking Big Data for seamless exchange, integration, and reuse. Following these two main trends connecting online learning and Linked Data, the scope of this tutorial is two-fold:
• New online learning methods will be taught for supporting the teaching of Linked Data. Additionally, the lessons learned and the best practices derived from designing and delivering a Linked Data curriculum by the EUCLID project will be discussed.
• Ways in which Linked Data principles and technologies can be used to support online learning and create innovative educational services will be explained, based on experience gained in developing existing Linked Data applications for online learning. In particular, we will rely on the data catalogue, use cases and applications considered by the LinkedUp project.
Entity Resolution in the Web of Data: Organizers: Kostas Stefanidis, Vasilis Efthymiou, Melanie Herschel and Vassilis Christophides
This tutorial provides a comprehensive and cohesive overview of the key research results in the area of entity resolution. We are interested in frameworks addressing the new challenges in entity resolution posed by the Web of data, in which real-world entities are described by interlinked data rather than documents. Since such descriptions are usually partial, overlapping and sometimes evolving, entity resolution emerges as a central problem both for increasing dataset linking and for searching the Web of data for entities and their relations. Although it has been extensively studied for tabular data in various application contexts, such as merging customer databases and library catalogs, only recently have researchers tackled entity resolution for tree (e.g. XML) and graph (e.g. RDF) data. Special emphasis will be given to MapReduce techniques for entity resolution, since they show good potential for coping with the huge scale of the Web of data.
Towards a Social Media Analytics Platform: Event Detection and Description for Microblogs: Organizers: Manish Gupta and Kevin Chang
Microblog data differs significantly from traditional text data along a variety of dimensions. Microblog data consists of short documents, uses SMS-style language, and is full of code mixing. Though a lot of it is mere social babble, it also contains fresh news coming from human sensors at an enormous rate. Given such interesting characteristics, the World Wide Web community has recently witnessed a large number of research tasks for microblogging platforms. Event detection on Twitter is one of the most popular such tasks, with a large number of applications. The proposed tutorial on event detection from microblogs will contain two roughly equal parts. In the first part, we will discuss research efforts towards detecting events from Twitter using both tweet content and external sources. We will also discuss various applications for which event detection mechanisms have been put to use. Merely detecting events is not enough: applications require that the detector also provide a good description of the event. In the second part, we will focus on describing events using the best phrase, event type and event timespan, with a special focus on user and event location prediction. We will conclude with a summary and thoughts on future directions.
Trust in Social Computing: Organizers: Jiliang Tang and Huan Liu
Social media greatly enables people to participate in online activities and shatters the barrier for online users to create and share information in any place at any time. However, the explosion of user-generated content poses novel challenges for online users to find relevant information, or, in other words, exacerbates the information overload problem. On the other hand, the quality of user-generated content can vary dramatically from excellence to abuse or spam, resulting in a problem of information credibility. The study and understanding of trust can lead to an effective approach to addressing both information overload and credibility problems.
Trust refers to a relationship between a trustor (the subject that trusts a target entity) and a trustee (the entity that is trusted) [1, 2]. In the context of social media, trust provides evidence about whom we can trust to share information with and from whom we can accept information without additional verification. With trust, we take a mental shortcut by seeking information directly from trustees or trusted entities, which serves a two-fold purpose: we avoid being overwhelmed by excessive information (i.e., information overload is mitigated) and we obtain credible information because of the trust placed in the information provider (i.e., information credibility is increased). Therefore, trust is crucial in helping social media users collect relevant and reliable information, and trust in social media is a research topic of increasing importance and practical significance.
This tutorial takes a computational perspective to offer an overview of the characteristics and elements of trust and to illuminate a wide range of computational tasks involving trust. It introduces basic concepts, discusses challenges and opportunities, reviews state-of-the-art algorithms, and elaborates on effective evaluation methods in the study of trust. In particular, we illustrate properties and representation models of trust, elucidate trust measurements with representative algorithms, and demonstrate real-world applications where trust is explicitly used. As a new dimension of the study of trust, we discuss the concept of distrust and its roles in trust measurements and applications. Finally, we will conclude the tutorial with a discussion of open issues and challenges concerning trust in social media.
Re-using Media On The Web: Organizers: Lyndon Nixon, Vasileios Mezaris and Raphael Troncy
This tutorial will address the state of the art in online media analysis, annotation and linking. A number of Web-based specifications and technologies are now emerging that, in combination, can provide the technical solution that enables media owners to manage and re-use their online media at a fragment level.
The combination of these specifications and technologies can form a full online media workflow able to support media fragmentation and re-use, which opens ways for media owners to derive new value from media and new models of media acquisition and use for media consumers. Hence, awareness of and the ability to use these specifications and technologies will be of great importance to future curators and publishers of online media.
The Mobile Semantic Web: Organizers: Shonali Krishnaswamy and Yuan-Fang Li
The combination of the versatility of smart devices and the capabilities of semantic technologies forms a great foundation for a mobile Semantic Web that will contribute to further realising the true potential of both disciplines. Motivated by a service discovery and matchmaking example, this tutorial provides an overview of background knowledge in ontology languages and basic reasoning problems, and how they are applicable in the mobile environment. It then presents challenges and state-of-the-art developments in mobile ontology reasoning, focusing on the reasoning and optimisation techniques developed in the mTableaux framework. Finally, the tutorial closes with an outlook on important research problems.
Concept-Level Sentiment Analysis: Organizers: Erik Cambria
The half-day tutorial will provide means to efficiently design or handle existing models, techniques, tools, and services for concept-level sentiment analysis and their commercial realizations. The tutorial will also include insights resulting from the two recent IEEE Intelligent Systems special issues on Concept-Level Opinion and Sentiment Analysis and a hands-on session to illustrate how to build a concept-level opinion-mining engine step-by-step, from semantic parsing to concept-level reasoning.
Social Recommender Systems: Organizers: Ido Guy
Social Recommender Systems (SRS) aim to alleviate information and interaction overload for social media users by presenting the most attractive and relevant information. In this tutorial, we will discuss the key motivations for social media sites to apply recommendation techniques and review the fundamental recommendation approaches. We will then present the key domains for social recommender systems, in terms of both the recommended entities and the target audience. Next, we will review methods for handling the cold start problem, incorporating trust and reputation, and providing useful explanations for social recommender systems. We will then discuss several temporal aspects that affect social recommender systems, followed by a review of evaluation techniques, with emphasis on the pros and cons of each method. Finally, we will conclude with a discussion of open issues and challenges in the field.
Social Spam, Campaigns, Misinformation and Crowdturfing: Organizers: Kyumin Lee, James Caverlee and Calton Pu
The past few years have seen the rapid rise of many successful social systems, from Web-based social networks (e.g., Facebook, LinkedIn) to online social media sites (e.g., Twitter, YouTube) to large-scale information sharing communities (e.g., reddit, Yahoo! Answers) to crowd-based funding services (e.g., Kickstarter, IndieGoGo) to Web-scale crowdsourcing systems (e.g., Amazon MTurk, Crowdflower). However, with this success has come a commensurate wave of new threats, including bot-controlled accounts in social media systems for disseminating malware and commercial spam messages, adversarial propaganda campaigns designed to sway public opinion, collective attention spam targeting popular topics and memes, and the propagation of manipulated content. This tutorial will introduce peer-reviewed research work on information quality in social systems. Specifically, we will address new threats such as social spam, campaigns, misinformation and crowdturfing, and give an overview of modern techniques to improve information quality by revealing and detecting malicious participants (e.g., social spammers, content polluters and crowdturfers) and low-quality content. In addition, this tutorial will give an overview of tools to detect these participants.
Scalability and Efficiency Challenges in Large-Scale Web Search Engines: Organizers: B. Barla Cambazoglu and Ricardo Baeza-Yates
Commercial web search engines need to process thousands of queries every second and provide responses to user queries within a few hundred milliseconds. As a consequence of these tight performance constraints, search engines construct and maintain very large compute infrastructures for crawling the Web, indexing discovered pages, and processing user queries. The scalability and efficiency of these infrastructures require careful performance optimizations in every major component of the search engine. This tutorial aims to provide a fairly comprehensive overview of the scalability and efficiency challenges in large-scale web search engines.
In particular, the tutorial provides an in-depth architectural overview of a web search engine, mainly focusing on the web crawling, indexing, and query processing components. The scalability and efficiency issues encountered in these components are presented at four different granularities: at the level of a single computer, a cluster of computers, a single data center, and a multi-center search engine. The tutorial also points out open research problems and provides recommendations to researchers who are new to the field. The tutorial targets both beginner and novice audiences.
E-commerce Product Search: Personalization, Diversification, and beyond: Organizers: Atish Das Sarma, Nish Parikh and Neel Sundaresan
The focus of this tutorial will be e-commerce product search. Several challenges appear in this context, from both a research standpoint and an application standpoint. We will present various approaches adopted in the industry, review well-known research techniques developed over the last decade, draw parallels to traditional web search while highlighting the new challenges in this setting, and dig deep into some of the algorithmic and technical approaches developed for this context. A specific topic that will involve a deep dive into the literature, theoretical techniques, and practical impact is that of quickly identifying the most suitable results from a large database, in settings ranging from cold-start users to those for whom personalization is possible. In this context, top-k and skylines will be discussed specifically, as they form a key approach that spans the web, data mining, and database communities and presents a powerful tool for search across multi-dimensional items with clear preferences within each attribute, as in product search as opposed to regular web search.
Computational Models for Social Influence Analysis: Organizers: Jie Tang and Jimeng Sun
Social influence is the behavioral change of a person because of the perceived relationship with other people, organizations and society in general. Social influence has been a widely accepted phenomenon in social networks for decades. Many applications have been built around the implicit notion of social influence between people, such as marketing, advertisement and recommendations. With the exponential growth of online social network services such as Facebook and Twitter, social influence can for the first time be measured over a large population. In this tutorial, we will first introduce the background knowledge for social network analysis, including methodologies and tools for macro-level, meso-level and micro-level social network analysis. Then we will focus on computational models for social influence analysis. Specifically, we will introduce 1) how to verify the existence of social influence in various social networks; 2) how to design computational models for quantifying social influence in large online networks; 3) how to model the cascading process of influence propagation. Finally, we will use several concrete examples (such as opinion leader finding and viral marketing) to demonstrate how social influence can help in real applications.
Learning to Efficiently Rank on Big Data: Organizers: Lidan Wang, Jimmy Lin, Donald Metzler and Jiawei Han
Ranking in response to user queries is a central problem in information retrieval, data mining, and machine learning. In the era of “Big data”, traditional effectiveness-centric ranking techniques tend to get more and more costly (requiring additional hardware/machines) to sustain reasonable ranking speed on large data. The mentality of combating big data by throwing in more hardware/machines will quickly become infeasible and expensive, since data is growing at an extremely fast rate, oblivious to any cost concerns of ours. “Learning to efficiently rank” offers a cost-effective solution to ranking on large data (e.g., billions of documents). That is, it addresses a critically important question: is it possible to improve ranking effectiveness on large data without incurring (too much) additional cost? This tutorial presents an organized picture of this important topic in the era of “Big data”, ranging from its motivations and background to three key challenges (model, metric, and learning) and systematic solutions to address them. The framework leads to drastically new models and approaches for cost-effective ranking at scale, which will be discussed in depth in the tutorial. Furthermore, we will present new directions and work on this topic.
A W3C tutorial: HTML5 Apps: Chair: Michel Buffa
“HTML5 simply rocks!” Web content authors will learn how to enhance the user experience of existing Web sites by incrementally using some of the new HTML5 features presented in this W3C tutorial.
A W3C tutorial: Semantic Web and Linked Data: Chair: Fabien Gandon
“He who controls metadata, controls the Web!” The participants will be guided through the W3C semantic Web stack and its extensions with both an historical perspective and an explanation based on the Web architecture core concepts.