Abstract 
Information extraction from webpages, social networks, news, and user interactions crucially relies on inferring the hidden parameters of interaction between entities. For instance, in factorization models for movie recommendation we are interested in the underlying hidden properties of users and movies, respectively, so as to suggest new movies. Likewise, when extracting topics from webpages we want to find the hidden topics representing documents and words. Finally, when modeling user behavior it is worthwhile to find the latent factors, cluster variables, causes, etc. that drive a user's interaction with websites.
All these problems can be described in a coherent statistical framework. While much has been published about how to deal with these problems at moderate sizes, there is little information available on how to perform efficient, scalable estimation at the scale of the internet. In this tutorial we present both the theory and the algorithms for achieving these goals. In particular, we will describe inference algorithms for collaborative filtering, recommendation, latent Dirichlet allocation, and advanced clustering models.