Proceedings of the Sixteenth International World Wide Web Conference
(WWW2007)
May 8-12, 2007
Banff, Alberta, CANADA
PAPERS
Track: Browsers and User Interfaces
Session: Personalization
Homepage Live: Automatic Block Tracing for Web Personalization 1
J. Han, D. Han (Shanghai Jiao-Tong University),
C. Lin, H.-J. Zeng, Z. Chen (Microsoft Research Asia),
Y. Yu (Shanghai Jiao-Tong University)
Open User Profiles for Adaptive News Systems: Help or Harm? 11
J.-w. Ahn, P. Brusilovsky, J. Grady, D. He, S. Y. Syn (University of Pittsburgh)
Investigating Behavioral Variability in Web Search 21
R. W. White (Microsoft Research), S. M. Drucker (Microsoft Live Laboratories)
Session: Smarter Browsing
CSurf: A Context-Driven Non-Visual Web-Browser 31
J. Mahmud, Y. Borodin, I. V. Ramakrishnan (Stony Brook University)
GeoTracker: Geospatial and Temporal RSS Navigation 41
Y.-F. Chen, G. Di Fabbrizio, D. Gibbon, R. Jana, S. Jora, B. Renger, B. Wei (AT&T Laboratories – Research)
Learning Information Intent via Observation 51
A. Tomasic, I. Simmons, J. Zimmerman (Carnegie Mellon University)
Track: Data Mining
Session: Identifying Structure in Web Pages
Page-level Template Detection via Isotonic Smoothing 61
D. Chakrabarti, R. Kumar (Yahoo! Research),
K. Punera (University of Texas at Austin)
Towards Domain-Independent Information Extraction from Web Tables 71
W. Gatterbauer, P. Bohunsky, M. Herzog, B. Krüpl, B. Pollak (Vienna University of Technology)
Web Object Retrieval 81
Z. Nie, Y. Ma, S. Shi, J.-R. Wen, W.-Y. Ma (Microsoft Research Asia)
Session: Mining Textual Data
Summarizing Email Conversations with Clue Words 91
G. Carenini, R. T. Ng, X. Zhou (University of British Columbia)
Organizing and Searching the World Wide Web of Facts — Step Two: Harnessing the Wisdom of the Crowds 101
M. Paşca (Google Inc.)
Do Not Crawl in the DUST: Different URLs with Similar Text 111
Z. Bar-Yossef (Technion and Google Haifa Engineering Center),
I. Keidar (Technion),
U. Schonfeld (University of California at Los Angeles)
Session: Similarity Search
A New Suffix Tree Similarity Measure for Document Clustering 121
H. Chim, X. Deng (City University of Hong Kong)
Scaling Up All Pairs Similarity Search 131
R. J. Bayardo (Google, Inc.), Y. Ma (University of California at Irvine), R. Srikant (Google, Inc.)
Detecting Near-Duplicates for Web Crawling 141
G. S. Manku, A. Jain (Google Inc.), A. D. Sarma (Stanford University)
Session: Predictive Modeling of Web Users
Demographic Prediction Based on User's Browsing Behavior 151
J. Hu, H.-J. Zeng, H. Li, C. Niu, Z. Chen (Microsoft Research Asia)
Why We Search: Visualizing and Predicting User Behavior 161
E. Adar, D. S. Weld, B. N. Bershad, S. D. Gribble (University of Washington)
Topic Sentiment Mixture: Modeling Facets and Opinions in Weblogs 171
Q. Mei, X. Ling, M. Wondra (University of Illinois at Urbana-Champaign),
H. Su (Vanderbilt University),
C. Zhai (University of Illinois at Urbana-Champaign)
Session: Mining in Social Networks
Wherefore Art Thou R3579X? Anonymized Social Networks, Hidden Patterns, and Structural Steganography 181
L. Backstrom (Cornell University),
C. Dwork (Microsoft Research),
J. Kleinberg (Cornell University)
Information Flow Modeling based on Diffusion Rate for Prediction and Ranking 191
X. Song, Y. Chi, K. Hino, B. L. Tseng (NEC Laboratories America)
NetProbe: A Fast & Scalable System for Fraud Detection in Online Auction Networks 201
S. Pandit, D. H. Chau, S. Wang, C. Faloutsos (Carnegie Mellon University)
Track: E* Applications
Session: E-Communities
The Complex Dynamics of Collaborative Tagging 211
H. Halpin (University of Edinburgh),
V. Robu (CWI, Center for Mathematics and Computer Science),
H. Shepherd (Princeton University)
Expertise Networks in Online Communities: Structure and Algorithms 221
J. Zhang, M. S. Ackerman, L. Adamic (University of Michigan)
Internet-Scale Collection of Human-Reviewed Data 231
Q. Su, D. Pavlov, J.-H. Chow, W. C. Baker (Yahoo! Inc.)
Session: E-Commerce and E-Content
DETECTIVES: DETEcting Coalition hiT Inflation attacks in adVertising nEtworks Streams 241
A. Metwally, D. Agrawal, A. El Abbadi (University of California at Santa Barbara)
Extraction and Search of Chemical Formulae in Text Documents on the Web 251
B. Sun, Q. Tan, P. Mitra, C. L. Giles (The Pennsylvania State University)
A Content-Driven Reputation System for the Wikipedia 261
B. T. Adler, L. de Alfaro (University of California at Santa Cruz)
Track: Industrial Practice & Experience
Google News Personalization: Scalable Online Collaborative Filtering 271
A. Das, M. Datar, A. Garg (Google Inc.),
S. Rajaram (University of Illinois at Urbana-Champaign)
Exploring in the Weblog Space by Detecting Informative and Affective Articles 281
X. Ni, G.-R. Xue, X. Ling, Y. Yu (Shanghai Jiao-Tong University),
Q. Yang (Hong Kong University of Science & Technology)
Spam Double-Funnel: Connecting Web Spammers with Advertisers 291
Y.-M. Wang, M. Ma (Microsoft Research),
Y. Niu, H. Chen (University of California at Davis)
Track: Performance and Scalability
Session: Scalable Systems for Dynamic Content
GlobeTP: Template-Based Database Replication for Scalable Web Applications 301
T. Groothuyse, S. Sivasubramanian, G. Pierre (Vrije Universiteit)
Consistency-preserving Caching of Dynamic Database Content 311
N. Tolia, M. Satyanarayanan (Carnegie Mellon University)
Optimized Query Planning of Continuous Aggregation Queries in Dynamic Data Dissemination Networks 321
R. Gupta (IBM India Research Laboratory),
K. Ramamritham (Indian Institute of Technology)
Session: Performance Engineering of Web Applications
A Scalable Application Placement Controller for Enterprise Data Centers 331
C. Tang, M. Steinder, M. Spreitzer, G. Pacifici (IBM T.J. Watson Research Center)
A Unified Platform for Data Driven Web Applications with Automatic Client-Server Partitioning 341
F. Yang, N. Gupta, N. Gerner, X. Qi, A. Demers, J. Gehrke (Cornell University),
J. Shanmugasundaram (Yahoo!)
MyXDNS: A Request Routing DNS Server with Decoupled Server Selection 351
H. A. Alzoubi, M. Rabinovich (Case Western Reserve University),
O. Spatscheck (AT&T Research Laboratories)
Track: Pervasive Web and Mobility
Robust Web Page Segmentation for Mobile Terminal Using Content-Distances and Page Layout Information 361
G. Hattori, K. Hoashi, K. Matsumoto, F. Sugaya (KDDI R&D Laboratories)
PRIVÉ: Anonymous Location-Based Queries in Distributed Mobile Systems 371
G. Ghinita, P. Kalnis (National University of Singapore),
S. Skiadopoulos (University of Peloponnese)
A Mobile Application Framework for the Geospatial Web 381
R. Simon, P. Fröhlich (Telecommunications Research Center Vienna)
Track: Search
Session: Search Potpourri
Navigation-Aided Retrieval 391
S. Pandit (Carnegie Mellon University),
C. Olston (Yahoo! Research)
Efficient Search Engine Measurements 401
Z. Bar-Yossef, M. Gurevich (Technion - Israel Institute of Technology)
Efficient Search in Large Textual Collections with Redundancy 411
J. Zhang, T. Suel (Polytechnic University)
Session: Crawlers
The Discoverability of the Web 421
A. Dasgupta, A. Ghosh, R. Kumar, C. Olston, S. Pandey, A. Tomkins (Yahoo! Research)
Combining Classifiers to Identify Online Databases 431
L. Barbosa, J. Freire (University of Utah)
An Adaptive Crawler for Locating Hidden-Web Entry Points 441
L. Barbosa, J. Freire (University of Utah)
Session: Web Graphs
Random Web Crawls 451
T. Bennouas (Criteo R&D),
F. de Montgolfier (LIAFA - Université Paris 7)
Extraction and Classification of Dense Communities in the Web 461
Y. Dourisboure, F. Geraci, M. Pellegrini (Istituto di Informatica e Telematica)
Web Projections: Learning from Contextual Subgraphs of the Web 471
J. Leskovec (Carnegie Mellon University),
S. Dumais, E. Horvitz (Microsoft Research)
Session: Search Quality and Precision
Supervised Rank Aggregation 481
Y.-T. Liu (Microsoft Research Asia & Beijing Jiaotong University),
T.-Y. Liu (Microsoft Research Asia),
T. Qin (Microsoft Research Asia & Tsinghua University),
Z.-M. Ma (Chinese Academy of Sciences),
H. Li (Microsoft Research Asia)
Navigating the Intranet with High Precision 491
H. Zhu (IBM Almaden Research Center),
A. Löser (SAP Research CEC Dresden),
S. Raghavan, S. Vaithyanathan (IBM Almaden Research Center)
Optimizing Web Search Using Social Annotations 501
S. Bao, X. Wu (Shanghai JiaoTong University),
B. Fei (IBM China Research Laboratory),
G. Xue (Shanghai JiaoTong University),
Z. Su (IBM China Research Laboratory),
Y. Yu (Shanghai JiaoTong University)
Session: Advertisements & Click Estimates
Robust Methodologies for Modeling Web Click Distributions 511
K. Ali, M. Scarr (Yahoo!)
Predicting Clicks: Estimating the Click-Through Rate for New Ads 521
M. Richardson (Microsoft Research),
E. Dominowska (Microsoft),
R. Ragno (Microsoft Research)
Dynamics of Bid Optimization in Online Advertisement Auctions 531
C. Borgs, J. Chayes (Microsoft Research),
O. Etesami (University of California at Berkeley),
N. Immorlica, K. A. Jain (Microsoft Research),
M. Mahdian (Yahoo! Research)
Session: Knowledge Discovery
Compare&Contrast: Using the Web to Discover Comparable Cases for News Stories 541
J. Liu, E. Wagner, L. Birnbaum (Northwestern University)
Answering Bounded Continuous Search Queries in the World Wide Web 551
D. Kukulenz (Institute of Information Systems),
A. Ntoulas (Microsoft Search Laboratories)
Answering Relationship Queries on the Web 561
G. Luo, C. Tan, Y.-l. Tian (IBM T.J. Watson Research Center)
Session: Personalization
Dynamic Personalized Pagerank in Entity-Relation Graphs 571
S. Chakrabarti (IIT Bombay)
A Large-scale Evaluation and Analysis of Personalized Search Strategies 581
Z. Dou (Nankai University),
R. Song, J.-R. Wen (Microsoft Research Asia)
Privacy-Enhancing Personalized Web Search 591
Y. Xu (Simon Fraser University),
B. Zhang, Z. Chen (Microsoft Research Asia),
K. Wang (Simon Fraser University)
Track: Security, Privacy, Reliability, & Ethics
Session: Defending Against Emerging Threats
Defeating Script Injection Attacks with Browser-Enforced Embedded Policies 601
T. Jim (AT&T Laboratories – Research),
N. Swamy, M. Hicks (University of Maryland)
Subspace: Secure Cross-Domain Communication for Web Mashups 611
C. Jackson (Stanford University),
H. J. Wang (Microsoft Research)
Exposing Private Information by Timing Web Applications 621
A. Bortz, D. Boneh (Stanford University), P. Nandy
On Anonymizing Query Logs via Token-based Hashing 629
R. Kumar, J. Novak, B. Pang, A. Tomkins (Yahoo! Research)
Session: Passwords and Phishing
CANTINA: A Content-Based Approach to Detecting Phishing Web Sites 639
Y. Zhang (University of Pittsburgh),
J. Hong, L. Cranor (Carnegie Mellon University)
Learning to Detect Phishing Emails 649
I. Fette, N. Sadeh, A. Tomasic (Carnegie Mellon University)
A Large-Scale Study of Web Password Habits 657
D. Florêncio, C. Herley (Microsoft Research)
Session: Access Control and Trust on the Web
A Fault Model and Mutation Testing of Access Control Policies 667
E. Martin, T. Xie (North Carolina State University)
Analyzing Web Access Control Policies 677
V. Kolovski, J. Hendler (University of Maryland),
B. Parsia (University of Manchester)
Compiling Cryptographic Protocols for Deployment on the Web 687
J. McCarthy (Brown University),
J. D. Guttman, J. D. Ramsdell (MITRE Corporation),
S. Krishnamurthi (Brown University)
Track: Semantic Web
Session: Ontologies
YAGO: A Core of Semantic Knowledge Unifying WordNet and Wikipedia 697
F. M. Suchanek, G. Kasneci, G. Weikum (Max-Planck-Institut)
Ontology Summarization Based on RDF Sentence Graph 707
X. Zhang, G. Cheng, Y. Qu (Southeast University)
Just the Right Amount: Extracting Modules from Ontologies 717
B. C. Grau, I. Horrocks, Y. Kazakov, U. Sattler (The University of Manchester)
Session: Applications
Toward Expressive Syndication on the Web 727
C. Halaschek-Wiener, J. Hendler (University of Maryland)
Exhibit: Lightweight Structured Data Publishing 737
D. F. Huynh, D. R. Karger, R. C. Miller (Massachusetts Institute of Technology)
Explorations in the Use of Semantic Web Technologies for Product Information Management 747
J.-S. Brunner, L. Ma, C. Wang, L. Zhang (IBM China Research Laboratory),
D. C. Wolfson (IBM Software Group),
Y. Pan (IBM China Research Laboratory),
K. Srinivas (IBM T.J. Watson Research Center)
Session: Similarity and Extraction
Measuring Semantic Similarity between Words Using Web Search Engines 757
D. Bollegala (The University of Tokyo),
Y. Matsuo (National Institute of Advanced Industrial Science & Technology),
M. Ishizuka (The University of Tokyo)
Using Google Distance to Weight Approximate Ontology Matches 767
R. Gligorov, Z. Aleksovski, W. ten Kate (Philips Research),
F. van Harmelen (Vrije Universiteit)
Hierarchical, Perceptron-like Learning for Ontology-Based Information Extraction 777
Y. Li, K. Bontcheva (University of Sheffield)
Session: Query Languages and DBs
From SPARQL to Rules (and back) 787
A. Polleres (Universidad Rey Juan Carlos)
SPARQ2L: Towards Support for Subgraph Extraction Queries in RDF Databases 797
K. Anyanwu, A. Maduko (University of Georgia),
A. Sheth (Wright State University)
Bridging the Gap Between OWL and Relational Databases 807
B. Motik, I. Horrocks, U. Sattler (University of Manchester)
ActiveRDF: Object-Oriented Semantic Web Programming 817
E. Oren, R. Delbru, S. Gerke, A. Haller, S. Decker (National University of Ireland)
Session: Semantic Web and Web 2.0
The Two Cultures: Mashing up Web 2.0 and the Semantic Web 825
A. Ankolekar, M. Krötzsch, T. Tran, D. Vrandecic (Universität Karlsruhe)
Analysis of Topological Characteristics of Huge Online Social Networking Services 835
Y.-Y. Ahn, S. Han, H. Kwak, S. Moon, H. Jeong (KAIST)
P-TAG: Large Scale Automatic Generation of Personalized Annotation TAGs for the Web 845
P.-A. Chirita, S. Costache (University of Hannover),
S. Handschuh (National University of Ireland),
W. Nejdl (University of Hannover)
Track: Technology for Developing Regions
Session: Communication in Developing Regions
Connecting the 'Bottom of the Pyramid' – An Exploratory Case Study of India's Rural Communication Environment 855
S. Seshagiri, A. Sagar, D. Joshi (Motorola India Research Laboratories)
Communication as Information-Seeking: The Case for Mobile Social Software for Developing Regions 863
B. E. Kolko, E. J. Rose, E. Johnson (University of Washington)
Optimal Audio-Visual Representations for Illiterate Users of Computers 873
I. Medhi, A. Prasad, K. Toyama (Microsoft Research Laboratories India)
Session: Networking Issues in the Web
Identifying and Discriminating Between Web and Peer-to-Peer Traffic in the Network Core 883
J. Erman, A. Mahanti, M. Arlitt, C. Williamson (University of Calgary)
Long Distance Wireless Mesh Network Planning: Problem Formulation and Solution 893
S. Sen, B. Raman (IIT Kanpur)
Is High-Quality VoD Feasible using P2P Swarming? 903
S. Annapureddy (New York University),
S. Guha (Cornell University),
C. Gkantsidis, D. Gunawardena (Microsoft Research),
P. Rodriguez (Telefonica Research)
Track: Web Engineering
Session: Web Modeling
Turning Portlets into Services: The Consumer Profile 913
O. Díaz, S. Trujillo, S. Pérez (University of the Basque Country)
A Framework for Rapid Integration of Presentation Components 923
J. Yu, B. Benatallah, R. Saint-Paul (University of New South Wales),
F. Casati (University of Trento),
F. Daniel, M. Matera (Politecnico di Milano)
Integrating Value-based Requirement Engineering Models to WebML using VIP Business Modeling Framework 933
F. Azam, Z. Li, R. Ahmad (Beijing University of Aeronautics & Astronautics)
Session: End-User Perspectives and Measurement in Web Engineering
Towards Effective Browsing of Large Scale Social Annotations 943
R. Li, S. Bao (Shanghai JiaoTong University),
B. Fei, Z. Su (IBM China Research Laboratory),
Y. Yu (Shanghai JiaoTong University)
Supporting End-Users in the Creation of Dependable Web Clips 953
S. Lingam, S. Elbaum (University of Nebraska-Lincoln)
Effort Estimation: How Valuable is it for a Web Company to Use a Cross-company Data Set, Compared to Using Its Own Single-company Data Set? 963
E. Mendes (The University of Auckland),
S. Di Martino, F. Ferrucci, C. Gravino (Univ. di Salerno)
Track: Web Services
Session: Orchestration and Choreography
Towards the Theoretical Foundation of Choreography 973
Z. Qiu, X. Zhao, C. Cai, H. Yang (Peking University)
Introduction and Evaluation of Martlet, a Scientific Workflow Language for Abstracted Parallelisation 983
D. Goodman (Oxford University Computing Laboratory)
Semi-Automated Adaptation of Service Interactions 993
H. R. Motahari Nezhad, B. Benatallah (University of New South Wales),
A. Martens, F. Curbera (IBM T.J. Watson Research Center),
F. Casati (University of Trento)
Session: SLAs and QoS
Reliable QoS Monitoring Based on Client Feedback 1003
R. Jurca (Ecole Polytechnique Fédérale de Lausanne),
W. Binder (University of Lugano),
B. Faltings (Ecole Polytechnique Fédérale de Lausanne)
Preference-based Selection of Highly Configurable Web Services 1013
S. Lamparter, A. Ankolekar, R. Studer (University of Karlsruhe),
S. Grimm (FZI Research Center for Information Technologies)
Speeding up Adaptation of Web Service Compositions Using Expiration Times 1023
J. Harney, P. Doshi (University of Georgia)
DIANE - An Integrated Approach to Automated Service Discovery, Matchmaking and Composition 1033
U. Küster, B. König-Ries (Friedrich-Schiller University Jena),
M. Stern, M. Klein (University of Karlsruhe)
Track: XML and Web Data
Session: Querying & Transforming XML
Multiway SLCA-based Keyword Search in XML Data 1043
C. Sun, C.-Y. Chan, A. K. Goenka (National University of Singapore)
Visibly Pushdown Automata for Streaming XML 1053
V. Kumar, P. Madhusudan, M. Viswanathan (University of Illinois at Urbana-Champaign)
Mapping-Driven XML Transformation 1063
H. Jiang, H. Ho, L. Popa (IBM Almaden Research Center),
W.-S. Han (Kyungpook National University)
Session: Parsing, Normalizing, & Storing XML
Querying and Maintaining a Compact XML Storage 1073
R. K. Wong, F. Lam, W. M. Shui (University of New South Wales & Green Pea Software)
XML Design for Relational Storage 1083
S. Kolahi (University of Toronto),
L. Libkin (University of Edinburgh)
A High-Performance Interpretive Approach to Schema-Directed Parsing 1093
M. Matsa, E. Perkins, A. Heifets, M. Gaitatzes Kostoulas, D. Silva, N. Mendelsohn, M. Leger (IBM Corporation)
POSTERS
Topic: Developing Regions
Collaborative ICT for Indian Business Clusters 1115
S. Roy, S. Biswas (Motorola India Research Laboratories)
Delay Tolerant Applications for Low Bandwidth and Intermittently Connected Users: the aAQUA Experience 1117
S. Sahni, K. Ramamritham (Indian Institute of Technology Bombay)
Topic: Search
A Cautious Surfer for PageRank 1119
L. Nie, B. Wu, B. D. Davison (Lehigh University)
A Clustering Method For Web Data with Multi-Type Interrelated Components 1121
L. Bolelli, S. Ertekin, D. Zhou, C. L. Giles (The Pennsylvania State University)
A Large-Scale Study of Robots.txt 1123
Y. Sun, Z. Zhuang, C. L. Giles (The Pennsylvania State University)
A Link-Based Ranking Scheme for Focused Search 1125
T. Abou-Assaleh, Y. Miao, T. Das, P. O'Brien, W. Gao, Z. Zhen (GenieKnows.com)
A Link Classification Based Approach to Website Topic Hierarchy Generation 1127
N. Liu, C. C. Yang (The Chinese University of Hong Kong)
A Search-based Chinese Word Segmentation Method 1129
X.-J. Wang (IBM China Research Center),
W. Liu (Huazhong University of Science & Technology),
Y. Qin (IBM China Research Center)
Anchor-based Proximity Measures 1131
A. Joshi, R. Kumar, B. Reed, A. Tomkins (Yahoo! Research)
Automatic Search Engine Performance Evaluation with Click-through Data Analysis 1133
Y. Liu, Y. Fu, M. Zhang, S. Ma (Tsinghua University),
L. Ru (Sohu Incorporation)
Automatic Searching of Tables in Digital Libraries 1135
Y. Liu, K. Bai, P. Mitra, C. L. Giles (The Pennsylvania State University)
Bayesian Network based Sentence Retrieval Model 1137
K. Cai, J. Bu, C. Chen, K. Liu, W. Chen (Zhejiang University)
Brand Awareness and the Evaluation of Search Results 1139
B. J. Jansen, M. Zhang, Y. Zhang (The Pennsylvania State University)
Causal Relation of Queries from Temporal Logs 1141
Y. Sun (Peking University),
N. Liu (Microsoft Research Asia),
K. Xie (Peking University),
S. Yan (University of Illinois at Urbana-Champaign),
B. Zhang, Z. Chen (Microsoft Research Asia)
Classifying Web Sites 1143
C. Lindemann, L. Littig (University of Leipzig)
Comparing Apples and Oranges: Normalized PageRank for Evolving Graphs 1145
K. Berberich, S. Bedathur, G. Weikum (Max-Planck Institute for Informatics),
M. Vazirgiannis (INRIA/FUTURS)
Designing Efficient Sampling Techniques to Detect Webpage Updates 1147
Q. Tan, Z. Zhuang, P. Mitra, C. L. Giles (The Pennsylvania State University)
Determining the User Intent of Web Search Engine Queries 1149
B. J. Jansen, D. L. Booth (The Pennsylvania State University),
A. Spink (Queensland University of Technology)
EPCI: Extracting Potentially Copyright Infringement Texts from the Web 1151
T. Tashiro, T. Ueda, T. Hori, Y. Hirate, H. Yamana (Waseda University & National Institute of Informatics)
Efficient Training on Biased Minimax Probability Machine for Imbalanced Text Classification 1153
X. Peng, I. King (The Chinese University of Hong Kong)
Electoral Search Using the VerkiezingsKijker: An Experience Report 1155
V. Jijkoun, M. Marx, M. de Rijke, F. van Waveren (University of Amsterdam)
Exploration of Query Context for Information Retrieval 1157
K. Cai, C. Chen, J. Bu, P. Huang, Z. Kang (Zhejiang University)
First-order Focused Crawling 1159
Q. Xu, W. Zuo (Jilin University)
Academic Web Search Engine — Generating a Survey Automatically 1161
Y. Wang, Z. Geng, S. Huang, X. Wang, A. Zhou (Fudan University)
Generative Models for Name Disambiguation 1163
Y. Song, J. Huang, I. G. Councill, J. Li, C. L. Giles (The Pennsylvania State University)
GigaHash: Scalable Minimal Perfect Hashing for Billions of URLs 1165
K. Chellapilla, A. Mityagin, D. Charles (Microsoft Live Laboratories)
How NAGA Uncoils: Searching with Entities and Relations 1167
G. Kasneci, F. M. Suchanek, M. Ramanath, G. Weikum (Max-Planck-Institut)
Identifying Ambiguous Queries in Web Search 1169
R. Song (Shanghai Jiao Tong University & Microsoft Research Asia),
Z. Luo (Fudan University),
J.-R. Wen (Microsoft Research Asia),
Y. Yu (Shanghai Jiao Tong University),
H.-W. Hong (Microsoft Research Asia)
Web Page Classification with Heterogeneous Data Fusion 1171
Z. Xu, I. King, M. R. Lyu (The Chinese University of Hong Kong)
Learning Information Diffusion Process on the Web 1173
X. Wan, J. Yang (Peking University)
MedSearch: A Specialized Search Engine for Medical Information 1175
G. Luo, C. Tang, H. Yang (IBM T.J. Watson Research Center),
X. Wei (University of Massachusetts at Amherst)
Mining Contiguous Sequential Patterns from Web Logs 1177
J. Chen (Queens College, CUNY),
T. Cook (City University of New York)
Monitoring the Evolution of Cached Content in Google and MSN 1179
I. Anagnostopoulos (University of the Aegean)
Multi-factor Clustering for a Marketplace Search Interface 1181
N. Sundaresan, K. Ganesan, R. Grandhi (eBay Research Laboratories)
On Ranking Techniques for Desktop Search 1183
S. Cohen, C. Domshlak, N. Zwerdling (Technion—Israel Institute of Technology)
Query-Driven Indexing for Peer-to-Peer Text Retrieval 1185
G. Skobeltsyn, T. Luu (Ecole Polytechnique Fédérale de Lausanne),
I. P. Žarko (University of Zagreb),
M. Rajman, K. Aberer (Ecole Polytechnique Fédérale de Lausanne)
Query Topic Detection for Reformulation 1187
X. He (Peking University),
J. Yan (Microsoft Research Asia),
J. Ma (Peking University),
N. Liu, Z. Chen (Microsoft Research Asia)
Review Spam Detection 1189
N. Jindal, B. Liu (University of Illinois at Chicago)
SCAN: A Small-World Structured P2P Overlay for Multi-Dimensional Queries 1191
X. Sun (Graduate School of Chinese Academy of Sciences)
SRing: A Structured Non DHT P2P Overlay Supporting String Range Queries 1193
X. Sun, X. Chen (Graduate School of Chinese Academy of Sciences)
Search Engine Retrieval of Changing Information 1195
Y. S. Kim, B. H. Kang (University of Tasmania),
P. Compton (The University of New South Wales),
H. Motoda (Osaka University)
Search Engines and Their Public Interfaces: Which APIs are the Most Synchronized? 1197
F. McCown, M. L. Nelson (Old Dominion University)
Spam and Popularity Ratings for Combating Link Spam 1199
M. Dalal (LDI)
Summary Attributes and Perceived Search Quality 1201
D. E. Rose (A9.com Inc.),
D. Orr, R. G. P. Kantamneni (Yahoo! Inc.)
Tag Clouds for Summarizing Web Search Results 1203
B. Y.-L. Kuo (The University of British Columbia),
T. Hentrich (The University of British Columbia & Simon Fraser University),
B. M. Good, M. D. Wilkinson (The University of British Columbia)
Towards Efficient Dominant Relationship Exploration of the Product Items on the Web 1205
Z. Yang, L. Li, B. Wang, M. Kitsuregawa (University of Tokyo)
Understanding Web Search via a Learning Paradigm 1207
B. J. Jansen, B. Smith, D. L. Booth (The Pennsylvania State University)
Using d-gap Patterns for Index Compression 1209
J. Chen (Queens College, CUNY),
T. Cook (City University of New York)
Utility Analysis for Topically Biased PageRank 1211
C. Kohlschütter, P.-A. Chirita, W. Nejdl (L3S/University of Hannover)
Sliding Window Technique for the Web Log Analysis 1213
N. Buzikashvili (Russian Academy of Sciences)
A Password Stretching Method using User Specific Salts 1215
C. Lee (INITECH), H. Lee (Korea University)
Simple Authentication for the Web 1217
T. W. van der Horst, K. E. Seamons (Brigham Young University)
Topic: Semantic Web
A Management and Performance Framework for Semantic Web Servers 1219
M. Mesarina, V. K. Srinivasmurthy, N. Lyons, C. Sayers (Hewlett-Packard)
A Probabilistic Semantic Approach for Discovering Web Services 1221
J. Ma (Victoria University), J. Cao (La Trobe University), Y. Zhang (Victoria University)
Acquiring Ontological Knowledge from Query Logs 1223
S. Sekine (New York University), H. Suzuki (Microsoft Research)
Altering Document Term Vectors for Classification - Ontologies as Expectations of Co-occurrence 1225
M. Nagarajan, A. Sheth (Wright State University),
M. Aguilera, K. Keeton, A. Merchant, M. Uysal (Hewlett-Packard Laboratories)
Building and Managing Personalized Semantic Portals 1227
M. Şah, W. Hall (University of Southampton)
Deriving Knowledge from Figures for Digital Libraries 1229
X. Lu, J. Z. Wang, P. Mitra, C. L. Giles (The Pennsylvania State University)
Development of a Semantic Web Based Mobile Local Search System 1231
J.-S. Jeon, G.-J. Lee (KTF R&D Group)
Estimating the Cardinality of RDF Graph Patterns 1233
A. Maduko, K. Anyanwu (University of Georgia),
A. Sheth (Wright State University),
P. Schliekelman (University of Georgia)
Extending WebML towards Semantic Web 1235
F. M. Facca, M. Brambilla (Politecnico di Milano)
Image Annotation by Hierarchical Mapping of Features 1237
Q. Zhao, P. Mitra, C. L. Giles (The Pennsylvania State University)
Integrating Web Directories by Learning their Structures 1239
C. C. Yang, J. Lin (The Chinese University of Hong Kong)
Learning Ontologies to Improve the Quality of Automatic Web Service Matching 1241
H. Guo (Stony Brook University),
A. Ivan, R. Akkiraju, R. Goodwin (IBM T.J. Watson Research Center)
Ontology Engineering Using Volunteer Labor 1243
B. M. Good, M. D. Wilkinson (The University of British Columbia)
Semantic Personalization of Web Portal Contents 1245
C. Tziviskou, M. Brambilla (Politecnico di Milano)
The Largest Scholarly Semantic Network… Ever. 1247
J. Bollen, M. A. Rodriguez, H. Van de Sompel, L. L. Balakireva, A. Hagberg (Los Alamos National Laboratory)
Topic: Services
A Kernel based Structure Matching for Web Services Search 1249
J. Yu, S. Guo, H. Su, H. Zhang, K. Xu (Beihang University)
A Novel Collaborative Filtering-Based Framework for Personalized Services in M-Commerce 1251
Q. Li, C. Wang, G. Geng, R. Dai (Chinese Academy of Sciences)
Towards Service Pool Based Approach for Services Discovery and Subscription 1253
X. Liu, L. Zhou, G. Huang, H. Mei (Peking University)
Crawling Multiple UDDI Business Registries 1255
E. Al-Masri, Q. H. Mahmoud (University of Guelph)
Discovering the Best Web Service 1257
E. Al-Masri, Q. H. Mahmoud (University of Guelph)
Mobile Shopping Assistant: Integration of Mobile Applications and Web Services 1259
H. Wu, Y. Natchetoi (SAP Laboratories)
On Automated Composition for Web Services 1261
Z. Shen, J. Su (University of California at Santa Barbara)
Providing Session Management as Core Business Service 1263
I. Ari, J. Li, R. Ghosh, M. Dekhil (Hewlett-Packard Laboratories)
Towards Automating Regression Test Selection for Web Services 1265
M. Ruth, S. Tu (University of New Orleans)
Towards Environment Generated Media: Object-participation-type Weblog in Home Sensor Network 1267
T. Maekawa, Y. Yanagisawa, T. Okadome (NTT Communication Science Laboratories)
Topic: Social Networks
BlogScope: Spatio-temporal Analysis of the Blogosphere 1269
N. Bansal, N. Koudas (University of Toronto)
EOS: Expertise Oriented Search Using Social Networks 1271
J. Li, J. Tang, J. Zhang (Tsinghua University),
Q. Luo, Y. Liu (The Hong Kong University of Science & Technology),
M. Hong (Tsinghua University)
Exploring Social Dynamics in Online Media Sharing 1273
M. Halvey, M. T. Keane (University College Dublin)
Finding Community Structure in Mega-scale Social Networks [Extended Abstract] 1275
K. Wakita, T. Tsurumi (Tokyo Institute of Technology)
Life is Sharable: Mechanisms to Support and Sustain Blogging Life Experience 1277
Y.-M. Cheng (Tatung University),
T.-C. Chou (Academia Sinica),
W. Yu (Queen's University Belfast),
L.-C. Chen, C.-L. Yeh (Tatung University),
M.-C. Chen (Academia Sinica)
Measuring Credibility of Users in an E-learning Environment 1279
W. Wei (The Royal Institute of Technology),
J. Lee, I. King (The Chinese University of Hong Kong)
Modeling User Behavior in Recommender Systems based on Maximum Entropy 1281
T. Iwata, K. Saito, T. Yamada (NTT Communication Science Laboratories)
Parallel Crawling for Online Social Networks 1283
D. H. Chau, S. Pandit, S. Wang, C. Faloutsos (Carnegie Mellon University)
Personalized Social & Real-Time Collaborative Search 1285
M. Dalal (LDI)
Towards Extracting Flickr Tag Semantics 1287
T. Rattenbury, N. Good, M. Naaman (Yahoo! Research Berkeley)
Topic: Systems
A No-Frills Architecture for Lightweight Answer Retrieval 1289
M. Paşca (Google Inc.)
AutoPerf: An Automated Load Generator and Performance Measurement Tool for Multi-tier Software Systems 1291
S. Shirodkar, V. Apte (IIT Bombay)
Construction by Linking: The Linkbase Method 1293
J. Meinecke, F. Majer (University of Karlsruhe),
M. Gaedke (Chemnitz University of Technology)
Image Collector III: A Web Image-Gathering System with Bag-of-Keypoints 1295
K. Yanai (The University of Electro-Communications)
Mirror Site Maintenance Based on Evolution Associations of Web Directories 1297
L. Chen (L3S/University of Hannover),
S. Bhowmick (Nanyang Technological University),
W. Nejdl (L3S/University of Hannover)
On Building Graphs of Documents with Artificial Ants 1299
H. Azzag, J. Lavergne, G. Venturini (Laboratoire d'Informatique de l'Université de Tours),
C. Guinot (CE.R.I.E.S.)
Towards a Scalable Search and Query Engine for the Web 1301
A. Hogan, A. Harth, J. Umbrich, S. Decker (National University of Ireland)
Web4CE: Accessing Web-based Applications on Consumer Devices 1303
W. Dee, P. Shrubsole (Philips Research Laboratories)
Web Mashup Scripting Language 1305
M. Sabbouh, J. Higginson, S. Semy, D. Gagne (The MITRE Corporation)
Topic: User Interfaces & Accessibility
A Browser for a Public-Domain SpeechWeb 1307
R. A. Frost, X. Ma, Y. Shi (University of Windsor)
A Novel Clustering-based RSS Aggregator 1309
X. Li (Peking University),
J. Yan (Microsoft Research Asia),
Z. Deng (Peking University),
L. Ji (Microsoft Research Asia),
W. Fan (Virginia Polytechnic Institute & State University),
B. Zhang, Z. Chen (Microsoft Research Asia)
Adaptive Faceted Browser for Navigation in Open Information Spaces 1311
M. Tvarožek, M. Bieliková (Slovak University of Technology)
An Assessment of Tag Presentation Techniques 1313
M. Halvey, M. T. Keane (School of Computer Science & Informatics, UCD)
An Information State-Based Dialogue Manager for Making Voice Web Smarter 1315
M. Gatius, M. González, E. Comelles (Technical University of Catalonia)
Behavior Based Web Page Evaluation 1317
G. Velayathan, S. Yamada (National Institute of Informatics)
Generating Efficient Labels to Facilitate Web Accessibility 1319
L. Spalteholz, K. F. Li, N. Livingston (University of Victoria)
Generation, Documentation and Presentation of Mathematical Equations and Symbolic Scientific Expressions Using Pure HTML and CSS 1321
K. Alabi (State University of New York at Stony Brook)
GeoTV: Navigating Geocoded RSS to Create an IPTV Experience 1323
Y.-F. Chen, G. Di Fabbrizio, D. Gibbon, R. Jana, S. Jora, B. Renger, B. Wei (AT&T Laboratories – Research)
Summarization of Online Image Collections via Implicit Feedback 1325
S. Ahern, S. King, M. Naaman, R. Nair (Yahoo! Research Berkeley)
System for Reminding a User of Information Obtained through a Web Browsing Experience 1327
T. Morita, T. Hidaka, A. Tanaka, Y. Kato (NTT Corporation)
The ScratchPad: Sensemaking Support for the Web 1329
D. Gotz (IBM T.J. Watson Research Center)
Towards Multi-granularity Multi-facet E-Book Retrieval 1331
C. Huang (Chinese Academy of Sciences),
Y. Tian (Chinese Academy of Sciences & Peking University),
Z. Zhou (Chinese Academy of Sciences),
T. Huang (Chinese Academy of Sciences & Peking University)
Visualizing Structural Patterns in Web Collections 1333
M. S. Ali, M. P. Consens, F. Rizzolo (University of Toronto)
Topic: XML
Adaptive Record Extraction From Web Pages 1335
J. Park, D. Barbosa (University of Calgary)
Exploit Sequencing Views in Semantic Cache to Accelerate XPath Query Evaluation 1337
J. Feng, N. Ta, Y. Zhang, G. Li (Tsinghua University)
Extensible Schema Documentation with XSLT 2.0 1339
F. Michel (ETH Zürich),
E. Wilde (University of California at Berkeley)
Preserving XML Queries during Schema Evolution 1341
M. M. Moro (University of California at Riverside),
S. Malaika (IBM Silicon Valley Laboratory),
L. Lim (IBM T.J. Watson Research Center)
SPath: A Path Language for XML Schema 1343
E. Wilde (University of California at Berkeley),
F. Michel (ETH Zürich)
The Use of XML to Express a Historical Knowledge Base 1345
K. T. Nakahira, M. Matsui, Y. Mikami (Nagaoka University of Technology)
U-REST: An Unsupervised Record Extraction SysTem 1347
Y. K. Shen, D. R. Karger (Massachusetts Institute of Technology)
XML-Based Multimodal Interaction Framework for Contact Center Applications 1349
N. Anisimov, B. Galvin, H. Ristock (Genesys Telecommunication Laboratories)
XML-Based XML Schema Access 1351
E. Wilde (University of California at Berkeley), F. Michel (ETH Zürich)
AUTHOR INDEX 1353