ACM Multimedia 2016 papers on the web

This page is maintained by Yusuke Matsui. If you have additions or changes, please send an e-mail (matsui(at)

Best paper

DASH2M: Exploring HTTP/2 for Internet Streaming to Mobile Devices
Mengbai Xiao (George Mason University)
Viswanathan Swaminathan (Adobe Research)
Sheng Wei (George Mason University)
Songqing Chen (George Mason University)
Deep-based Ingredient Recognition for Cooking Recipe Retrieval (pdf)
Jingjing Chen (City University of Hong Kong)
Chong-Wah Ngo (City University of Hong Kong)
Multi-modal Multi-view Topic-opinion Mining for Social Event Analysis
Shengsheng Qian (Chinese Academy of Sciences)
Tianzhu Zhang (Chinese Academy of Sciences)
Changsheng Xu (Chinese Academy of Sciences)
Patterns of Free-form Curation: Visual Thinking with Web Content (pdf)
Nic Lupfer (Texas A&M University)
Andruid Kerne (Texas A&M University)
Andrew Webb (Texas A&M University)
Rhema Linder (Texas A&M University)
Yin Qu (Texas A&M University)
Alyssa Valdez (Texas A&M University)

Analysis & Search

Event Specific Multimodal Pattern Mining for Knowledge Base Construction
Hongzhi Li (Columbia University)
Joseph Ellis (Columbia University)
Heng Ji (Rensselaer Polytechnic Institute)
Shih-Fu Chang (Columbia University)
Joint Graph Learning and Video Segmentation Via Multiple Cues and Topology Calibration
Jingkuan Song (University of Trento)
Lianli Gao (University of Electronic Science and Technology of China)
Mihai Marian Puscas (University of Trento)
Feiping Nie (University of Texas at Arlington)
Fumin Shen (University of Electronic Science and Technology of China)
Nicu Sebe (University of Trento)
Parsimonious Mixed-Effects HodgeRank for Crowdsourced Preference Aggregation (project)
Qianqian Xu (Chinese Academy of Sciences, Peking University)
Jiechao Xiong (Peking University)
Xiaochun Cao (Chinese Academy of Sciences)
Yuan Yao (Peking University)
Weighted Linear Fusion of Multimodal Data – A Reasonable Baseline?
Ognjen Arandjelović (University of St Andrews)

Topics in Multimedia I

Deep CTR Prediction in Display Advertising (pdf)
Junxuan Chen (Shanghai Jiao Tong University, Alibaba Group)
Baigui Sun (Alibaba Group)
Hao Li (Alibaba Group)
Hongtao Lu (Shanghai Jiao Tong University)
Xian-Sheng Hua (Alibaba Group)
Multi-Stream Multi-Class Fusion of Deep Networks for Video Classification (pdf)
Zuxuan Wu (Fudan University)
Yu-Gang Jiang (Fudan University)
Xi Wang (Fudan University)
Hao Ye (Fudan University)
Xiangyang Xue (Fudan University)
Play and Rewind: Optimizing Video Binary Representations by Self-Supervised Temporal Hashing (pdf)
Hanwang Zhang (National University of Singapore)
Meng Wang (Hefei University of Technology)
Richang Hong (Hefei University of Technology)
Tat-Seng Chua (National University of Singapore)
QoE Prediction for Enriched Assessment of Individual Video Viewing Experience
Yi Zhu (TU Delft)
Alan Hanjalic (TU Delft)
Judith Redi (TU Delft)

Video Analysis & Streaming

DRIVING: Distributed Scheduling for Video Streaming in Vehicular Wi-Fi Systems (pdf)
Xi Chen (McGill University)
Lei Rao (General Motors)
Qiao Xiang (Yale University)
Xue Liu (McGill University)
Fan Bai (General Motors)
Dynamic Resource Provisioning with QoS Guarantee for Video Transcoding in Online Video Sharing Service
Guanyu Gao (Nanyang Technological University)
Yonggang Wen (Nanyang Technological University)
Cedric Westphal (Huawei)
High-speed Depth Stream Generation from a Hybrid Camera
Xinxin Zuo (University of Kentucky)
Sen Wang (University of Kentucky)
Jiangbin Zheng (Northwestern Polytechnical University)
Ruigang Yang (University of Kentucky)
Spatio-Temporal Analysis of Bandwidth Maps for Geo-Predictive Video Streaming in Mobile Environments
Bayan Taani (National University of Singapore)
Roger Zimmermann (National University of Singapore)

Deep Learning

Deep Cross Residual Learning for Multitask Visual Recognition (pdf)
Brendan Jou (Columbia University)
Shih-Fu Chang (Columbia University)
Image Captioning with Deep Bidirectional LSTMs (project)
Cheng Wang (University of Potsdam)
Haojin Yang (University of Potsdam)
Christian Bartz (University of Potsdam)
Christoph Meinel (University of Potsdam)
Multilayer and Multimodal Fusion of Deep Neural Networks for Video Classification
Xiaodong Yang (NVIDIA)
Pavlo Molchanov (NVIDIA)
Jan Kautz (NVIDIA)
Robust Visual-Textual Sentiment Analysis: When Attention meets Tree-structured Recursive Neural Networks
Quanzeng You (University of Rochester)
Liangliang Cao (Yahoo Labs)
Hailin Jin (Adobe Research)
Jiebo Luo (University of Rochester)

Topics in Multimedia II

Leveraging Contextual Cues for Generating Basketball Highlights (pdf, project)
Vinay Bettadapura (Google)
Caroline Pantofaru (Google)
Irfan Essa (Georgia Institute of Technology)
Micro Tells Macro: Predicting the Popularity of Micro-Videos via a Transductive Model
Jingyuan Chen (National University of Singapore)
Xuemeng Song (National University of Singapore)
Liqiang Nie (National University of Singapore)
Xiang Wang (National University of Singapore)
Hanwang Zhang (National University of Singapore)
Tat-Seng Chua (National University of Singapore)
Server Allocation for Multiplayer Cloud Gaming
Yunhua Deng (Nanyang Technological University)
Yusen Li (Nanyang Technological University)
Xueyan Tang (Nanyang Technological University)
Wentong Cai (Nanyang Technological University)
Share-and-Chat: Achieving Human-Level Video Commenting by Search and Multi-View Embedding
Yehao Li (Sun Yat-sen University)
Ting Yao (Microsoft Research)
Tao Mei (Microsoft Research)
Hongyang Chao (Sun Yat-sen University)
Yong Rui (Microsoft Research)

Events and Context

Audio Event Detection using Weakly Labeled Data (pdf)
Anurag Kumar (Carnegie Mellon University)
Bhiksha Raj (Carnegie Mellon University)
Context-aware Image Tweets Modelling and Recommendation
Tao Chen (National University of Singapore)
Xiangnan He (National University of Singapore)
Min-Yen Kan (National University of Singapore)
Event Localization in Music Auto-tagging
Jen-Yu Liu (Academia Sinica)
Yi-Hsuan Yang (Academia Sinica)
Semantic Image Profiling for Historic Events: Linking Images to Phrases
Jia Chen (Shanghai Jiao Tong University)
Qin Jin (Renmin University of China)
Yifan Xiong (Renmin University of China)

Topics in Multimedia III

Are Safer Looking Neighborhoods More Lively? A Multimodal Investigation into Urban Life (pdf)
Marco De Nadai (Fondazione Bruno Kessler, University of Trento)
Radu L. Vieriu (University of Trento)
Gloria Zen (University of Trento)
Stefan Dragicevic (Telecom Italia, University of Trento)
Nikhil Naik (Massachusetts Institute of Technology)
Michele Caraviello (Telecom Italia)
César A. Hidalgo (Massachusetts Institute of Technology)
Nicu Sebe (University of Trento)
Bruno Lepri (Fondazione Bruno Kessler)
Detecting Sarcasm in Multimodal Social Platforms (pdf)
Rossano Schifanella (University of Turin)
Paloma de Juan (Yahoo Labs)
Joel Tetreault (Yahoo Labs)
Liangliang Cao (Yahoo Labs)
User Redirection and Direct Haptics in Virtual Environments
Cristiano Carvalheiro (Universidade Nova de Lisboa)
Rui Nóbrega (Universidade Nova de Lisboa)
Hugo Machado (Universidade Nova de Lisboa)
Rui Rodrigues (Universidade Nova de Lisboa)
V3I-STAL: Visual Vehicle-to-Vehicle Interaction via Simultaneous Tracking and Localization
Xiaobai Liu (San Diego State University)
Xianfeng Yang (San Diego State University)

Learning & Hashing

Binary Optimized Hashing (pdf)
Qi Dai (Fudan University)
Jianguo Li (Intel Labs)
Jingdong Wang (Microsoft Research)
Yu-Gang Jiang (Fudan University)
Cross-batch Reference Learning for Deep Classification and Retrieval
Huei-Fang Yang (Academia Sinica)
Kevin Lin (Academia Sinica)
Chu-Song Chen (Academia Sinica)
Human Pose Estimation from Still Depth Image via Inference Embedded Multi-task Learning (pdf)
Keze Wang (Sun Yat-sen University, National University of Defense Technology)
Shengfu Zhai (Sun Yat-sen University, National University of Defense Technology)
Hui Cheng (Sun Yat-sen University, National University of Defense Technology)
Xiaodan Liang (Sun Yat-sen University, National University of Defense Technology)
Liang Lin (Sun Yat-sen University, National University of Defense Technology)
Linear Distance Preserving Pseudo-Supervised and Unsupervised Hashing
Min Wang (University of Science and Technology of China)
Wengang Zhou (University of Science and Technology of China)
Qi Tian (University of Texas at San Antonio)
Zhengjun Zha (University of Science and Technology of China)
Houqiang Li (University of Science and Technology of China)

Transport & Experience

A Perceptual Quality Metric for Videos Distorted by Spatially Correlated Noise
Chao Chen (Google)
Anil Kokaram (Google)
Mohammad Izadi (Google)
A Pragmatically Designed Adaptive and Web-compliant Object-based Video Streaming Methodology – Implementation and Subjective Evaluation
Maarten Wijnants (University of Hasselt)
Gustavo Rovelo (University of Hasselt)
Peter Quax (University of Hasselt)
Wim Lamotte (University of Hasselt)
SDNDASH: Improving QoE of HTTP Adaptive Streaming Using Software Defined Networking (pdf)
Abdelhak Bentaleb (National University of Singapore)
Ali C. Begen (Ozyegin University)
Roger Zimmermann (National University of Singapore)
Zero-Shot Hashing via Transferring Supervised Knowledge (pdf)
Yang Yang (University of Electronic Science and Technology of China)
Weilun Chen (University of Electronic Science and Technology of China)
Yadan Luo (University of Electronic Science and Technology of China)
Fumin Shen (University of Electronic Science and Technology of China)
Jie Shao (University of Electronic Science and Technology of China)
Heng Tao Shen (The University of Queensland)

Topics in Multimedia IV

Academic Coupled Dictionary Learning for Sketch Based Image Retrieval (pdf)
Dan Xu (University of Trento)
Xavier Alameda-Pineda (University of Trento)
Jingkuan Song (University of Trento)
Elisa Ricci (Fondazione Bruno Kessler, University of Perugia)
Nicu Sebe (University of Trento)
Key Color Generation for Affective Multimedia Production: An Initial Method and Its Application
EunJin Kim (KAIST)
Hyeon-Jeong Suk (KAIST)
Query Adaptive Instance Search using Object Sketches (pdf)
Sreyasee Das Bhattacharjee (Nanyang Technological University)
Junsong Yuan (Nanyang Technological University)
Weixiang Hong (Nanyang Technological University)
Xiang Ruan (Tiwaki)
Time Matters: Multi-scale Temporalization of Social Media Popularity
Bo Wu (Chinese Academy of Sciences)
Wen-Huang Cheng (Academia Sinica)
Yongdong Zhang (Chinese Academy of Sciences)
Tao Mei (Microsoft Research)

Analysis & Middleware

Affective Contextual Mobile Recommender System (pdf)
Chao Wu (Tsinghua University)
Jia Jia (Tsinghua University)
Wenwu Zhu (Tsinghua University)
Xu Chen (Sun Yat-sen University)
Bowen Yang (Tsinghua University)
Yaoxue Zhang (Tsinghua University, Central South University)
PL-ranking: A Novel Ranking Method for Cross-Modal Retrieval
Liang Zhang (Chinese Academy of Sciences)
Bingpeng Ma (Chinese Academy of Sciences)
Guorong Li (Chinese Academy of Sciences)
Qingming Huang (Chinese Academy of Sciences)
Qi Tian (University of Texas at San Antonio)
Transform Invariant Convolutional Neural Networks for Image Classification and Search
Xu Shen (University of Science and Technology of China)
Xinmei Tian (University of Science and Technology of China)
Anfeng He (University of Science and Technology of China)
Shaoyan Sun (University of Science and Technology of China)
Dacheng Tao (University of Technology Sydney)
Video eCommerce: Towards Online Video Advertising
Zhi-Qi Cheng (Southwest Jiaotong University)
Yang Liu (Alibaba Group)
Xiao Wu (Southwest Jiaotong University)
Xian-Sheng Hua (Alibaba Group)

Emotions, People and Faces

Ensemble of Sparse Cross-Modal Metrics for Heterogeneous Face Recognition
Jing Huo (Nanjing University)
Yang Gao (Nanjing University)
Yinghuan Shi (Nanjing University)
Wanqi Yang (Nanjing University)
Hujun Yin (The University of Manchester)
Predicting Personalized Emotion Perceptions of Social Images
Sicheng Zhao (Harbin Institute of Technology)
Hongxun Yao (Harbin Institute of Technology)
Wenlong Xie (Harbin Institute of Technology)
Xiaolei Jiang (Harbin Institute of Technology)
Yue Gao (National University of Singapore)
Rongrong Ji (Xiamen University)
Tat-Seng Chua (National University of Singapore)
Shorter-is-Better: Venue Category Estimation from Micro-Video (project)
Jianglong Zhang (Communication University of China)
Liqiang Nie (National University of Singapore)
Xiang Wang (National University of Singapore)
Xiangnan He (National University of Singapore)
Xianglin Huang (Communication University of China)
Tat-Seng Chua (National University of Singapore)
StressClick: Sensing Stress from Gaze-Click Patterns (video)
Michael Xuelin Huang (Hong Kong Polytechnic University)
Jiajia Li (Hong Kong Polytechnic University)
Grace Ngai (Hong Kong Polytechnic University)
Hong Va Leong (Hong Kong Polytechnic University)