Accepted papers
Regular Paper Sessions
Best Paper Session (June 12th, 11:30-13:00)
[124] Goncalo Marcelino, Ricardo Pinto and Joao Magalhaes: Ranking News-Quality Multimedia
[150] Niluthpol Mithun, Juncheng Li, Florian Metze and Amit Roy-Chowdhury: Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval
[160] Shizhe Chen, Jia Chen, Qin Jin and Alex Hauptmann: Class-aware Self-Attention for Audio Event Recognition
[ 90] Andrea Ceroni, Ma Chenyang and Ralph Ewerth: Mining Exoticism from Visual Content with Fusion-based Deep Neural Networks
Session 1: Multimedia Retrieval (June 12th, 16:30-18:30)
[ 36] Xing Xu, Jingkuan Song, Huimin Lu, Yang Yang, Fumin Shen and Zi Huang: Modal-adversarial Semantic Learning Network for Extendable Cross-modal Retrieval
[155] Kevin Joslyn, Kai Li and Kien Hua: Cross-Modal Retrieval Using Deep De-correlated Subspace Ranking Hashing
[ 13] Ge Song and Xiaoyang Tan: Learning multilevel semantic similarity for large-scale multi-label image retrieval
[154] Limeng Cui, Zhensong Chen, Jiawei Zhang, Philip S. Yu, Yong Shi and Lifang He: Multi-view Collective Tensor Decomposition for Cross-modal Hashing
[135] Lei Zhou, Xiao Bai, Xianglong Liu and Jun Zhou: Binary Coding by Matrix Classifier for Effective Subspace Retrieval
[170] Zhongyan Zhang, Lei Wang, Yang Wang, Luping Zhou, Jianjia Zhang and Fang Chen: Instance Image Retrieval by Aggregating Sample-based Discriminative Characteristics
Session 2: Multimedia Content Analysis (June 13th, 9:30-11:00)
[ 26] Wenjie Zhang, Junchi Yan, Xiangfeng Wang and Hongyuan Zha: Deep eXtreme Multi-label Learning
[ 44] Feiran Huang, Xiaoming Zhang, Chaozhuo Li, Zhonghua Zhao, Yueying He and Zhoujun Li: Multimodal Network Embedding via Attention based Multi-view Variational Autoencoder
[148] Devanshu Arya and Marcel Worring: Exploiting Relational Information in Social Networks using Geometric Deep Learning on Hypergraphs
[141] Matthias Zeppelzauer, Miroslav Despotovic, Muntaha Sakeena, David Koch and Mario Doller: Automatic Prediction of Building Age from Photographs
[ 50] Kejun Zhang, Hui Zhang, Simeng Li, Changyuan Yang and Lingyun Sun: The PMEmo Dataset for Music Emotion Recognition
Session 3: Multimedia Applications (June 13th, 11:30-12:30)
[103] Zunlei Feng, Zhenyun Yu, Yezhou Yang, Yongcheng Jing, Junxiao Jiang and Mingli Song: Interpretable Partitioned Embedding for Customized Multi-item Fashion Outfit Composition
[ 86] Peirui Cheng and Weiqiang Wang: A Multi-Oriented Scene Text Detector with Position-Sensitive Segmentation
[113] Lan Wang, Yang Wang, Susu Shan and Feng Su: Scene Text Detection and Tracking in Video with Background Cues
Session 4: Video Analysis (June 13th, 17:30-18:30)
[ 67] Yang Mi, Kang Zheng and Song Wang: Recognizing Actions in Wearable-Camera Videos by Training Classifiers on Fixed-Camera Videos
[129] Romain Cohendet, Karthik Yadati, Ngoc Q. K. Duong and Claire-Helene Demarty: Annotating, understanding, and predicting long-term video memorability
[126] Daniel Rotman, Dror Porat, Gal Ashour and Udi Barzelay: Optimally Grouped Deep Features Using Normalized Cost for Video Scene Detection
Poster Papers (Poster presentation: June 13th, 14:00-16:00. Spotlight: June 13th, 12:30-13:00)
[ 8] Hanjiang Lai: Transductive Zero-Shot Hashing via Coarse-to-Fine Similarity Mining
[ 43] Xin Luo, Peng-Fei Zhang, Ye Wu, Zhen-Duo Chen, Hua-Junjie Huang and Xin-Shun Xu: Asymmetric Discrete Cross-Modal Hashing
[ 72] Xiang Zhang, Guohua Dong, Yimo Du, Chengkun Wu, Zhigang Luo and Canqun Yang: Collaborative Subspace Graph Hashing for Cross-modal Retrieval
[ 94] Ye Wu, Xin Luo, Xin-Shun Xu, Shanqing Guo and Yuliang Shi: Dictionary Learning based Supervised Discrete Hashing for Cross-Media Retrieval
[ 30] Bingqing Ke, Jie Shao, Zi Huang and Heng Tao Shen: Feature Reconstruction by Laplacian Eigenmaps for Efficient Instance Search
[149] Zachary Seymour and Zhongfei Zhang: Image Annotation Retrieval with Text-Domain Label Denoising
[147] Zachary Seymour and Zhongfei Zhang: Multi-label Triplet Embeddings for Image Annotation from User-Generated Tags
[111] Chandramani Chaudhary, Poonam Goyal, Joel R A Moniz, Navneet Goyal and Yi-Ping Phoebe Chen: Linguistic Patterns and Cross Modality-based Image Retrieval for Complex Queries
[101] Minh-Son Dao, Asem Kasem and Quang-Nhat-Minh Pham: A Context-Aware Late-Fusion Approach for Disaster Image Retrieval from Social Media
[ 52] Yugo Sato, Tsukasa Fukusato and Shigeo Morishima: Face Retrieval Framework Relying on User's Visual Memory
[161] Xueping Wang, Weixin Li, Guodong Mu, Di Huang and Yunhong Wang: Facial Expression Synthesis by U-Net Conditional Generative Adversarial Networks
[ 58] Hongzhi Li, Joseph Ellis, Lei Zhang and Shih-Fu Chang: PatternNet: Visual Pattern Mining with Deep Neural Network
[ 28] Mingjie Zheng, Sheng-Hua Zhong, Songtao Wu and Jianmin Jiang: Steganographer Detection based on Multiclass Dilated Residual Networks
[110] Maguell L.T.L. Sandifort, Jianquan Liu, Shoji Nishimura and Wolfgang Hürst: An Entropy Model for Loiterer Retrieval across Multiple Surveillance Cameras
[125] Philipp Harzig, Christian Eggert and Rainer Lienhart: Visual Question Answering With a Hybrid Convolution Recurrent Model
[134] Shuai Liao, Efstratios Gavves and Cees Snoek: Searching and Matching Texture-free 3D Shapes in Images
[ 61] Duc Tien Dang Nguyen, Michael Riegler, Liting Zhou and Cathal Gurrin: Challenges and Opportunities within Personal Life Archives
[138] Xu Sun, Yuantian Wang, Tongwei Ren, Zhi Liu and Gangshan Wu: Object Trajectory Proposal via Hierarchical Volume Grouping
[ 97] Sungeun Hong, Woobin Im and Hyun Seung Yang: CBVMR: Content-Based Video-Music Retrieval Using Soft Intra-Modal Structure Constraint
[117] Yi Tang, Zhi Jin, Wenbin Zou and Xia Li: Multi-Scale Spatiotemporal Conv-LSTM Network for Video Saliency Detection
[ 46] Jianfei Xue and Koji Eguchi: Supervised Nonparametric Multimodal Topic Modeling Methods for Multi-class Video Classification
[ 15] Baohan Xu, Hao Ye, Yingbin Zheng, Heng Wang, Tianyu Luwang and Yu-Gang Jiang: Dense Dilated Network for Few Shot Action Recognition
[ 23] Haonan Qiu, Yingbin Zheng, Hao Ye, Yao Lu, Feng Wang and Liang He: Precise Temporal Action Localization by Evolving Temporal Proposals
Special Sessions:
Special Session 1: Predicting User Perceptions of Multimedia Content
Oral (June 12th, 14:00-15:00)
[213] Dmitry Kuzovkin, Tania Pouli, Remi Cozot, Olivier Le Meur, Jonathan Kervec and Kadi Bouatouch: Image Selection in Photo Albums
[199] Yasemin Timar, Nihan Karslioglu, Heysem Kaya and Albert Ali Salah: Feature Selection and Multimodal Fusion for Estimating Emotions Evoked by Movie Clips
[212] Sarath Sivaprasad, Tanmayee Joshi, Rishabh Agrawal and Niranjan Pedanekar: Multimodal Continuous Prediction of Emotions in Movies using Long Short-Term Memory Networks
Poster (Poster presentation: June 12th, 15:30-16:30. Spotlight: June 12th, 14:00-15:00)
[179] Yang Liu, Zhonglei Gu, Tobey H. Ko and Kien A. Hua: Learning Perceptual Embeddings with Two Related Tasks for Joint Predictions of Media Interestingness and Emotions
[215] Jayneel Parekh, Harshvardhan Tibrewal and Sanjeel Parekh: Deep Pairwise Classification and Ranking for Predicting Media Interestingness
[197] Ivan Gonzalez Diaz, Jenny Benois-Pineau, Jean-Philippe Domenger and Aymar de Rugy: Perceptually-guided Understanding of Egocentric Video Content: Recognition of Objects to Grasp
[185] Wenlu Yang, Maria Rifqi, Christophe Marsala and Andrea Pinna: Towards Better Understanding of Player's Game Experience
Special Session 2: Social-Media Visual Summarization / Large-Scale 3D Multimedia Analysis and Applications
Oral (June 12th, 15:00-15:30)
[219] Po-Yao Huang, Junwei Liang, Jean-Baptiste Lamare and Alexander Hauptmann: Multimodal Filtering of Social Media for Temporal Monitoring and Event Analysis
[220] Xiangyu Yue, Bichen Wu, Sanjit Seshia, Kurt Keutzer and Alberto Sangiovanni-Vincentelli: A LiDAR Point Cloud Generator: from a Virtual World to Autonomous Driving
Poster (Poster presentation: June 12th, 15:30-16:30. Spotlight: June 12th, 15:00-15:30)
[ 76] Guoyu Lu and Jingkuan Song: 3D Image-based Indoor Localization Joint With WiFi Positioning
[204] Zhiwei Li and Lei Yu: Compare Stereo Patches Using Atrous Convolutional Neural Networks
Doctoral Symposium:
(June 13th, 14:00-16:00)
[200] Wan-Lun Tsai: Personal Basketball Coach: Tactic Training through Wireless Virtual Reality
[183] Andreas Leibetseder: Extracting and Using Medical Expert Knowledge to Advance in Video Processing for Gynecologic Endoscopy
[186] Noa Garcia: Temporal Aggregation of Visual Features for Large-Scale Image-to-Video Retrieval
[207] Naoki Saito: Tourism Category Classification on Image Sharing Services Through Estimation of Existence of Reliable Results
[180] Rashmi Gupta: Considering Documents in Lifelog Information Retrieval?
Demo:
[ 45] Longhui Wei, Xiaobin Liu, Jianing Li and Shiliang Zhang: VP-ReID: Vehicle and Person Re-Identification System
[201] Maguell Sandifort, Jianquan Liu, Shoji Nishimura and Wolfgang Hürst: VisLoiter+: An Entropy Model-Based Loiterer Retrieval System with User-friendly Interfaces
[184] Wenjie Duan, Kengo Makino, Rui Ishiyama, Toru Takahashi, Yuta Kudo and Pieter Jonker: Automated Scanning and Individual Identification System for Parts without Marking or Tagging
[217] Nico Hezel and Kai Uwe Barthel: Dynamic construction and manipulation of hierarchical quartic image graphs
[187] Jonas Krause, Gavin Sugita, Kyungim Baek and Lipyeow Lim: WTPlant (What's That Plant?): a Deep Learning System for Identifying Plants in Natural Images
[181] Matthew Cooper, Jian Zhao, Chidansh Bhatt and David Shamma: MOOCex: Exploring Educational Video via Recommendation
[205] Yangbangyan Jiang, Qianqian Xu, Xiaochun Cao and Qingming Huang: Who to Ask: An Intelligent Fashion Consultant
[189] Chou Po-Wen, Lin Fu-Neng, Chang Keh-Ning and Chen Herng-Yow: A Simple Score Following System for Music Ensembles Using Chroma and Dynamic Time Warping