By Jialie Shen, Meng Wang, Shuicheng Yan, and Xian-Sheng Hua
Singapore Management University, Singapore
National University of Singapore, Singapore
Microsoft Research Asia, China
jlshen@smu.edu.sg
eric.mengwang@gmail.com
eleyans@nus.edu.sg
xshua@microsoft.com
Topic
As online media collections grow rapidly, multimedia information retrieval (MIR) is becoming an increasingly critical technique for effective multimedia document search and management. To facilitate these processes, it is essential to annotate multimedia objects with comprehensive textual information. Consequently, multimedia tagging has recently been actively studied by many different communities (e.g., multimedia computing, information retrieval, machine learning and computer vision). Meanwhile, many commercial web systems (e.g., YouTube, del.icio.us, Last.fm and Flickr) have successfully applied tags and related techniques to help users discover, explore and share media content in a convenient and flexible way.
This half-day tutorial summarizes the research in this direction and balances theoretical methodologies with real systems (including several industrial approaches). We plan to (i) introduce why tags and tagging schemes are important for accurate and scalable MIR; (ii) examine current commercial systems and research prototypes, comparing the advantages and disadvantages of the various strategies and schemes for different types of media documents (e.g., image, video and audio); (iii) review key technical challenges in building tagging systems and explore how tagging techniques can facilitate different kinds of retrieval tasks at large scale; and (iv) review a few promising research directions and explore potential solutions.
Outline
1. Introduction and Overview (25 mins)
1.1 What is multimedia tagging?
1.2 Why multimedia tagging?
1.3 Manual vs. Automatic Tagging
2. Image tagging (45 mins)
2.1 Learning-based tagging
2.1.1 Feature extraction scheme and image representation
2.1.2 Machine learning models specifically designed for image tagging
2.1.3 Applications to large-scale image search
2.2 Search-based tagging
2.2.1 Scalable search; indexing/hashing
2.2.2 Research test-bed construction
3. Beyond image: music and video tagging (45 mins)
3.1 How music and video tagging differ from image tagging
3.1.1 Temporal media data modeling
3.1.2 Complexity of content
3.2 TRECVID experience
3.3 Beyond TRECVID: exploring structure information for music and video tagging
3.3.1 Advanced feature extraction and combination
3.3.2 Tagging model design
3.4 System benchmarking
4. Assistive multimedia tagging (45 mins)
4.1 Tagging with intelligent data organization & selection
4.2 Tag recommendation
4.3 Tag processing - refinement and information complementation
4.4 Personalization and applications to personal media information management
5. Summary (20 mins)
5.1 Future trends in multimedia tagging and its applications
5.2 Open discussion
Brief Biography
Jialie Shen: Dr. Shen is an Assistant Professor in Information Systems and Lee Foundation Fellow at the School of Information Systems, Singapore Management University (SMU), Singapore. Before moving to SMU, he worked for a few years as a faculty member at UNSW, Sydney, and as a researcher at the University of Glasgow. Dr. Shen's main research interests include information retrieval, economic-aware media analysis, and multimedia systems. His recent work has been published or is forthcoming in leading journals and international conferences including ACM SIGIR, ACM Multimedia, ACM SIGMOD, CVPR, ICDE, WWW, IEEE TCSVT, IEEE TMM, Multimedia Systems, ACM TOIT and ACM TOIS. He has also served, or is serving, as a session chair, PC member and reviewer for a large number of leading international conferences.
Meng Wang: Dr. Wang is currently a research staff member at the National University of Singapore. Previously he worked as an associate researcher at Microsoft Research Asia and as a research scientist at a start-up in the Bay Area. Dr. Wang's research interests include multimedia content analysis, tagging, search, and large-scale computing. He has authored about 80 technical papers in these areas. He is an associate editor of Information Sciences and of Neurocomputing, and a guest editor of special issues of Multimedia Systems Journal, Multimedia Tools and Applications, and Journal of Visual Communication and Image Representation. He received the Best Paper Award at the ACM International Conference on Multimedia in two consecutive years, 2009 and 2010, and the Best Paper Award at the International Multimedia Modeling Conference 2010.
Shuicheng Yan: Dr. Yan is currently an Assistant Professor in the Department of Electrical and Computer Engineering at the National University of Singapore, and the founding lead of the Learning and Vision Research Group (http://www.lv-nus.org). Dr. Yan's research areas include computer vision, multimedia and machine learning, and he has authored or co-authored over 200 technical papers across a wide range of research topics. He is an associate editor of IEEE Transactions on Circuits and Systems for Video Technology, and has served as a guest editor of special issues of TMM and CVIU. He received the Best Paper Awards from ACM MM'10, ICME'10 and ICIMCS'09, the winner prize of the classification task in PASCAL VOC'10, the honorable mention prize of the detection task in PASCAL VOC'10, and the 2010 TCSVT Best Associate Editor (BAE) Award. He is also a co-author of the papers that received the best student paper awards of PREMIA'09 and PREMIA'11.
Xian-Sheng Hua: Dr. Hua is now a Principal Research and Development Lead for Bing Multimedia Search at Microsoft. He is responsible for driving a team to design and deliver thought-leading media understanding and indexing features. Before joining Bing in 2011, Dr. Hua was a Lead Researcher with Microsoft Research Asia. During that time, his research interests were in the areas of multimedia search, advertising, understanding, and mining, as well as pattern recognition and machine learning. He has authored or co-authored more than 180 publications in these areas and has more than 60 filed patents or pending applications. Dr. Hua received the B.S. and Ph.D. degrees from Peking University, Beijing, China, in 1996 and 2001, respectively, both in applied mathematics. He serves as an Associate Editor of IEEE Transactions on Multimedia, an Associate Editor of ACM Transactions on Intelligent Systems and Technology, an Editorial Board Member of Advances in Multimedia and of Multimedia Tools and Applications, and an editor of Scholarpedia (Multimedia Category). Dr. Hua won the Best Paper Award and Best Demonstration Award at ACM Multimedia 2007, the Best Poster Award at the 2008 IEEE International Workshop on Multimedia Signal Processing, the Best Student Paper Award at the ACM Conference on Information and Knowledge Management 2009, and the Best Paper Award at the International Conference on MultiMedia Modeling 2010. He also won the 2008 MIT Technology Review TR35 Young Innovator Award for his outstanding contributions to video search.