Keynotes
Keynote 1: Cortically-Coupled Computing for Media Retrieval
by Paul Sajda, Professor of Biomedical Engineering, Columbia University
Keynote 2: Aggregating Local Image
Descriptors for Large-scale Image Retrieval and Classification
by Cordelia Schmid, INRIA Research Director, Head of the LEAR
project-team, Grenoble, France
Keynote 3: The Road to
Pervasive Multimedia Search and Multimodal Interaction
by Hsiao-Wuen Hon, Managing Director, Microsoft Research Asia,
Beijing, China
Paul Sajda, Dept. of Biomedical Engineering, Columbia University, New York, U.S.A.
Abstract: Our visual systems are amazingly complex multimedia information processing machines. Using our brain's visual system we can recognize objects at a glance, under varying pose, illumination, and scale, and are able to rapidly learn and recognize new configurations of objects and exploit relevant context even in highly cluttered scenes. However, our brains are subject to fatigue and have difficulty finding patterns in high-dimensional feature spaces that are often useful representations for multimedia data. In this talk I will describe our work in developing a synergistic integration of human visual processing and computer vision via a novel brain-computer interface (BCI). Our approach, which we term cortically-coupled computer vision (C3Vision), uses non-invasively measured neural signatures from the electroencephalogram (EEG) that are indicative of user intent, interest, and high-level, subjective, and rapid reactions to visual and multimedia data. I will describe several system designs for C3Vision and current applications that are being developed for government and commercial use.
Biography: Paul Sajda is Professor of Biomedical Engineering and Radiology at Columbia University and Director of the Laboratory for Intelligent Imaging and Neural Computing (LIINC). His research focuses on neural engineering, neuroimaging, computational neural modeling and machine learning applied to image understanding. Prior to Columbia he was Head of The Adaptive Image and Signal Processing Group at the David Sarnoff Research Center in Princeton, NJ. He received his B.S. in Electrical Engineering from MIT and his M.S. and Ph.D. in Bioengineering from the University of Pennsylvania. He is a recipient of the NSF CAREER Award and the Sarnoff Technical Achievement Award, and is a Fellow of the IEEE and the American Institute for Medical and Biological Engineering (AIMBE). He is also the Editor-in-Chief of the IEEE Transactions on Neural Systems and Rehabilitation Engineering and a member of the IEEE Technical Committee on Neuroengineering. He has been involved in several technology start-ups and is a co-founder and Chairman of the Board of Neuromatters, LLC, a neurotechnology research and development company.
Cordelia Schmid, INRIA Research Director, Head of the LEAR project-team, Grenoble, France
Abstract: We address the problems of large-scale image retrieval and classification. In both cases an appropriate image representation is important. We present and evaluate different ways of aggregating local image descriptors into a vector and show that the Fisher kernel achieves better performance than the reference bag-of-visual-words approach for any given vector dimension.
Biography: Cordelia Schmid holds an M.S. degree in Computer
Science from the University of Karlsruhe and a Doctorate from the
Institut National Polytechnique de Grenoble (INPG). Her doctoral
thesis received the best thesis award from INPG in 1996. Dr. Schmid
was a post-doctoral research assistant in the Robotics Research
Group of Oxford University in 1996-1997. Since 1997 she has held a
permanent research position at INRIA Grenoble Rhone-Alpes, where she
is a research director and directs the INRIA team called LEAR for
LEArning and Recognition in Vision. Dr. Schmid is the author of over
a hundred technical publications. She has been an Associate Editor
for the IEEE Transactions on Pattern Analysis and Machine
Intelligence (2001-2005) and for the International Journal of
Computer Vision (since 2004), and she has been program chair of the
2005 IEEE Conference on Computer Vision and Pattern Recognition and
of the 2012 European Conference on Computer Vision. In 2006, she was
awarded the Longuet-Higgins prize for fundamental contributions in
computer vision that have withstood the test of time. She is a
Fellow of the IEEE.
Hsiao-Wuen Hon, Managing Director, Microsoft Research Asia, Beijing, China
Title: The Road to Pervasive Multimedia Search and Multimodal
Interaction
Abstract: Thanks to the tremendous progress in multimedia and natural
user interface (voice, vision, touch, pen, etc.) technologies, we
are entering a new era of pervasive multimedia and multimodal
experience. This experience, realized by a diverse array of devices,
including mobile phones, PCs, and TVs, plus persistent cloud
services, will have a revolutionary impact on how people consume
content and information. Many of our long-awaited scenarios in
artificial intelligence and "information at your fingertips" will be
fulfilled. Technologies that enable the general public to accumulate
and disseminate human knowledge in multimedia form through natural
multimodal user interfaces will be critical to improving life and
well-being throughout the world. In this talk, I will use some
recent technological advances in related areas to illustrate the
excitement and opportunities in front of us. At the same time, most
upcoming technical challenges call for multi-disciplinary innovation
beyond the current makeup. Thus, I will discuss how the community
can fully exploit multi-disciplinary innovation to lead this
mission.
Biography: Hsiao-Wuen Hon is the Managing Director of
Microsoft Research Asia, located in Beijing, China. Founded in
1998, Microsoft Research Asia has since become one of the best
research centers in the world; MIT Technology Review called it “the
hottest computer science research lab in the world.” Dr. Hon
oversees the lab’s research activities and collaborations with
academia in Asia Pacific.
An IEEE Fellow and a Distinguished Scientist of Microsoft, Dr. Hon
is an internationally recognized expert in speech technology. He
serves on the editorial board of Communications of the ACM. Dr. Hon
has published more than 100 technical papers in international
journals and at conferences. He co-authored a book, Spoken Language
Processing, which is used as a graduate-level textbook and reference
book on speech technology at many universities around the world.
Dr. Hon holds three dozen patents in several technical areas.
Dr. Hon has been with Microsoft since 1995. He joined Microsoft
Research Asia in 2004 as a Deputy Managing Director, responsible for
research in Internet search, speech and natural language, systems,
wireless, and networking. In addition, he founded and managed the
Search Technology Center (STC) from 2005 to 2007, leading
development of the Microsoft Internet search product (Bing) in Asia
Pacific.
Prior to joining Microsoft Research Asia, Dr. Hon was a founding
member and architect of the Natural Interactive Services Division at
Microsoft Corporation. Besides overseeing all architectural and
technical aspects of the award-winning Microsoft® Speech Server
product (Frost & Sullivan's 2005 Enterprise Infrastructure Product
of the Year Award, Speech Technology Magazine’s 2004 Most Innovative
Solutions Award, and VSLive! 2004 Editors Choice Award), the Natural
User Interface Platform, and the Microsoft Assistance Platform, he
was also responsible for managing and delivering statistical
learning technologies and advanced search. Dr. Hon joined Microsoft
Research as a senior researcher in 1995 and has been a key
contributor to Microsoft's SAPI and speech engine technologies. He
previously worked at Apple Computer, where he led research and
development for Apple's Chinese Dictation Kit.
Dr. Hon received a Ph.D. in Computer Science from Carnegie Mellon
University and a B.S. in Electrical Engineering from National Taiwan
University.
ACM International Conference on Multimedia Retrieval, Jun. 5 - 8, 2012