ACM SIGMM Rising Star Award 2019

The 2019 winner of the prestigious ACM Special Interest Group on Multimedia (SIGMM) Rising Star Award is Dr. Ting Yao. The award is given in recognition of his significant contributions to activity recognition and video captioning.
Dr. Ting Yao is currently a Principal Researcher in the Vision and Multimedia Lab at JD AI Research, Beijing, China. His team focuses on research and innovation in large-scale multimedia search, video understanding, vision and language, and deep learning. Prior to joining JD AI Research in 2018, he was a Researcher with Microsoft Research Asia in Beijing, China. Dr. Yao is an active participant in several benchmark evaluations. He is the principal designer of the top-performing multimedia analytic systems in worldwide competitions such as COCO Image Captioning, the Visual Domain Adaptation Challenge 2017 and 2018, and the ActivityNet Large Scale Activity Recognition Challenge 2016 through 2019. He completed a Ph.D. in computer science (2014) at City University of Hong Kong, advised by Prof. Chong-Wah Ngo.
Dr. Yao is known for his Ph.D. work on context mining for effective search of multimedia content. Without the right context for leveraging data sensibly, mixing data of different natures can lead to unpredictable performance. His Ph.D. thesis is innovative in that it examines the problem of context understanding and considers multimedia search from three perspectives: 1) Self: the principled way of performing multi-modality fusion in the context of search re-ranking; 2) External: when and how external knowledge can be transferred to a problem in a different domain; 3) Crowdsourcing: exploiting user behavior as a “cheap” way of bridging the semantic gap. The thesis contributed principled ways of mixing multimodal data and won the prestigious ACM SIGMM Outstanding PhD Thesis Award in 2015.
Ting has been an active leader in exploring deep neural network design for video content recognition. He proposed the ground-breaking idea of “multi-granularity” in his ACM ICMR 2016 paper, which models a video as a hierarchical structure comprising a single frame, consecutive frames (motion), a short clip, and the entire video. The work significantly advanced the performance of video action recognition and was a Best Paper Finalist at ICMR 2016. More notably, at ICCV 2017 he devised a new family of bottleneck building blocks that simulate 3D convolutions with 2D spatial convolutions plus 1D temporal convolutions, and developed the 199-layer Pseudo-3D Residual Net (P3D ResNet), the deepest such network to date, for spatio-temporal representation learning. His team participated in the ActivityNet Challenge in 2018 and 2019 and demonstrated the top performance of P3D ResNet.
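To give a rough sense of why replacing a 3D convolution with a 2D spatial convolution plus a 1D temporal convolution is attractive, the sketch below compares weight counts for the two designs. It is an illustration only, not the P3D ResNet implementation: the function names, the serial arrangement of the two convolutions, the equal channel widths, and the omission of bias terms are all assumptions made for this example.

```python
def conv3d_weights(c_in, c_out, k_t, k_s):
    """Weights in a full 3D convolution with a k_t x k_s x k_s kernel (no bias)."""
    return c_in * c_out * k_t * k_s * k_s


def pseudo3d_weights(c_in, c_out, k_t, k_s):
    """Weights when a 1 x k_s x k_s spatial convolution is followed by a
    k_t x 1 x 1 temporal convolution (assumed serial arrangement, no bias)."""
    spatial = c_in * c_out * k_s * k_s   # 2D filters applied within each frame
    temporal = c_out * c_out * k_t       # 1D filters applied across frames
    return spatial + temporal


if __name__ == "__main__":
    channels = 64  # hypothetical channel width, chosen only for illustration
    full = conv3d_weights(channels, channels, k_t=3, k_s=3)
    factored = pseudo3d_weights(channels, channels, k_t=3, k_s=3)
    print(f"full 3D: {full} weights")
    print(f"2D + 1D: {factored} weights ({factored / full:.0%} of full 3D)")
```

Under these assumptions the factorized pair uses fewer than half the weights of the full 3x3x3 kernel while covering the same spatio-temporal extent, which is one plausible reason such blocks make very deep video networks easier to build.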
Ting was one of the first scholars to use deep learning for image and video captioning. He led the research effort and produced a series of world-class innovations in this area. For example, he was the first to model both coherence and relevance in video description generation, and he developed the best-performing video captioning system in the ActivityNet Large Scale Activity Recognition Challenge 2017. His pioneering work also includes mining and leveraging high-level attributes and visual relationships to boost encoder-decoder captioning models. More remarkably, he constructed and released the large-scale video-to-text dataset MSR-VTT to support research in this area. The dataset has been widely used and has already been downloaded by more than 100 groups worldwide. The impact of these efforts cannot be overstated, as clearly evidenced by the large number of citations to his work.
Dr. Yao is also a dedicated volunteer and holds lead positions in various conferences and journals. For example, he has been an area chair for ACM MM 2018, ICME 2018 and 2019, and ICIP 2019, and a senior program committee (SPC) member for IJCAI 2019 and AAAI 2020. He has served as an Associate Editor for Multimedia Systems and a Guest Editor for ACM TOMM. He was the lead organizer of the MSR Video to Language Challenge at ACM Multimedia 2016 and 2017, and a co-organizer of the AI Technology for Visual Fashion Computing workshop at ICME 2019 and the Conceptual Captions Challenge at CVPR 2019.
ACM is the professional society of computer scientists, and SIGMM is the special interest group on multimedia.