MMSports '20: Proceedings of the 3rd International Workshop on Multimedia Content Analysis in Sports

SESSION: Session 1: Soccer

Session details: Session 1: Soccer

  • Rainer Lienhart

SoccerDB: A Large-Scale Database for Comprehensive Video Understanding

  • Yudong Jiang
  • Kaixu Cui
  • Leilei Chen
  • Canjin Wang
  • Changliang Xu

Soccer videos can serve as a perfect research object for video understanding because
soccer games are played under well-defined rules while complex and intriguing enough
for researchers to study. In this paper, we propose a new soccer video database named
SoccerDB, comprising 171,191 video segments from 346 high-quality soccer games. The
database contains 702,096 bounding boxes, 37,709 essential event labels with time
boundary, and 17,115 highlight annotations for object detection, action recognition,
temporal action localization, and highlight detection tasks. To our knowledge, it
is the largest database for comprehensive sports video understanding on various aspects.
We further survey a collection of strong baselines on SoccerDB, which have demonstrated
state-of-the-art performances on independent tasks. Our evaluation suggests that we
can benefit significantly when jointly considering the inner correlations among those
tasks. We believe the release of SoccerDB will tremendously advance research around
comprehensive video understanding. Our dataset and code published on

Self-Supervised Small Soccer Player Detection and Tracking

  • Samuel Hurault
  • Coloma Ballester
  • Gloria Haro

In a soccer game, the information provided by detecting and tracking brings crucial
clues to further analyze and understand some tactical aspects of the game, including
individual and team actions. State-of-the-art tracking algorithms achieve impressive
results in scenarios they have been trained on, but they fail in challenging
ones such as soccer games. This is frequently due to the players' small relative size
and the similar appearance among players of the same team. Although a straightforward
solution would be to retrain these models by using a more specific dataset, the lack
of such publicly available annotated datasets entails searching for other effective
solutions. In this work, we propose a self-supervised pipeline which is able to detect
and track low-resolution soccer players under different recording conditions without
any need of ground-truth data. Extensive quantitative and qualitative experimental
results are presented evaluating its performance. We also present a comparison to
several state-of-the-art methods showing that both the proposed detector and the proposed
tracker achieve top-tier results, in particular in the presence of small players.
Code available at "".

A Dataset & Methodology for Computer Vision based Offside Detection in Soccer

  • Neeraj Panse
  • Ameya Mahabaleshwarkar

Offside decisions are an integral part of every soccer game. In recent times, decision-making
in soccer games, including offside decisions, has been heavily influenced by technology.
However, in spite of the use of a Video Assistant Referee (VAR), offside decisions
remain plagued with inconsistencies. The two major points of criticism for the
VAR have been extensive delays in providing final decisions and inaccurate decisions
arising from human errors. The visual nature of offside decision-making makes Computer
Vision techniques a viable option for tackling these issues, by automating appropriate
aspects of the process. However, the lack of a computational algorithm that captures
all aspects of the offside rule, lack of an established methodology to computationally
represent soccer match scenes in a way that can be utilized by such an algorithm,
and the absence of a diverse, comprehensive dataset for testing these methods have
stood in the way of research efforts for this problem. This paper precisely addresses
each one of these obstacles, in an effort to facilitate further research in this area.
The paper presents a computational offside decision algorithm for soccer match images.
The methodology for creating a quantitative representation of soccer match images
for this offside algorithm has also been presented as a pipeline of Computer Vision
tasks. A novel dataset for evaluating this methodology has been presented, which contains
a curated selection of soccer match scenes that represent the various challenges that
can be faced by a system that aims to aid or automate the task of making offside decisions.
Finally, this paper also details the performance of a specific set of Computer Vision
tasks used in the presented pipeline, on the given dataset. The proposed system achieves
an F1 score of 0.85 on the dataset. The drawbacks and areas for improvement of these
methods have also been discussed in an attempt to focus future research on this task.
The presented dataset and pipeline implementation code is available at:
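
To make the idea of a computational offside decision concrete, the following is a minimal sketch, not the authors' algorithm. It assumes the Computer Vision pipeline has already reduced each player and the ball to a single pitch coordinate along the goal-to-goal axis, increasing toward the defending team's goal line. The function name, parameters, and one-coordinate-per-player simplification are all hypothetical; it only checks the offside *position* condition (nearer to the goal line than both the ball and the second-last opponent, in the opponents' half), not involvement in play.

```python
from typing import Sequence


def in_offside_position(attacker_x: float,
                        ball_x: float,
                        defender_xs: Sequence[float],
                        halfway_x: float = 0.0) -> bool:
    """Hypothetical offside-position check on 1D pitch coordinates.

    x grows toward the defending team's goal line. defender_xs must
    include every defending player, goalkeeper included.
    """
    if len(defender_xs) < 2:
        raise ValueError("need at least two defending players")
    # The second-last opponent is the defender with the second-largest x.
    second_last_x = sorted(defender_xs, reverse=True)[1]
    # Being level with the ball or the second-last opponent is not offside,
    # hence the strict comparisons.
    return (attacker_x > halfway_x
            and attacker_x > ball_x
            and attacker_x > second_last_x)
```

For example, with defenders at 50, 35, and 20 and the ball at 30, an attacker at 40 is in an offside position, while an attacker level with the second-last defender at 35 is not.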

SESSION: Session 2: Novel MM Analysis Approaches in Sports

Session details: Session 2: Novel MM Analysis Approaches in Sports

  • Moritz Einfalt

Asking Graphs "How Did I Play?" Generating Graphs through Images Via Signals

  • Sai Siddartha Maram
  • Arjav Jain

Cricket is a game that requires players to constantly adapt to situations and customize
their game depending on opponents and playing conditions. Players and coaching staff
often watch video clips to understand the strategies of opponents. Iterating through
multiple matches over many years across various leagues and formats, and extracting
clips is a tiring process. In this paper, we propose a computer vision framework to
segment cricket matches into clips based on context and construct real-time graphs
using meta-data from segmented clips. We discuss various queries on the generated
graphs and also evaluate our segmentation and querying model based on the accuracy
and quality of the retrieved data.
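
As an illustration of building a queryable graph from clip meta-data, here is a minimal sketch. It is not the authors' framework: the meta-data fields (`clip_id`, `batsman`, `bowler`, `outcome`) and both function names are assumptions chosen for the example. Players are nodes, and each batsman-to-bowler edge carries the clips in which they faced each other.

```python
from collections import defaultdict


def build_graph(clips):
    """Build a batsman -> bowler -> [clips] adjacency structure.

    Each clip is a dict with the hypothetical keys
    'clip_id', 'batsman', 'bowler', and 'outcome'.
    """
    graph = defaultdict(lambda: defaultdict(list))
    for clip in clips:
        graph[clip["batsman"]][clip["bowler"]].append(clip)
    return graph


def query(graph, batsman, bowler, outcome=None):
    """Return clip ids on the batsman-bowler edge, optionally filtered by outcome."""
    edge_clips = graph.get(batsman, {}).get(bowler, [])
    return [c["clip_id"] for c in edge_clips
            if outcome is None or c["outcome"] == outcome]


clips = [
    {"clip_id": 1, "batsman": "A", "bowler": "X", "outcome": "four"},
    {"clip_id": 2, "batsman": "A", "bowler": "X", "outcome": "wicket"},
    {"clip_id": 3, "batsman": "B", "bowler": "X", "outcome": "four"},
]
graph = build_graph(clips)
wickets = query(graph, "A", "X", outcome="wicket")
```

A query such as "how did batsman A get out against bowler X?" then becomes a lookup on one edge followed by a filter on the outcome label.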

HFNet: A Novel Model for Human Focused Sports Action Recognition

  • Lianyu Hu
  • Lin Feng
  • Shenglan Liu

Action recognition has attracted much attention recently and progressed remarkably.
However, as a special kind of action, sports action recognition is more difficult
and deserves more attention. Our goal in this paper is to distinguish fine-grained
human-focused sport actions. Sport actions can always be decomposed into sub-actions
by body parts and it's necessary to establish the relationships among body parts and
combine them together to perform classification. Besides, sport actions are usually
fine-grained, and their subclasses are similar and hard to distinguish. Another tough
problem in practice is to locate the actor in complicated circumstances. However,
current methods in action recognition always pay attention to the whole image, thus
failing to capture details and construct relationships in images. In this paper,
we propose a novel model to construct visual relationships in images through graph
convolutions. We make use of patches cropped around body joints as input for graph
nodes. Thus our model is able to pay attention to the changes and details of body
parts. Then, we carefully design the model to learn connections among graph nodes adaptively
and empirically. We also provide another method to construct visual relationships
for graph nodes. By specially focusing on relationships and details, our model achieves
state-of-the-art performance on complex human-focused sports datasets FSD-10 and Diving48.
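
For readers unfamiliar with graph convolutions over body joints, the following is a minimal sketch of one generic layer, not HFNet itself. It assumes each graph node already has a feature vector (for instance, an embedding of the patch cropped around a joint) and uses a fixed adjacency matrix with mean aggregation, whereas the abstract describes learning the connections adaptively. The function name and the toy three-joint graph are hypothetical.

```python
def graph_conv(features, adjacency, weights):
    """One mean-aggregation graph convolution layer followed by ReLU.

    features:  n x d list of node feature vectors
    adjacency: n x n list of 0/1 entries (self-loops are added here)
    weights:   d x m list, the layer's linear transform
    """
    n, d, m = len(features), len(features[0]), len(weights[0])
    output = []
    for i in range(n):
        # Each node aggregates over itself and its neighbours.
        neighbours = [j for j in range(n) if j == i or adjacency[i][j]]
        mean = [sum(features[j][k] for j in neighbours) / len(neighbours)
                for k in range(d)]
        # Linear transform of the aggregated vector, then ReLU.
        output.append([max(0.0, sum(mean[k] * weights[k][c] for k in range(d)))
                       for c in range(m)])
    return output


# Toy chain of three joints: 0 - 1 - 2, with identity weights.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
out = graph_conv(features, adjacency, identity)
```

After one layer, each joint's representation mixes in information from connected joints, which is the mechanism that lets a model relate the details of different body parts.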

High-Level Tactical Performance Analysis with SportSense

  • Philipp Seidenschwarz
  • Martin Rumo
  • Lukas Probst
  • Heiko Schuldt

Team sports like football have become an important economic factor. As a result, the
pressure on coaches to succeed is increasing and, as a consequence, so are the expectations
of the performance analysts who support the coaches in their work. Until now, performance
analysis has been a mostly manual and time-consuming activity, mainly consisting of
video analysis. Only since the advent of new analysis tools have these analysts experienced
support for their work. However, most existing tools are limited to simple
analyses and do not support complex tactical patterns. In this paper, we present how
the existing SportSense system has been extended by support for dedicated tactical
patterns, especially phases, continuous states, and profiles. SportSense has already
been a powerful tool to assist performance analysts in running quantitative and qualitative
analyses. The tactical patterns that have been added support analysts even better
in their complex tasks, as shown in the user studies we have conducted.

Video and Sensor-Based Rope Pulling Detection in Sport Climbing

  • Iustina Ivanova
  • Marina Andrić
  • Sadaf Moaveninejad
  • Andrea Janes
  • Francesco Ricci

Sport climbing is becoming an increasingly popular competitive sport as well as a
recreational activity. For this reason, indoor sport climbing operators are constantly
trying to improve their services and optimally use their infrastructure. One way to
support such a task is to track the climbing activities performed by visitors while
climbing. This paper considers a scenario in which a sensor is attached to a piece
of climbing equipment that connects the climbing rope to the bolt anchors (quickdraws)
and a camera is overlooking a climbing wall. Within this scenario, this paper explores
two approaches to detect when a climber finishes a climb and pulls the rope from the
wall: 1) a hybrid approach in which sensors and cameras are used and 2) a video-based
approach where only cameras are used. The evaluation resulted in recognition precision
of 91% for the hybrid and 76% for the video-based approach, respectively. This paper
also discusses advantages and disadvantages of the analysed approaches and points out
future research directions to allow the automatic tracking of climbing activities.