MMVE '19: Proceedings of the 11th ACM Workshop on Immersive Mixed and Virtual Environment Systems


On the first JND and break in presence of 360-degree content: an exploratory study

  • Roberto G. de A. Azevedo
  • Neil Birkbeck
  • Ivan Janatra
  • Balu Adsumilli
  • Pascal Frossard

Unlike traditional planar 2D visual content, immersive 360-degree images and videos undergo particular processing steps and are intended to be consumed via head-mounted displays (HMDs). To gain a deeper understanding of how 360-degree visual distortions are perceived when consumed through HMDs, we perform an exploratory task-based subjective study in which we ask subjects to identify the first noticeable difference and break-in-presence points as specific compression artifacts are incrementally added. The results of our study give insights into the range of allowed visual distortions for 360-degree content, show that the added visual distortions are more tolerable in mono than in stereoscopic 3D, and identify issues with current 360-degree objective quality metrics.

Immersive mixed reality object interaction for collaborative context-aware mobile training and exploration

  • Jean Botev
  • Joe Mayer
  • Steffen Rothkugel

The recent generation of mobile devices brings the necessary processing power, sensors, and other hardware to enable highly interactive and immersive mixed reality experiences. However, existing applications mainly focus on single users or the individual experience, and are often limited to basic user-device interaction.

This paper discusses an immersive object interaction approach based on gestural input generated solely from the integrated camera of a mobile device. Developed within the context of the CollaTrEx framework for collaborative context-aware mobile training and exploration, it particularly allows for in-situ multiuser interaction with virtual objects. The presented approach is demonstrated in a set of prototypical game implementations for smartphones.

A quality of experience evaluation system and research challenges for networked virtual reality-based teleoperation applications

  • David Concannon
  • Ronan Flynn
  • Niall Murray

Teleoperation applications are designed to assist humans in operating complex mechanical systems. Interfaces to teleoperation systems have always been challenging. Recently, the potential of virtual reality (VR) has been a topic of interest, particularly with the availability of head-mounted displays and interaction controller devices. As a result, research has emerged into the viability of VR as a technology to support remote operation and improved human-machine interaction. It is assumed that VR will offer the user a more immersive and natural experience when operating a virtual representation of a mechanical system. To achieve this, a number of research challenges need to be addressed. In this short paper, we introduce and discuss key challenges for VR-based teleoperation systems. Since the key focus of our work is to understand user quality of experience (QoE) of VR-based teleoperation applications, the design and implementation of an implicit and explicit QoE evaluation system is also presented.

Summarizing E-sports matches and tournaments: the example of Counter-Strike: Global Offensive

  • Mathias Lux
  • Pål Halvorsen
  • Duc-Tien Dang-Nguyen
  • Håkon Stensland
  • Manoj Kesavulu
  • Martin Potthast
  • Michael Riegler

That video and computer games have reached the masses is a well-known fact. Furthermore, game streaming and watching other people play video games is another phenomenon that has far outgrown its small beginnings, and game streams, be they live or recorded, are today viewed by millions. E-sports is the result of organized leagues and tournaments in which players can compete in controlled environments and viewers can experience the matches, discuss and criticize, just like in physical sports. However, as in traditional sports, e-sports matches may be long and contain less interesting parts, introducing the challenge of producing well-directed summaries and highlights. In this paper, we describe our efforts to approach the game streaming and e-sports phenomena from a multimedia research point of view. We focus on the challenge of summarizing matches from a specific, relevant game, Counter-Strike: Global Offensive (CS:GO). We survey related work, describe the rules and structure of the game, and identify the main challenges for summarizing e-sports matches. With this contribution, we aim to foster multimedia research in the area of e-sports and game streaming.

Playing with delay: an interactive VR demonstration

  • Kjetil Raaen
  • Ragnhild Eg
  • Ivar Kjellmo

Virtual reality is now used across a range of applications, from entertainment to clinical purposes. Although the rendered visualisations have better temporal and spatial resolutions than ever, several technological constraints remain, and people still suffer side-effects. With this demonstration, we address the temporal constraints of virtual reality. We invite participants to play a game where they are in charge of the presented delay, giving them first-hand experience of the consequences of delay. Furthermore, the demonstration serves as a platform for future explorations into short- and long-term effects of virtual reality constraints.

Influence of primacy, recency and peak effects on the game experience questionnaire

  • Saeed Shafiee Sabet
  • Carsten Griwodz
  • Sebastian Möller

When a participant is asked to evaluate a stimulus, the judgment is based on the remembered experience, which might be different from the actual experience. This phenomenon is explained by the theory that some moments of an experience, such as the beginning, the peak, and the end, have more impact on memory. These moments can be recalled with a higher probability than the other parts of the experience, and some minor bad moments might be forgotten or forgiven because the rest of the experience was good. Using a subjective study that emulates an artificial delay in participants' gameplay, this paper investigates the influence of these serial-position effects on the Game Experience Questionnaire (GEQ). The results show that the GEQ does not suffer from recency, primacy, or peak effects. However, when users are asked about the controllability and responsiveness of the games, a recency effect exists. The paper also shows that the GEQ exhibits a forgiveness effect: participants forgive or may forget a bad experience if it coincides with a considerable duration of good experience.

Fusion confusion: exploring ambisonic spatial localisation for audio-visual immersion using the McGurk effect

  • Abubakr Siddig
  • Alessandro Ragano
  • Hamed Z. Jahromi
  • Andrew Hines

Virtual Reality (VR) is attracting the attention of application developers for purposes beyond entertainment, including serious games, health, education, and training. Including 3D audio enhances the overall VR quality of experience (QoE) through greater immersion. A better understanding of the perception of spatial audio localisation in audio-visual immersion is needed, especially in streaming applications where bandwidth is limited and compression is required. This paper explores the impact of audio-visual fusion on speech due to mismatches between a perceived talker location and the corresponding sound, using a phenomenon known as the McGurk effect and binaurally rendered Ambisonic spatial audio. The illusion of the McGurk effect happens when the sound of one syllable, paired with a video of a second syllable, gives the perception of a third syllable. For instance, the sound of /ba/ dubbed onto a video of /ga/ will lead to the illusion of hearing /da/. Several studies have investigated factors involved in the McGurk effect, but little has been done to understand the effect of audio spatialisation on this illusion. 3D spatial audio generated with Ambisonics has been shown to provide satisfactory QoE with respect to localisation of sound sources, which makes it suitable for VR applications but not for audio-visual talker scenarios. In order to test the perception of the McGurk effect at different directions of arrival (DOA) of sound, we rendered Ambisonic signals at azimuths of 0°, 30°, 60°, and 90° to both the left and right of the video source. The results show that audio-visual fusion significantly affects the perception of speech, yet the spatial audio does not significantly impact the illusion. This finding suggests that precise localisation of speech audio might not be as critical for speech intelligibility. It was found that a more significant factor was the intelligibility of the speech itself.
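
As a rough illustration of what placing a sound at a given azimuth means in Ambisonics, the sketch below encodes a single mono sample into first-order B-format for the azimuths listed in the abstract. Everything here is an assumption rather than the authors' pipeline: the abstract states neither the Ambisonic order nor the normalisation, so the traditional first-order convention (W scaled by 1/√2) is used, positive azimuth is taken to mean "to the left", and the function name `encode_first_order` is hypothetical. Binaural rendering, which the study also relies on, is not shown.

```python
import math


def encode_first_order(sample, azimuth_deg, elevation_deg=0.0):
    """Encode one mono sample into first-order B-format (W, X, Y, Z).

    Assumed convention (not taken from the paper): traditional B-format,
    W scaled by 1/sqrt(2), positive azimuth to the listener's left.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)               # omnidirectional component
    x = sample * math.cos(az) * math.cos(el)  # front-back component
    y = sample * math.sin(az) * math.cos(el)  # left-right component
    z = sample * math.sin(el)                 # up-down component
    return w, x, y, z


# Azimuths from the abstract: 0, 30, 60 and 90 degrees to the left and right.
for azimuth in (-90, -60, -30, 0, 30, 60, 90):
    w, x, y, z = encode_first_order(1.0, azimuth)
    print(f"{azimuth:+4d} deg  W={w:.3f}  X={x:.3f}  Y={y:.3f}  Z={z:.3f}")
```

Under this convention only the X and Y channels change with azimuth, so a source moved from 0° to 90° shifts its energy from the front-back channel to the left-right channel while W stays constant.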

Towards a distributed reality: a multi-video approach to xR

  • Alvaro Villegas
  • Pablo Pérez
  • Ester González-Sosa

In this article, we define the concept of Distributed Reality, a new form of immersive media based on the combination of captured realities (e.g., a remote and a local one), hence with a strong focus on video. While Virtual Reality fully substitutes the real surroundings with a simulation, and all existing variations of Mixed Reality just add synthetic elements to our visual field, our proposal is to capture and segment different scenes (one of them being the local one) in real time and merge the resulting elements dynamically, to create a completely new reality in front of the user's senses. We compare this concept with prior art in the field and describe a taxonomy in which it fits.

Field-of-view prediction in 360-degree videos with attention-based neural encoder-decoder networks

  • Jiang Yu
  • Yong Liu

In this paper, we propose attention-based neural encoder-decoder networks for predicting user Field-of-View (FoV) in 360-degree videos. Our proposed prediction methods are based on the attention mechanism, which learns the weighted prediction power of historical FoV time series through end-to-end training. Attention-based neural encoder-decoder networks do not involve recursion and thus can be highly parallelized during training. Using publicly available 360-degree head movement datasets, we demonstrate that our FoV prediction models outperform state-of-the-art FoV prediction models, achieving lower prediction error, higher training throughput, and faster convergence. Better FoV prediction leads to reduced bandwidth consumption, better video quality, and improved user quality of experience.
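
To make the idea of weighting historical FoV samples concrete, here is a toy sketch of dot-product attention over a short head-orientation history. It is not the authors' model: a real encoder-decoder learns its query, key, and value projections through training, whereas this sketch uses the raw viewing directions for all three. The function names and the `scale` parameter are hypothetical, chosen only for illustration.

```python
import math


def to_unit_vector(yaw_deg, pitch_deg):
    """Convert a viewing direction (yaw, pitch in degrees) to a 3D unit vector."""
    yaw, pitch = math.radians(yaw_deg), math.radians(pitch_deg)
    return (math.cos(pitch) * math.cos(yaw),
            math.cos(pitch) * math.sin(yaw),
            math.sin(pitch))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(history, query, scale=1.0):
    """Weight past viewing directions by their similarity to `query`.

    Returns the attention weights and the re-normalised weighted mean
    direction. No learned parameters are involved (assumption: toy setting).
    """
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in history]
    weights = softmax(scores)
    mixed = [sum(w * v[i] for w, v in zip(weights, history)) for i in range(3)]
    norm = math.sqrt(sum(c * c for c in mixed))
    if norm == 0.0:  # directions cancelled out; fall back to the query
        return weights, tuple(query)
    return weights, tuple(c / norm for c in mixed)


# Hypothetical history: the viewer pans from 0 to 30 degrees of yaw.
history = [to_unit_vector(yaw, 0.0) for yaw in (0, 10, 20, 30)]
weights, direction = attend(history, query=history[-1], scale=50.0)
print([round(w, 3) for w in weights], direction)
```

Because every history sample is scored independently of the others, the scores can be computed in parallel, which is the property the abstract contrasts with recurrent models.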