IH&MMSec '20: Proceedings of the 2020 ACM Workshop on Information Hiding and Multimedia Security


SESSION: Keynote & Invited Talks

Exploiting Micro-Signals for Physiological Forensics

  • Min Wu

A variety of nearly invisible "micro-signals" have played important roles in media security and forensics. These noise-like micro-signals are ubiquitous and typically an order of magnitude lower in strength or scale than the dominant ones. They are traditionally removed or ignored as nuisances outside the forensic domain. This keynote talk discusses recent research that harnesses micro-signals to infer a person's physiological conditions. One type of such signal is the subtle change in facial skin color in accordance with the heartbeat. Video analysis of this repeating change provides a contact-free way to capture a photoplethysmogram (PPG). While heart rate can be tracked from videos of resting cases, it is challenging to do so for cases involving substantial motion, such as when a person is walking around, running on a treadmill, or driving on a bumpy road. This talk will show how expertise with micro-signals from media forensics has enabled the exploration of new opportunities in physiological forensics and a broad range of applications.
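As a rough illustration of the resting case described above, the sketch below estimates a heart rate from a per-frame skin-color trace by picking the strongest spectral peak in a plausible heart-rate band. The function name, the band limits, and the synthetic trace are assumptions made for this example; it is not the speaker's method and does nothing to handle the motion cases the talk focuses on.

```python
import cmath
import math


def estimate_heart_rate_bpm(trace, fps, lo_bpm=40.0, hi_bpm=240.0):
    """Return the dominant frequency of `trace`, in beats per minute.

    `trace` is a hypothetical per-frame average of skin color (for example
    the mean green value of a face region); `fps` is the video frame rate.
    """
    n = len(trace)
    mean = sum(trace) / n
    centered = [x - mean for x in trace]  # remove the DC component
    best_bpm, best_power = None, -1.0
    for k in range(1, n // 2 + 1):
        bpm = 60.0 * k * fps / n  # frequency of DFT bin k, in beats per minute
        if not lo_bpm <= bpm <= hi_bpm:
            continue
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm


# Synthetic 10-second trace at 30 fps: a weak 1.2 Hz pulse (72 bpm)
# riding on a large constant level and a slow illumination drift.
FPS = 30.0
N_FRAMES = 300
synthetic_trace = [
    100.0 + 0.5 * math.sin(2 * math.pi * 1.2 * t / FPS) + 1.0 * t / N_FRAMES
    for t in range(N_FRAMES)
]
estimated_bpm = estimate_heart_rate_bpm(synthetic_trace, FPS)
```

The pulse here is two orders of magnitude weaker than the constant level, which is the sense in which such signals are "micro"; restricting the search to a heart-rate band is what keeps the slow drift from dominating.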

Game-Theoretic Perspectives and Algorithms for Cybersecurity

  • Christopher Kiekintveld

Information plays a key role in many games, and game theory includes reasoning about how agents should perceive signals, and how they should strategically decide what signals to send. This can involve complex tradeoffs about how revealing certain information will affect the beliefs and actions of other players. I will give an overview of some basic approaches for modeling information in game theory, such as signaling games, and applications to games such as Poker. The second part of the talk will focus on our work applying game-theoretic models and algorithms in cybersecurity. I will discuss how we apply game theory to optimize strategies for deception in cybersecurity, including honeypots, honey traffic, and other deceptive objects. I will also cover work that considers dynamic deception using sequential models that capture uncertainty. Finally, I will discuss some recent work in adversarial learning and connections between this area and game theory.

SESSION: Session 1: Steganalysis

Linguistic Steganalysis via Densely Connected LSTM with Feature Pyramid

  • Hao Yang
  • YongJian Bao
  • Zhongliang Yang
  • Sheng Liu
  • Yongfeng Huang
  • Saimei Jiao

With the growing attention on multimedia security and the rapid development of natural language processing technologies, linguistic steganographic algorithms based on automatic text generation have increasingly been proposed, which brings great challenges to maintaining the security of cyberspace. The prevailing linguistic steganalysis methods based on neural networks only use feature vectors from the last layer of the neural network, which may be insufficient for neural linguistic steganalysis. In this paper, we propose a neural linguistic steganalysis scheme based on densely connected long short-term memory (LSTM) networks with feature pyramids, which can incorporate more low-level features to detect generative text steganographic algorithms. In the proposed framework, words in the text are first mapped into a semantic space with a hidden representation for better exploitation of the semantic features. Then, stacked bidirectional LSTM networks are utilized to extract different levels of semantic features. In order to incorporate more low-level features from neural networks, we introduce two components, dense connections and feature pyramids, to enhance the low-level features in the feature vectors. Finally, the semantic features from all levels are fused and we use a sigmoid layer to categorize the input text as cover or stego. Experiments show that the proposed scheme achieves state-of-the-art results in detecting recently proposed linguistic steganographic algorithms.

Deep Audio Steganalysis in Time Domain

  • Daewon Lee
  • Tae-Woo Oh
  • Kibom Kim

Digital audio, like images, is one of the most popular media for information hiding. However, even the state-of-the-art deep learning model still has limitations in detecting basic LSB steganography algorithms that hide secret messages in the time domain of WAV audio. To advance deep-learning-based audio steganalysis in the time domain of lossless audio formats, we have developed a convolutional neural network that incorporates bit-plane separation, weight-standardized convolution, and channel attention. Training through payload curriculum learning and testing on six steganography methods demonstrated that our proposed model is superior to the other two deep learning models, achieving state-of-the-art performance. We expect our approach will provide insights to create a breakthrough for deep audio steganalysis.
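To make the two time-domain notions in this abstract concrete, the following sketch shows basic LSB replacement on 16-bit PCM samples and a simple bit-plane separation of the same samples. All function names are hypothetical, and the bit-plane layout (least significant plane first, two's-complement view of negative samples) is an assumption for illustration; the abstract does not specify how the network's preprocessing is arranged.

```python
def embed_lsb(samples, bits):
    """Overwrite the least significant bit of the first len(bits) samples."""
    if len(bits) > len(samples):
        raise ValueError("message is longer than the cover signal")
    stego = list(samples)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # clear the LSB, then set it to `bit`
    return stego


def extract_lsb(samples, n_bits):
    """Read the message back from the least significant bits."""
    return [s & 1 for s in samples[:n_bits]]


def bit_planes(samples, depth=16):
    """Split signed PCM samples into `depth` binary planes, LSB plane first."""
    mask = (1 << depth) - 1
    unsigned = [s & mask for s in samples]  # two's-complement view
    return [[(u >> b) & 1 for u in unsigned] for b in range(depth)]


cover = [1000, -3, 7, 0, -32768, 32767]
message = [1, 0, 0, 1]
stego = embed_lsb(cover, message)
planes = bit_planes(stego)
```

Each sample changes by at most one quantization level, so the embedded signal lives almost entirely in `planes[0]`; separating bit planes gives a detector direct access to that plane instead of leaving it buried under the high-amplitude audio content.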

Reinforcement Learning Aided Network Architecture Generation for JPEG Image Steganalysis

  • Jianhua Yang
  • Beiling Lu
  • Liang Xiao
  • Xiangui Kang
  • Yun-Qing Shi

The architectures of convolutional neural networks used in steganalysis have been designed heuristically. In this paper, an automatic Network Architecture Generation algorithm based on reinforcement learning for JPEG image Steganalysis (JS-NAG) is proposed. Unlike automatic neural network generation methods in computer vision, which rely on strong content signals, steganalysis relies on weak embedded signals and thus needs a specific design. In the proposed method, the agent is trained to sequentially select high-performing blocks using Q-learning to generate networks. An early stop strategy and a well-designed performance prediction function are utilized to reduce the search time. To generate the optimal networks, hundreds of networks were searched and trained on 3 GPUs for 15 days. To further improve the detection accuracy, we build an ensemble classifier out of the generated convolutional neural networks. The experimental results show that the proposed method significantly outperforms the current state-of-the-art CNN-based methods.

Feature Aggregation Networks for Image Steganalysis

  • Haneol Jang
  • Tae-Woo Oh
  • Kibom Kim

Since convolutional neural networks have shown remarkable performance on various computer vision tasks, many network architectures for image steganalysis have been introduced. Many of them use fixed preprocessing filters for stable learning, which has the disadvantage of making limited use of the information in the input image. The recently introduced end-to-end learning method uses a structure that limits the number of channels of feature maps close to the input and stacks residual blocks. This method has limitations in generating feature maps of various levels and resolutions that can be effective for steganalysis. We therefore propose feature aggregation-based steganalysis networks, which expand the number of channels of convolutional blocks close to the input data, aggregate feature maps of various levels and resolutions, and utilize rich information to improve steganalysis performance. In addition, a capped activation function is applied to obtain better generalization performance. The proposed method outperforms state-of-the-art steganalysis in detecting the advanced steganography algorithms J-UNIWARD and UED, for JPEG quality factors 75 and 95.

Pixels-off: Data-augmentation Complementary Solution for Deep-learning Steganalysis

  • Mehdi Yedroudj
  • Marc Chaumont
  • Frederic Comby
  • Ahmed Oulad Amara
  • Patrick Bas

Since 2015, CNN-based steganalysis approaches have started replacing the two-step machine-learning-based steganalysis approaches (feature extraction and classification), mainly because they offer better performance.

In many instances, the performance of these networks depends on the size of the learning database. Up to a certain point, the larger the database, the better the results. However, working with a large database with controlled acquisition conditions is usually rare or unrealistic in an operational context. An easy and efficient approach is thus to augment the database, in order to increase its size, and therefore to improve the efficiency of the steganalysis process.

In this article, we propose a new way to enrich a database in order to improve the CNN-based steganalysis performance. We have named our technique "pixels-off". This approach is efficient, generic, and is usable in conjunction with other data-enrichment approaches. Additionally, it can be used to build an informed database that we have named "Side-Channel-Aware databases" (SCA-databases).

SESSION: Session 2: Privacy and Security

Protecting Smartphone Screen Notification Privacy by Verifying the Gripping Hand

  • Chen Wang
  • Jingjing Mu
  • Long Huang

As the most common personal devices, smartphones contain the user's private information. While people use mobile devices anytime and anywhere, the sensitive contents might be leaked from the screens. The smartphone notifications cause such privacy leakages even on a lock screen. With the aim to alert the user of an event (e.g., text messages, phone calls and calendar reminders), these onscreen notifications usually contain the sender's name and even a clip of the contents for preview. Such information, if not displayed appropriately, may cause the leakages of the user's social relations, personal hobbies and private message contents. This work focuses on wisely displaying the notifications to avoid leaking the user's privacy. We develop an unobtrusive user authentication system to confirm the user identity via their gripping-hands before displaying notifications. In particular, we carefully design an inaudible acoustic signal and emit it from the smartphone speaker to sense the gripping hand, when there is a need to push notifications. The signal propagating to the smartphone's microphones carries the user's biometric information related to the gripping hand (e.g., palm size and gripping strength). We further derive the Mel Frequency Cepstral Coefficient time series and develop a machine learning-based algorithm to identify the user. The experimental results show that our system can identify 8 users with 92% accuracy.

What if Adversarial Samples were Digital Images?

  • Benoît Bonnet
  • Teddy Furon
  • Patrick Bas

Although adversarial sampling is a trendy topic in computer vision, very few works consider the integral constraint: the result of the attack is a digital image whose pixel values are integers. This is not an issue at first sight, since applying a rounding after forging an adversarial sample trivially does the job. Yet, this paper shows theoretically and experimentally that this operation has a big impact. The adversarial perturbations are fragile signals whose quantization destroys their ability to delude an image classifier.

This paper presents a new quantization mechanism which preserves the adversariality of the perturbation. Its application leads to a new look at the lessons learnt in adversarial sampling.
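The fragility under rounding can be seen in a toy setting. The sketch below uses a hypothetical four-pixel linear classifier (the weights, bias, and pixel values are invented for this example) and a small real-valued perturbation that flips its decision; rounding the result back to integer pixel values erases the perturbation entirely. It illustrates the problem only and is not the quantization mechanism proposed in the paper.

```python
def score(weights, bias, pixels):
    """Linear classifier score; the predicted class is the sign of the score."""
    return sum(w * p for w, p in zip(weights, pixels)) + bias


def quantize(pixels):
    """Round to the nearest integer and clip to the 8-bit pixel range."""
    return [min(255, max(0, round(p))) for p in pixels]


# Hypothetical classifier and image, chosen so the original scores about +0.1.
weights = [0.5, -0.25, 0.75, -0.5]
original = [120, 64, 200, 33]
bias = 0.1 - score(weights, 0.0, original)

# Real-valued attack: move every pixel by 0.3 against the sign of its weight,
# which lowers the score by 0.3 * sum(|w|) = 0.6 and flips the decision.
step = 0.3
adversarial = [p - step * (1 if w > 0 else -1) for w, p in zip(weights, original)]

rounded = quantize(adversarial)

score_original = score(weights, bias, original)        # positive
score_adversarial = score(weights, bias, adversarial)  # negative: attack works
score_rounded = score(weights, bias, rounded)          # positive again
```

Because every component of the perturbation is smaller than half a quantization step, plain rounding maps the adversarial sample back onto the original image, which is why a quantization that accounts for the classifier is needed.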

LiveDI: An Anti-theft Model Based on Driving Behavior

  • Hashim Abu-gellban
  • Long Nguyen
  • Mahdi Moghadasi
  • Zhenhe Pan
  • Fang Jin

The anti-theft problem has been challenging since it mainly depends on the existence of external devices to defend against theft. Recently, driver behavior analysis using supervised learning has been investigated with the goal of detecting theft by identifying drivers. In this paper, we propose a data-driven technique, LiveDI, which identifies drivers from their driving behavior without relying on external devices. The built model utilizes a Gated Recurrent Unit (GRU) and Fully Convolutional Networks (FCN) to learn long- and short-term patterns of the drivers' behavior. Additionally, we improve the training time by utilizing the Segmented Feature Generation (SFG) algorithm to reduce the state space, where the driving behaviors are split with a time window for analysis. Extensive experiments show the impact of parameters on our technique and verify that our proposed approach outperforms the state-of-the-art baseline methods.

On the Difficulty of Hiding Keys in Neural Networks

  • Tobias Kupek
  • Cecilia Pasquini
  • Rainer Böhme

In order to defend neural networks against malicious attacks, recent approaches propose the use of secret keys in the training or inference pipelines of learning systems. While this concept is innovative and the results are promising in terms of attack mitigation and classification accuracy, the effectiveness relies on the secrecy of the key. However, this aspect is often not discussed. In this short paper, we explore this issue for the case of a recently proposed key-based deep neural network. White-box experiments on multiple models and datasets, using the original key-based method and our own extensions, show that it is currently possible to extract secret key bits with relatively limited effort.

Nested Tailbiting Convolutional Codes for Secrecy, Privacy, and Storage

  • Thomas Jerkovits
  • Onur Günlü
  • Vladimir Sidorenko
  • Gerhard Kramer

The key agreement problem with biometric or physical identifiers and two terminals for key enrollment and reconstruction is considered. A nested convolutional code construction that performs lossy compression with side information is proposed. Nested convolutional codes are an alternative to nested polar codes and nested random linear codes that achieve all points of the key-leakage-storage regions of the generated-secret and chosen-secret models for long block lengths. Our design uses a convolutional code for vector quantization during enrollment and a subcode of it for error correction during reconstruction. Physical identifiers with small bit error probability are considered to illustrate the gains of the proposed construction. One variant of nested convolutional codes improves on all previous constructions in terms of the key vs. storage rate ratio but it has high complexity. Another variant of nested convolutional codes with lower complexity performs similarly to previously designed nested polar codes. The results suggest that the choice of convolutional or polar codes for key agreement with identifiers depends on the complexity constraints.

SESSION: Session 3: Image and Video Forensics

Simulation of Border Control in an Ongoing Web-based Experiment for Estimating Morphing Detection Performance of Humans

  • Andrey Makrushin
  • Dennis Siegel
  • Jana Dittmann

A morphed face image injected into an identity document destroys the unique link between a person and a document, meaning that such a multi-identity document may be successfully used by several persons for face-recognition-based identity verification. A morphed face in an electronic machine readable travel document may allow a wanted criminal to illicitly cross a border. This paper describes an improvement of our ongoing web-based experiment for a border control simulation in which human examiners should first detect high-resolution morphed face images and second match potentially morphed document images against "live" faces of travelers. The error rates of humans in both parts of the experiment are compared with those of automated morphing detectors and face recognition systems. This experiment improves the understanding of the capabilities and limits of humans in withstanding the face morphing attack as well as the factors influencing their performance.

Exploiting Prediction Error Inconsistencies through LSTM-based Classifiers to Detect Deepfake Videos

  • Irene Amerini
  • Roberto Caldelli

The ability of artificial intelligence techniques to build brand new synthesized videos or to alter the facial expression of already existing ones has been efficiently demonstrated in the literature. The identification of such a new threat, generally known as Deepfake but consisting of different techniques, is fundamental in multimedia forensics. In fact, this kind of manipulated information could undermine and easily distort public opinion on a certain person or about a specific event. Thus, in this paper, a new technique able to distinguish synthetically generated portrait videos from natural ones is introduced by exploiting inconsistencies due to the prediction error in the re-encoding phase. In particular, features based on inter-frame prediction error have been investigated jointly with a Long Short-Term Memory (LSTM) network able to learn the temporal correlation among consecutive frames. Preliminary results have demonstrated that such a sequence-based approach, used to distinguish between original and manipulated videos, shows promising performance.

Photo Forensics From Rounding Artifacts

  • Shruti Agarwal
  • Hany Farid

Many aspects of JPEG compression have been successfully used in the domain of photo forensics. Adding to this literature, we describe a JPEG artifact that can arise depending upon seemingly innocuous implementation details in a JPEG encoder. We describe the nature of these artifacts and show how a generic JPEG encoder can be configured to explain a wide range of these artifacts found in real-world cameras. We also describe an algorithm to simultaneously estimate the nature of these artifacts and localize inconsistencies that can arise from a wide range of image manipulations.

SESSION: Session 4: Steganography

Information Hiding in Industrial Control Systems: An OPC UA based Supply Chain Attack and its Detection

  • Mario Hildebrandt
  • Kevin Lamshöft
  • Jana Dittmann
  • Tom Neubert
  • Claus Vielhauer

Industrial Control Systems (ICS) help to automate various cyber-physical systems in our world. The controlled processes range from rather simple traffic lights and elevators to complex networks of ICS in car manufacturing or controlling nuclear power plants. With the advent of industrial Ethernet, ICS are increasingly connected to networks of Information Technology (IT). Thus, novel attack vectors on ICS are possible. In IT networks, information hiding and steganography are increasingly used in advanced persistent threats to conceal the infection of the systems, allowing the attacker to retain control over the compromised networks. In parallel, ICS are more and more a target for attacks as well. Here, simple automated attacks as well as targeted attacks by nation state actors with the intention of damaging components or infrastructures as a part of cyber crime have already been observed. Information hiding could bring such attacks to a new level by integrating backdoors and hidden/covert communication channels that allow for attacking specific processes whenever it is deemed necessary. This paper sheds light on potential attack vectors on Programmable Logic Controllers (PLCs) using OPC Unified Architecture (OPC UA) network protocol based communication. We implement an exemplary supply chain attack consisting of an OPC UA server (Bob, B) and a Siemens S7-1500 PLC as OPC UA client (Alice, A). The hidden storage channel uses source timestamps to embed encrypted control sequences, allowing for setting digital outputs to arbitrary values. The attack relies solely on the programming of the PLC and does not require firmware level access. Due to the potential harm to life caused by attacks on cyber-physical systems, any presentation of novel attack vectors needs to present suitable mitigation strategies.

Thus, we investigate potential approaches for the detection of the hidden storage channel for a warden W as well as potential countermeasures in order to increase the warden-compliance. Our machine learning based detection approach using a One-Class-Classifier yields a detection performance of 89.5% with zero false positives within an experiment with 46,159 OPC UA read responses without a steganographic message and 7,588 OPC UA read responses with an embedded steganographic message.
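The general idea of a timestamp-based storage channel can be sketched in a few lines. The example assumes the timestamp is an integer count of 100-nanosecond ticks (the representation of the OPC UA DateTime type) and hides one byte in its eight low-order bits. The function names and the choice of eight bits are hypothetical; the abstract does not state how the paper encodes its encrypted control sequences, so this is not the authors' scheme.

```python
LOW_BITS = 8                      # assumed number of timestamp bits carrying data
LOW_MASK = (1 << LOW_BITS) - 1


def embed_byte(ticks, payload):
    """Replace the low-order bits of a tick count with one payload byte."""
    if not 0 <= payload <= LOW_MASK:
        raise ValueError("payload does not fit into the low-order bits")
    return (ticks & ~LOW_MASK) | payload


def extract_byte(ticks):
    """Recover the payload byte from a tick count."""
    return ticks & LOW_MASK


# Hypothetical source timestamp in 100 ns ticks.
genuine_ticks = 132_000_000_123_456_789
covert_ticks = embed_byte(genuine_ticks, 0xA5)
shift_in_ticks = abs(covert_ticks - genuine_ticks)
```

With eight bits the timestamp moves by less than 256 ticks (25.6 microseconds), which is why such a channel is hard to notice by inspection and why the detection side has to work on the statistics of the low-order timestamp digits.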

Simulating Suboptimal Steganographic Embedding

  • Christy Kin-Cleaves
  • Andrew D. Ker

Researchers who wish to benchmark the detectability of steganographic distortion functions typically simulate stego objects. However, the difference (coding loss) between simulated stego objects and real stego objects is significant, and dependent on multiple factors. In this paper, we first identify some factors affecting the coding loss, then propose a method to estimate and correct for coding loss by sampling a few covers and messages. This allows us to simulate suboptimally-coded stego objects which are more accurate representations of real stego objects. We test our results against real embeddings and naive PLS simulation, showing that our simulated stego objects are closer to real embeddings in terms of both distortion and detectability. This is the case even when only a single image and message is used to estimate the loss.
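For readers unfamiliar with the baseline, the sketch below shows a common form of payload-limited sender (PLS) simulation for binary embedding: each element is changed with probability 1 / (1 + exp(lambda * cost)), and lambda is found by bisection so that the total entropy of the changes equals the payload. This is the idealized, loss-free simulation; the paper's estimation and correction of coding loss is not reproduced here. The function names, the cost values, and the binary (rather than ternary) setting are assumptions for this example.

```python
import math
import random


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) variable."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def change_probabilities(costs, lam):
    """Optimal change rates for additive costs (exponent clamped for safety)."""
    return [1.0 / (1.0 + math.exp(min(lam * c, 700.0))) for c in costs]


def total_entropy(costs, lam):
    return sum(binary_entropy(p) for p in change_probabilities(costs, lam))


def solve_lambda(costs, payload_bits, iterations=60):
    """Bisection for the lambda whose change rates carry `payload_bits`."""
    if any(c <= 0 for c in costs) or not 0 < payload_bits < len(costs):
        raise ValueError("need positive costs and 0 < payload < number of elements")
    lo, hi = 0.0, 1.0
    while total_entropy(costs, hi) > payload_bits:  # entropy falls as lambda grows
        hi *= 2.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if total_entropy(costs, mid) > payload_bits:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def simulate_changes(costs, payload_bits, seed=0):
    """Draw a change map as an optimal coder would, without any coding loss."""
    lam = solve_lambda(costs, payload_bits)
    rng = random.Random(seed)
    return [rng.random() < p for p in change_probabilities(costs, lam)]


costs = [0.5 + (i % 10) for i in range(1000)]  # hypothetical per-element costs
lam = solve_lambda(costs, payload_bits=400)
changes = simulate_changes(costs, payload_bits=400)
```

A real coder needs more changes than this simulation to carry the same payload, and that gap is the coding loss discussed in the abstract.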

A Robust Video Steganographic Method against Social Networking Transcoding Based on Steganographic Side Channel

  • Pingan Fan
  • Hong Zhang
  • Yifan Cai
  • Pei Xie
  • Xianfeng Zhao

Social networks transcode uploaded videos in a lossy way, which makes most video steganographic methods unusable. In this paper, a robust video steganographic method is proposed to resist video transcoding on social networking sites. The luminance component of the raw video is selected as the cover, and a Quantization Index Modulation (QIM) algorithm based on block statistical features is applied to embed secret messages. To make a good tradeoff between robustness and visual quality, an iteration in the local transcoder is designed to determine the minimum quantization step for each video. Then, a strategy of selecting robust video frames is proposed to further improve the robustness and security. To avoid sharing information beforehand between the sender and the receiver, a steganographic side channel is built for correct message extraction. Experimental results show that our proposed method provides strong robustness against social network transcoding, with an average bit error rate of less than 1%. Meanwhile, our proposed method achieves a satisfactory level of security performance. It is a robust and secure method for covert communication on social networking sites such as YouTube and Vimeo.
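Quantization Index Modulation itself is simple to state. The sketch below implements scalar QIM on a single real-valued feature: bit 0 is encoded by moving the feature to the nearest even multiple of the step, bit 1 by moving it to the nearest odd multiple. Treating the feature as, say, the mean luminance of a block is an assumption for illustration; the abstract does not specify which block statistics the paper uses or how the step is chosen beyond the iterative search it mentions.

```python
def embed_qim(value, bit, step):
    """Move `value` to the nearest multiple of `step` whose parity equals `bit`."""
    return 2.0 * step * round((value - bit * step) / (2.0 * step)) + bit * step


def extract_qim(value, step):
    """Decode a bit as the parity of the nearest multiple of `step`."""
    return round(value / step) % 2


STEP = 4.0
feature = 37.3  # hypothetical block feature, e.g. a mean luminance

marked_one = embed_qim(feature, 1, STEP)   # 36.0, an odd multiple of 4
marked_zero = embed_qim(feature, 0, STEP)  # 40.0, an even multiple of 4

# Distortion from transcoding is modeled here as a small additive change.
decoded_one = extract_qim(marked_one + 1.5, STEP)
decoded_zero = extract_qim(marked_zero - 1.9, STEP)
```

Decoding stays correct as long as the feature moves by less than half a step, so a larger step buys robustness at the price of visual quality, which is the tradeoff the abstract's search for a minimum quantization step addresses.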

JPEG Steganography and Synchronization of DCT Coefficients for a Given Development Pipeline

  • Théo Taburet
  • Patrick Bas
  • Wadih Sawaya
  • Rémi Cogranne

This paper proposes to use the statistical analysis of the correlation between DCT coefficients to design a new synchronization strategy that can be used for cost-based steganographic schemes in the JPEG domain. First, an analysis is performed on the covariance matrix of DCT coefficients of neighboring blocks after a development pipeline similar to the one used to generate BossBase, applied to photonic noise. This analysis exhibits (i) a decomposition into 8 disjoint sets of uncorrelated coefficients (4 sets per block used by 2 disjoint lattices) and (ii) the fact that each DCT coefficient is correlated with 38 other coefficients belonging either to the same block or to connected blocks. Using the uncorrelated groups, an embedding scheme can be designed using only 8 disjoint lattices. The proposed embedding scheme relies on two ingredients. First, we convert the empirical cost associated with each coefficient into a Gaussian distribution whose variance is directly computed from the embedding costs. Second, we derive conditional Gaussian distributions from a multivariate distribution considering only the correlated coefficients which have already been modified by the embedding scheme. This covariance matrix takes into account both the correlations exhibited by the analysis of the covariance matrix and the variance derived from the costs. This synchronization scheme yields a gain in $P_E$ of at least 7% at QF 95 for an embedding rate close to 0.3 bits per non-zero AC coefficient, using DCTR feature sets for both UERD and J-UNIWARD.
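The conditional Gaussian step rests on a standard identity for jointly Gaussian vectors, restated here for reference. The notation is ours, not the paper's: $x_1$ stands for the coefficient about to be modified and $x_2$ for the correlated coefficients that have already been modified, which is an assumed reading of the abstract.

```latex
\[
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
\sim \mathcal{N}\!\left(
\begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix},
\begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}
\right)
\quad\Longrightarrow\quad
x_1 \mid x_2 \sim \mathcal{N}\!\left(\mu_{1|2},\, \Sigma_{1|2}\right),
\]
\[
\mu_{1|2} = \mu_1 + \Sigma_{12}\,\Sigma_{22}^{-1}\,(x_2 - \mu_2),
\qquad
\Sigma_{1|2} = \Sigma_{11} - \Sigma_{12}\,\Sigma_{22}^{-1}\,\Sigma_{21}.
\]
```

Conditioning shifts the mean toward the changes already made on correlated neighbors and shrinks the variance, which is how a synchronization scheme can favor embedding changes that are consistent across neighboring coefficients.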

Turning Cost-Based Steganography into Model-Based

  • Jan Butora
  • Yassine Yousfi
  • Jessica Fridrich

Most modern steganographic schemes embed secrets by minimizing the total expected cost of modifications. However, costs are usually computed using heuristics and cannot be directly linked to statistical detectability. Moreover, as previously shown by Ker et al., cost-based schemes fundamentally minimize the wrong quantity, which makes them more vulnerable to a knowledgeable adversary aware of the embedding change rates. In this paper, we research the possibility of converting cost-based schemes to model-based ones by postulating that there exists a payload size for which the change rates derived from costs coincide with change rates derived from some (not necessarily known) model. This allows us to find the steganographic Fisher information for each pixel (DCT coefficient), and embed other payload sizes by minimizing deflection. This rather simple measure sometimes brings quite significant improvements in security, especially with respect to steganalysis aware of the selection channel. Steganographic algorithms in both spatial and JPEG domains are studied with feature-based classifiers as well as CNNs.

Steganography by Minimizing Statistical Detectability: The cases of JPEG and Color Images

  • Rémi Cogranne
  • Quentin Giboulot
  • Patrick Bas

This short paper presents a novel method for steganography in JPEG-compressed images, extending the so-called MiPOD scheme based on minimizing the detection accuracy of the most-powerful test using a Gaussian model of independent DCT coefficients. This method is also applied to address the problem of embedding into color JPEG images. The main issue in such a case is that color channels are not processed in the same way and, hence, a statistically based approach is expected to bring significant improvements when one needs to consider heterogeneous channels together.

The results presented show that, on the one hand, the extension of MiPOD to the JPEG domain, referred to as J-MiPOD, is very competitive compared to current state-of-the-art embedding schemes. On the other hand, we also show that addressing the problem of embedding in JPEG color images is far from straightforward and that future work is required to better understand how to deal with color channels in JPEG images.