Semantic Computing in Multimedia

By Simone Santini
Universidad Autonoma de Madrid, Spain
simone.santini@uam.es

Semantics, as a cognitive computing topic (as opposed, for example, to the formal semantics of programming languages), began within artificial intelligence and then, starting in the 1980s, faded from public attention following the general decline of interest in symbolic artificial intelligence at the time. Those years were the heyday of connectionism, and connectionist machines, ex hypothesi, do not model semantics explicitly. In the last ten years, however, there has been a noteworthy resurgence of technical discourse on semantics, based on the widespread opinion that the large amounts of data available today can be properly managed only through a qualitative leap in the processing capabilities of computing machines. A semantic leap, as it were.

The purpose of this tutorial is not only to give attendees information about standards and programming techniques, but also to give them a better view of the broader topics in which semantic computing is embedded. Learning techniques and standards is the easy part of the job; the difficult part, the one that needs the face-to-face interaction typical of a tutorial, is knowing what to do with these standards and techniques, that is, understanding the general theory of semantics and how the different techniques fit within it. Semantics is a complex issue, with a history spanning many centuries and a variety of points of view. In order to do serious research on semantics, the computing scientist must be aware of important and complex theoretical questions, and of the solutions and models that the different schools have proposed. This tutorial will try to provide such a background.

After a brief introductory section on the history of semantics and pictorial communication, the tutorial will be divided into two parts, corresponding to the two fundamental approaches, which I shall call the ontological and the hermeneutical.

The first part will deal with the approaches that try to formally encode the semantics of a document and attach it to the document itself. It will cover, roughly, a terrain that goes from Tarski to current ontologies, with some emphasis on model theory and some forays into partially uncharted waters, such as the use of fuzzy logic for multimedia modeling. This section, in turn, will be divided into two parts. The first, and longest, will be a technical discussion of the different concepts of semantics that have been used in logic, with special emphasis on aspects of formal semantics and on traditional knowledge representation (including ontologies, the semantic web, and their relation to multimedia). The second will be a brief excursus on the presuppositions that underlie this work. Traditional logic is a discipline of formal reasoning and never quite dealt with the content (viz. the semantics) of statements. From Aristotle, through the Scholastics' systematization of the syllogism, all the way to the axiomatic programme of Russell and analytic philosophy, logic has been a science of the forms of reasoning, without reference to the contents of the reasoning activity. We will look into this separation of form and substance, and analyze its plausibility and its consequences for multimedia semantics.

To say that the meaning of a document (multimedia or otherwise) can be characterized by a formal model attached to the document requires certain assumptions, chief among them the idea that a document has a content that is (more or less) independent of the linguistic means used to express it, and that exists (more or less) intact even if nobody is interpreting the document. We will critically analyze this view of meaning, its plausibility, and the limits of its validity.

The second part will tie in with the non-technical discussion at the end of the first. While analyzing the presuppositions of the logical approach to semantics, we will also begin to look at alternative views, with special emphasis on two areas: hermeneutics and structural semantics. We will work to understand the role of the reader in the creation of meaning, the role of the discursive practices of media creation, and that of the cultural conventions that drive the way in which media should be interpreted.

The study of signification carried out in this way will reveal several characteristics important for the design of semantic systems. Meaning is not an attribute of an image or a video, but something that arises when an artifact is used as part of an activity, and it only makes sense in the context of that activity. That is, meaning is created when an artifact is interpreted in a context, as part of an activity. From the point of view of the design of computing systems, this means that, rather than modeling the content of documents, we should model the activities that require access to the images and the contexts in which these activities take place.

With a bit of simplification, we can say that the ontological approaches take the database as their point of departure, attaching a formal model to semi-structured data, while the hermeneutic approaches take interactive systems as their base, and try to make meaning emerge through more sophisticated forms of interaction.

Orthogonal to this distinction is the one that separates models based on logic from those based on soft computing. The two distinctions (ontological vs. hermeneutical and logic vs. soft computing) are not completely independent: by and large, ontological models tend to use logical machinery (vide OWL and description logics), while hermeneutics and interaction tend to use soft computing (feature space geometry, latent semantics, self-organizing maps, ...). The reason for this is to be sought in the different characteristics of the two approaches: logical methods are very expressive but brittle, behave poorly in the presence of inconsistent data, and are hard to build by induction from the data; soft methods are not very expressive, but they can be built inductively and automatically on the basis of available data. We will study the characteristics of these two modes of representing semantics, with particular attention to the models that try to move beyond this dichotomy, such as fuzzy-logic ontological models that can be inferred from the data, and non-classical logic models, such as those based on semantic games, that can be usefully applied to interactive systems.
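As an illustration of what the fuzzy-logic middle ground looks like in practice, the following sketch contrasts a crisp (classical) predicate with a fuzzy membership function of the kind used in fuzzy approaches to multimedia modeling. The predicate names, thresholds, and the piecewise-linear membership function are hypothetical choices for illustration, not material from the tutorial itself.

```python
def crisp_bright(luminance):
    # Classical logic: an image region either is or is not "bright".
    return luminance >= 0.7

def fuzzy_bright(luminance):
    # Fuzzy logic: "bright" holds to a degree in [0, 1], here via a
    # simple piecewise-linear membership function (an arbitrary choice).
    if luminance <= 0.4:
        return 0.0
    if luminance >= 0.8:
        return 1.0
    return (luminance - 0.4) / 0.4

def fuzzy_and(a, b):
    # The min t-norm, a common fuzzy conjunction.
    return min(a, b)

def fuzzy_or(a, b):
    # The max t-conorm, a common fuzzy disjunction.
    return max(a, b)

# A fuzzy assertion such as "the region is bright AND warm" then
# yields a degree of truth rather than a Boolean:
degree = fuzzy_and(fuzzy_bright(0.6), 0.9)  # 0.5
```

Because the membership functions and the choice of t-norm are just parameters, representations of this kind can be fitted to data, which is precisely what makes them attractive for models that must be induced automatically rather than hand-axiomatized.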

Attendees will receive didactic material specifically designed for the tutorial. The material will not consist simply of a hard copy of the transparencies used, but will be a booklet constituting a complete reference for the topics covered in the tutorial.

ACM Multimedia 2011

Nov 28th - Dec 1st, 2011 Scottsdale, Arizona, USA
