Isabel F. Cruz
Wendy T. Lucas
Department of Computer Science
Database Visualization Research Group
Worcester Polytechnic Institute
Tufts University
ifc@cs.wpi.edu
wlucas@cs.tufts.edu
http://www.cs.wpi.edu/People/faculty/ifc.html
http://www.cs.tufts.edu/~wlucas
Multimedia data has become readily available from a variety of resources, such as the Web, to users (ranging from naive to sophisticated) who need to select and to present the data in a way that is meaningful to their particular applications. DelaunayMM is our framework for querying and presenting multimedia data stored in distributed data repositories, including the Web. It is unique in combining user-defined layouts with ad hoc querying capabilities, thereby enabling users to tailor, in a simple way, the layout of virtual documents composed of retrieved multimedia objects. In this paper, we focus on the object-oriented data models, on the declarative query languages, and on how the results of the queries to disparate resources are integrated to form coherent user-defined documents.
To address these new requirements, concepts and tools are needed that enable users, ranging from naive to sophisticated, not only to select the information they need but also to present it in a way that is meaningful to their particular application [17,21]. Our framework for querying and presenting multimedia data stored in distributed data repositories, including the Web, is called DelaunayMM. It is unique in combining user-defined virtual document layouts with the ability to define document content through ad hoc queries to multiple repositories.
DelaunayMM is a multimedia extension to the Delaunay Database Visualization System [9], an interactive, constraint-based system for visualizing object-oriented databases. Delaunay users pictorially specify, in an intuitive yet formal way, the visualization of database objects. By arranging graphical geometric objects and graphical constraints, users form a "picture" that specifies how to visualize data objects belonging to a class. Following a similar approach, users of DelaunayMM visually represent the spatial layout of the data to be retrieved from distributed multimedia repositories.
The Delaunay document layout model defines a virtual document as being a set of user-specified style sheets. Therefore, the layout of a document is based on one or more style sheets (e.g., for the layout of the title page or of the chapter pages). Within the document, a set of pages is associated with each style sheet, which serves as a template for the layout of these pages. The user associates queries with the style templates, thus combining data selection with presentation.
Graphical icons, including a scrollable box for text, a re-sizable window for images, and a control box for audio, are assigned to each query and given presentation attributes. The icons are then arranged into a style sheet by either snapping to a grid or by explicitly specifying spatial constraints [8].
The Delaunay query language interface supports standard SQL clauses including select, from, and where. It is flexible enough to address queries to distributed relational and object-oriented databases as well as to the Web. In the latter case, an object-oriented model of multimedia documents and elements provides the attributes on which to query. This model extends the HTML 3.2 DTD [22] by incorporating additional metadata attributes, including some from an emerging standard called the STARTS protocol for Internet retrieval [15]. Queries to the Web are complicated by its cyclic structure and the fact that the destination for a query is often not known ahead of time (for example, in relational queries the names of the tables storing the sought-after data are supplied by the person forming the query; on the other hand, a query to the Web may or may not include a URL from which to extract the data). Navigational queries, which enable the browsing of document links during query processing, and keyword searches by multiple search engines are therefore supported.
By combining the user-defined style templates with the answers to the queries, a virtual document with pages populated with the retrieved multimedia objects is automatically generated. Each page is associated with one style sheet that determines the layout of the page's elements. Pages are linked together and can be traversed via a previous/next mechanism.
The content of each page is based on the answers of the queries associated with it. In some cases, more than one set of page elements (i.e., multimedia objects) may be retrieved in response to a query. The default display specification is to show sets of objects in order of retrieval, with additional sets connected by links that are traversed in a similar manner to page links.
Thumbnail views of each page provide an overview of the entire document, as shown in Figure 1. The style sheet for this document contains a text icon, an image icon, and an audio icon. Queries to populate the icons on each page are first translated into the syntax of the repository to be queried. For example, queries sent to the Web are translated into WebSQL queries [20]. After invoking the queries, further processing of the retrieved objects is performed in order to create the user-specified presentation.
Figure 1: Thumbnail overview of a document.
Within a thumbnail view, pages are arranged in accordance with their position within the generated document, and can be reordered via a drag and drop operation. Selecting a particular thumbnail enlarges that page and makes it the active view.
In addition to the Web, an example of an application domain for which Delaunay is currently being implemented is the Perseus Project, a digital library on ancient Greek culture [4]. Knowledge of the data schema, as captured by the data wrapper, provides the attributes on which to query. By providing an integrated query/presentation interface, visitors to the Perseus site [5] will be able to examine the many vases, coins, texts, and other works in ways that are currently not possible. For example, one could display multiple views of one piece of sculpture, compare the same view of many different vases, or arrange a virtual document in which each page represents the artwork of a different artist. In this last case, users could click on one image of a work by a particular artist to view the next work in the set, or could view the works of another artist by clicking on the link to the next page.
Following [9], two types of save operations are required to take full advantage of the capabilities inherent in the framework presented here. One saves the actual virtual document for future viewing. The other saves only the query and layout specifications, so that new virtual documents based on previous specifications can be generated, either by editing the specifications or by using more sophisticated mechanisms, such as inheritance and deductive rules (see [6,7]).
The remainder of this paper is organized as follows. Section 2 describes the Layout and Query Framework, including the Delaunay layout and data models. Section 3 contains descriptions of Query Processing and Virtual Document Generation. Our current implementation is described in Section 4, while Section 5 contains a comparison with related work. Our paper concludes with the discussion of future work in Section 6.
The simplest way to specify the layout is by snapping to a grid and adjusting the icons to fill the desired space. Each icon will ultimately be replaced by a set of objects that fit the query criteria associated with it. Rather than snapping to a grid, the user can enter numerical values for the dimensional attributes of an icon, such as length and width for a text box, or can place visually specified constraints on the values of those attributes [8]. These constraints are (1) length constraints or (2) overlap constraints (if an object is to be placed on top of another). Length constraints are linear (unary, binary, or ternary), maximum or minimum constraints.
Since more than one object within a class may satisfy the query, one can specify how many instances of each class to view at a time by selecting a predefined presentation view. Alternatively, links inherent to a chosen presentation (e.g., stack of cards) can provide the navigational path from one element of the set to the next.
Also assigned to each icon are presentation attributes, such as font for text and color composition for images, whose values are specified by the user. All instances of the database class that fit the query criteria for an icon are presented throughout the document in accordance with these attributes.
The layout shown in Figure 2 is an example of a style sheet containing an image icon, a scrollable text box, and a non-scrollable text box. Figure 3 shows the query tree associated with that style sheet, which will be described in Section 2.2.
Figure 2: Style sheet.
Figure 3: Query tree.
In this example, the length and width of the image icon are proportional to those of the largest image that will be contained in that space. These constraints are specified via dialog box options for the length and width attributes. For the non-scrollable text box, the "fill area" attribute has been selected, so that the font and letter size of the text to appear there will be automatically chosen to fill the specified area. A maximum constraint on the height of the page is set to be either (1) the sum of the heights of the image object, the non-scrollable text box, and the space between them, or (2) the height of the scrollable text box, whichever is greater.
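The page-height rule in this example reduces to taking the larger of two candidate heights. The following is only a minimal sketch of that arithmetic; the function name, parameter names, and the pixel values are assumptions made for illustration and are not part of the system described here.

```python
def page_height(image_h, label_h, gap, scroll_h):
    """Return the page height implied by the two alternatives in the text."""
    # Alternative (1): image, non-scrollable text box, and the space between them.
    stacked = image_h + gap + label_h
    # Alternative (2): the scrollable text box on its own; the greater value wins.
    return max(stacked, scroll_h)


# Hypothetical dimensions, in pixels.
print(page_height(image_h=300, label_h=60, gap=20, scroll_h=350))  # 380
print(page_height(image_h=200, label_h=40, gap=10, scroll_h=400))  # 400
```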
In addition to the layout of each page associated with a particular style sheet, the user can organize the overall layout of a virtual document by specifying the relationships between the sets of pages belonging to the different style sheets. Figure 4 shows a layout for a virtual document, or "book", on "Greek Vases" composed of objects resulting from queries to the Perseus database. It depicts a hierarchical organization, with the page associated with the "book cover" style sheet at the top. The level below contains the cover pages for the "chapters" of the book. The pages contained in the chapter on vases found at Harvard University are drawn as children of the Harvard node of the hierarchy. The specification of the layout of virtual documents is achieved using visual rules [11].
Figure 4: Tree layout of a virtual document.
The object-oriented model chosen for representing the document layout and the retrieved data is based on the O2 [13] and F-logic [18] data models. Figure 5 shows the structure of this data model for a virtual document. It has the two primitive type constructors: tuple and set. Syntactically in our representation, tuples are included between square brackets and sets are included within braces.
Figure 5: Document layout model.
The Document class is defined as a tuple containing name and styles attributes. The latter is a set valued attribute, since its value is a set of objects of class Style. The different objects of class Style allow the user to model the different kinds of pages found in virtual documents, as previously described.
As an example, there may be one page with a "Table of Contents" style, and many pages with a "Body" style. Attributes of the Style class are (1) description, which contains a name (of class string given by the user) and (2) pages, which contains the set of page objects inheriting the layout defined for a particular style.
The Page class has the attribute elements, which is set-valued. Each element of the set is a tuple with two attributes: icon_id and location. The value of the latter is a set of coordinates that define the position of the icon within a page. Other attributes of the Page class are a reference to the next page, and one to the previous page.
An object of class Icon is a tuple made up of a data attribute and a query attribute. The data attribute is associated with a data set (e.g., the set of all images of Greek vases). Each data element in the set has a physical identifier (pid) to denote the data repository in which it resides and a value to identify it within that repository. For Web-based data, that value is its URL. A set of data points representing a physical location within an icon is also associated with each data element (e.g., the coordinates of the lower left corner and of the upper right corner of a rectangular region). The query attribute stores the query used to populate the icon, as described in the next section.
The multimedia classes of Text, Image, Audio, and Video are all subclasses of the Icon class. Each inherits the data and query attributes, and in addition has its own type-specific ones. For example, attributes of class Text include font and size, while attributes of class Image include color content and resolution.
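The classes described above can be sketched with Python dataclasses. This is an illustrative approximation rather than the system's actual schema: the attribute names follow the text, but the class names DataElement and PageElement, the default values, and the use of ordered lists in place of sets are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class DataElement:
    pid: str             # physical identifier of the repository holding the data
    value: str           # identifier within that repository (a URL for Web data)
    region: List[Point]  # e.g. lower-left and upper-right corners within the icon


@dataclass
class Icon:
    data: List[DataElement] = field(default_factory=list)
    query: str = ""      # the query used to populate the icon


@dataclass
class Text(Icon):
    font: str = "Times"  # default values are placeholders
    size: int = 12


@dataclass
class Image(Icon):
    color_content: str = ""
    resolution: int = 72


@dataclass
class PageElement:
    icon_id: str
    location: List[Point]  # coordinates of the icon within the page


@dataclass(eq=False)       # identity comparison, since pages link to each other
class Page:
    elements: List[PageElement] = field(default_factory=list)
    next: Optional["Page"] = None
    previous: Optional["Page"] = None


@dataclass
class Style:
    description: str
    pages: List[Page] = field(default_factory=list)


@dataclass
class Document:
    name: str
    styles: List[Style] = field(default_factory=list)


# Two linked pages sharing a "Body" style in a "Greek Vases" document.
first, second = Page(), Page()
first.next, second.previous = second, first
doc = Document("Greek Vases", [Style("Body", [first, second])])
```

Lists are used here only so that page order is preserved; the model in the text defines these attributes as sets.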
Grouping icons together has both a presentation and a query significance. In terms of presentation, elements of sets associated with one icon are matched with elements of sets associated with the other icons in the group. When the user iterates on a group, the next object in all sets within the group is displayed.
Icons within a query group are the values for the select portion of the query. Iterating through an inner query group will change only the presentation associated with that particular grouping. Iterating through the outermost query group will change the presentation for the entire page (this is similar to nested loops in a programming language, where the inner loop changes "more quickly" than the outer loop).
An example illustrating the query formation process that uses the Perseus database is the creation of a book of vases from the Harvard Art Museums. The user first creates the style sheet of Figure 2, and places the image and text box (which contains a label associated with that particular view) within one query group, so that the two change together as she iterates through the many different views of each vase. The text area icon, however, is in its own grouping box, because the texts she will be retrieving relate to the vase as a whole. Iterating through these texts should therefore be independent from iterating through the different views of the vase. Finally, all three icons are placed within an outer query group so that she can link from one page to the next, with each page containing information on a different vase within the Harvard collection.
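The nested-group behavior in this example can be sketched as cursors over result lists. The classes, the wrap-around at the end of a list, and all of the data values below are hypothetical; they only illustrate how an inner group can advance without disturbing the rest of the page, while the outermost group replaces the page as a whole.

```python
class Group:
    """A query group: a cursor over the result tuples bound to its icons."""

    def __init__(self, items):
        self.items = list(items)
        self.pos = 0

    def current(self):
        return self.items[self.pos]

    def forward(self):
        # Wrapping around at the end is an assumption made for this sketch.
        self.pos = (self.pos + 1) % len(self.items)


class VasePage:
    """One page holding two inner groups that iterate independently."""

    def __init__(self, name, views, texts):
        self.name = name
        self.views = Group(views)  # image icon and its label change together
        self.texts = Group(texts)  # the text area has its own group

    def shown(self):
        image, label = self.views.current()
        return (self.name, image, label, self.texts.current())


# Hypothetical data standing in for the results of the Perseus queries.
book = Group([
    VasePage("vase-1", [("v1-a.gif", "side A"), ("v1-b.gif", "side B")],
             ["decoration text", "shape text"]),
    VasePage("vase-2", [("v2-a.gif", "side A")], ["decoration text"]),
])

page = book.current()
page.views.forward()            # inner group: only the image and label change
print(page.shown())             # ('vase-1', 'v1-b.gif', 'side B', 'decoration text')
book.forward()                  # outermost group: the whole page changes
print(book.current().shown())   # ('vase-2', 'v2-a.gif', 'side A', 'decoration text')
```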
Our data model includes some attributes that are not currently part of the DTD. Most of these have been put into a new MDATA class for metadata attributes. Included here are attributes defined by the STARTS protocol for Internet retrieval and search [15]. Namely, the SRange attribute relates to the ScoreRange field, and lists the minimum and maximum query scores a document can get within a search engine, while the AlgID attribute relates to the RankingAlgorithmID field and identifies the ranking algorithm used for computing scores in that search engine. Once available, both of these attributes could be used for more effective merging of files retrieved by multiple search engines. The links attribute, also included in the STARTS protocol, is used for storing all links contained in a file. At the present time, its values are also not provided by search engines.
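To illustrate why a score range would help when merging results, the sketch below rescales each engine's raw scores to the interval [0, 1] using its SRange before ranking. This min-max normalization is a common technique swapped in for illustration, not a method described in this paper; the function names, the engine names, the URLs, and the scores are all hypothetical, and the AlgID attribute is not used.

```python
def normalize(score, srange):
    """Rescale a raw score to [0, 1] given an engine's (min, max) score range."""
    lo, hi = srange
    if hi == lo:
        return 0.0
    return (score - lo) / (hi - lo)


def merge(results_by_engine):
    """Merge hits from several engines, keeping the best normalized score per URL."""
    best = {}
    for srange, hits in results_by_engine.values():
        for url, raw in hits:
            score = normalize(raw, srange)
            best[url] = max(score, best.get(url, 0.0))
    # Highest score first; ties are broken by URL for a stable order.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical engines with different score scales.
results = {
    "engine_a": ((0, 1000), [("http://a.example/1", 900), ("http://a.example/2", 400)]),
    "engine_b": ((0.0, 1.0), [("http://a.example/2", 0.95), ("http://b.example/3", 0.5)]),
}
ranked = merge(results)
print([url for url, _ in ranked])
```

Without the range, a raw score of 900 from one engine and 0.95 from another cannot be compared directly, which is the difficulty the SRange attribute is meant to address.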
Other attributes that we added to the MDATA class are currently provided by search engines. These include length, for the length of the file, and moddate, for the last date of file modification. Both of these attributes are also supported by WebSQL [20], the query language into which our Web-destined queries are translated, as explained in Section 2.2.2.
The WebSQL classification of links as interior (within the same page), local (within the same site), or global (outside the current site) has also been added to our model under the A class, which is used for describing anchors. The base attribute tells the URL of the document containing the link, and the href attribute tells the URL of the target of the link.
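The three link categories can be approximated from the base and href attributes alone. The following is a minimal sketch using Python's standard urllib.parse module; the function name is hypothetical, and treating "same site" as "same host name" is an assumption of this sketch.

```python
from urllib.parse import urldefrag, urljoin, urlsplit


def classify_link(base, href):
    """Classify a link as 'interior', 'local', or 'global'."""
    target = urljoin(base, href)          # resolve relative references
    base_doc, _ = urldefrag(base)         # drop any #fragment
    target_doc, _ = urldefrag(target)
    if target_doc == base_doc:
        return "interior"                 # points within the same page
    if urlsplit(target).netloc.lower() == urlsplit(base).netloc.lower():
        return "local"                    # same host, different page
    return "global"                       # a different host


base = "http://www.example.org/a/index.html"
print(classify_link(base, "#top"))                       # interior
print(classify_link(base, "../b.html"))                  # local
print(classify_link(base, "http://other.example.com/"))  # global
```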
Figure 6 shows a partial data schema representing the additions we have made to the existing DTD model. Sets of elements, such as the set of URLs represented as {URL}, indicate that zero or more such elements may be present. The symbol "|" is used to represent an OR condition.
Figure 6: Search engine classes.
The user first specifies the repositories to be queried, so that the query interface can display the attributes, in scrolling lists, for that repository. The values of the select clause are partially specified during the layout specification process: when a text icon is added to a style sheet, Delaunay automatically assigns an object identifier (oid) to it, such as "Text1". The user must then select the text attribute to retrieve, such as "title". In the case of an image or an audio recording, the file type to retrieve is specified, such as "gif" for an image, or "wav" for a recording.
To illustrate query formation and the grouping of queries, we will
continue with our Perseus example. The relational tables from
the Perseus database that are relevant to the queries that follow are
shown in Figure 7.
The image of a vase in the query is assigned the oid Image1
by the system, and the text box is assigned the oid Text1. Using
scrolling lists and dialog boxes, the user creates the query in
Figure 8.
If this were the only query defined for the page, then clicking on the query group's forward and backward links would result in the display of each vase in the Harvard collection along with its name.
By adding a separate query group containing a text box, the user is able to view all the descriptions for each vase. The query associated with this group is shown in Figure 9.
The last query group contains all of the icons defined for the page,
and encompasses the queries shown in the examples of
Figures 8 and 9. The user would like each page of the
document to contain information on one of the vases in the
Harvard collection. The query for the entire page is shown in
Figure 10. Each page of the document created from this
query refers to a different vase. Within any page, it is possible to
iterate through all of the different images of the vase and read the summary
information describing it.
In the next example, the user would like to view the two sides
(obverse and reverse) of each of the 523 Dewing coin images in the
Perseus database. The Images table has a Sequence attribute,
which is an ordered integer list of the different views stored for each
object. Two image icons are added to the style sheet
and placed within one query group. The query associated with
that group is shown in Figure 11. The result is a document
in which each page shows the two sides of every coin with images in
the database.
In posing queries to the Web, a particular URL can be specified from which to start the search. Attributes from the Delaunay Web file schema appear as selections within scrolling lists. Anchor attributes supporting the interior, local, and global categorizations found in [20] are also available for selection so that the types of links on which to navigate can be specified. Figure 12 shows a query that finds all images of George Washington connected by two or fewer local links to a particular URL, while Figure 13 shows that query as entered into the Delaunay query interface to WebSQL.
Figure 13: DelaunayMM query interface to WebSQL.
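The navigational part of such a query amounts to a bounded traversal of the link graph. The sketch below runs a breadth-first search over a small in-memory graph rather than the live Web; it is not how the WebSQL server is implemented, and the function name, the dictionary layout, and the URLs are hypothetical. The starting document itself (zero links away) is assumed to be included.

```python
from collections import deque


def images_within(start, pages, max_hops=2):
    """Collect images on pages reachable from start by at most max_hops local links."""
    seen = {start}                 # guards against cycles in the link structure
    queue = deque([(start, 0)])
    found = []
    while queue:
        url, hops = queue.popleft()
        page = pages.get(url)
        if page is None:           # a link to a page we know nothing about
            continue
        for image in page["images"]:
            if image not in found:
                found.append(image)
        if hops < max_hops:
            for nxt in page["local_links"]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, hops + 1))
    return found


# Hypothetical site: a -> b -> c -> d, with b linking back to a.
site = {
    "a": {"local_links": ["b"], "images": ["a.gif"]},
    "b": {"local_links": ["a", "c"], "images": []},
    "c": {"local_links": ["d"], "images": ["c.gif"]},
    "d": {"local_links": [], "images": ["d.gif"]},
}
print(images_within("a", site))  # ['a.gif', 'c.gif']
```

The visited set is what keeps the traversal finite in the presence of cycles such as the link from b back to a.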
If the user does not know the starting location for the above query, then a keyword search is needed. The query then specifies all the images connected by two or fewer local links to a Web document containing the keywords "George Washington". This query is shown in Figure 14. Note that the where clause further specifies that the keywords appear in the title (as opposed to, say, anywhere in the document).
Figure 15: Architecture of DelaunayMM.
The Query Processing component is responsible for (1) mapping the schemes of the underlying data repositories to an object-oriented representation for use by the Query Formation component, (2) formatting queries from the Query Formation component into the syntax recognized by their destinations and then executing them, (3) sorting and merging the results of queries, and (4) passing those results on to the Virtual Document Generation component. There, the user-specified layouts are combined with the processed data to form the completed document.
After the queries that define a document have been formed, they are sent to the Query Processing component for translation into a syntax recognized by the query destination. In the case of Perseus, that syntax is SQL. In the case of the Web, queries are translated into WebSQL and then executed by the WebSQL server. The files returned in this latter case go through an additional selection process in which attributes not defined for querying within WebSQL are evaluated. For example, a user might only want a document if a particular phrase appears in one of its headings (as in the query of Figure 14 where the document's title must contain "George Washington"), believing that phrase to be more strongly associated with that document than with one in which the phrase only appears in the document's body. This kind of selection is not performed by the WebSQL server. Therefore, we need to parse the HTML documents that have been returned by the WebSQL query and select only those where the phrase appears in the title.
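This post-selection step can be sketched with Python's standard html.parser module. The class and function names below are hypothetical, the documents are made-up examples, and case-insensitive matching is an assumption of this sketch rather than documented behavior of the system.

```python
from html.parser import HTMLParser


class TitleExtractor(HTMLParser):
    """Collect the text found inside the <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def title_contains(html, phrase):
    parser = TitleExtractor()
    parser.feed(html)
    return phrase.lower() in parser.title.lower()


def filter_by_title(documents, phrase):
    """Keep only the URLs of documents whose title contains the phrase."""
    return [url for url, html in documents if title_contains(html, phrase)]


# Hypothetical documents as they might be returned by a WebSQL query.
docs = [
    ("http://example.org/gw.html",
     "<html><head><title>George Washington</title></head><body>...</body></html>"),
    ("http://example.org/presidents.html",
     "<html><head><title>Presidents</title></head>"
     "<body>George Washington was the first.</body></html>"),
]
print(filter_by_title(docs, "George Washington"))  # ['http://example.org/gw.html']
```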
Next, the retrieved data must be merged on the basis of page content as defined by the queries associated with each page. For example, after executing the queries to form a book of vases from the Harvard collection described in Section 2.2.2, the images are matched up with the name of the vase and the text describing them. Figure 16 shows the content, by means of a structured map [12], of the page for the Harvard 1895.247 vase. There are three image-name pairs for this page (the name is required to be shown with each image by the Harvard Museum), and one Decoration_Description.
Figure 16: Structured map instance.
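The merging step can be pictured as grouping the rows returned by separate queries under the vase they describe. The sketch below is only an illustration under stated assumptions: the function name, the row layout, the image file names, and the description text are hypothetical, and the resulting dictionary is a simplification of a structured map, not its actual representation.

```python
from collections import defaultdict


def merge_by_vase(image_rows, description_rows):
    """Group image-name pairs and descriptions under the vase they belong to."""
    pages = defaultdict(lambda: {"views": [], "descriptions": []})
    for vase, image, name in image_rows:
        pages[vase]["views"].append((image, name))
    for vase, text in description_rows:
        pages[vase]["descriptions"].append(text)
    return dict(pages)


# Hypothetical rows for the vase discussed above: three views, one description.
image_rows = [
    ("Harvard 1895.247", "view1.gif", "Harvard 1895.247"),
    ("Harvard 1895.247", "view2.gif", "Harvard 1895.247"),
    ("Harvard 1895.247", "view3.gif", "Harvard 1895.247"),
]
description_rows = [("Harvard 1895.247", "placeholder decoration description")]

pages = merge_by_vase(image_rows, description_rows)
print(len(pages["Harvard 1895.247"]["views"]))         # 3
print(len(pages["Harvard 1895.247"]["descriptions"]))  # 1
```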
The Layout Specification component provides the front-end interface through which users define how to present the data to be retrieved. The tool box, as shown in Figure 17, provides icons for adding multimedia element representations to each style sheet.
Figure 17: Tool window from DelaunayMM.
The first five buttons in the top row are used for adding text, images, video, audio, and label elements, in that order, to a style sheet. Once an element has been added, double-clicking on its representation brings up its presentation attributes. The sixth button in that row is used for attaching queries to page elements.
In the second row, the first two buttons are for adding length and overlap constraints. Length constraints can be added between system-defined locations, called "landmarks", on the borders of elements. For example, a distance specification can be set between the center of an image element and the center of a text element by adding a length constraint between those two landmarks. Alternatively, the user can add user-defined location markers called "anchorpoints" to elements by clicking on the third button in this row. This makes it possible to specify length constraints between any two points on two elements, such as the upper left corner of one image and the lower right corner of another. The fourth button is used for viewing and organizing the overall layout of a virtual document, while the fifth button adds a page border that is used for defining page attributes and in setting constraints between the borders of a page and the elements that fall within it. Finally, the sixth button is for a snap-to-grid option. Standard editing functions (e.g., copy, move, delete, and select all) are available from the pull-down menu labeled "Edit".
Figure 18 shows the style sheets window, which contains two templates called "chapters" and "body" that have been created for the virtual document on vases. In the "body" style sheet, the blue line running vertically through the center is a length constraint used for defining the overall size of each page relative to its contents. After a virtual document has been generated, its pages are displayed in a thumbnail view similar in layout to the style sheets window. Clicking on a page makes it the active view.
Figure 18: Style sheet window from DelaunayMM.
Two parallel efforts are being pursued at this time in terms of interfacing to distributed data repositories. One of these corresponds to queries destined for the Web. Our query interface translates queries into WebSQL, checks for correctness, and sends them to the WebSQL server for processing. The other effort is focused on the Perseus Project. A data wrapper for the Perseus database is currently under development.
While in [2,28] documents are generated from a set of known objects, our approach is designed with external datasets, including the Web, in mind. The work by Weitzman and Wittenburg was an important source of inspiration for the current work. As for the expressiveness of the spatial layout, the work by Bertino et al. is quite similar to our former work [6,7], but differs from it in that it is based on the relational data model. However, they also consider temporal constraints, which we have not yet incorporated into Delaunay.
In addition, we offer a visual approach that spans from the laying out of
the content of individual viewable pages to the modification of
features and page orderings found in the completed virtual document.
Other related work includes Garlic [3], DISCO (Distributed Information Search COmponents) [27], and InfoHarness [25] for querying heterogeneous distributed databases. The first of these approaches differs from ours in that it queries one database at a time and does not try to integrate data obtained from a variety of sources. While the second approach does incorporate these features, it does not focus on multimedia data and leaves the presentation of retrieved data up to application programmers. The InfoHarness system uses metadata extraction methods to create information repositories that support run-time access to the original information. While our system incorporates retrieved multimedia objects into user-defined presentations, the information retrieved by InfoHarness is converted to a system-generated combination of HTML forms and hyperlinks, which are then viewed using the Mosaic browser.
The system developed at Xerox PARC [23] uses a variety of 3D displays and integrates an algorithm for the effective browsing of a large collection of documents. Two important differences are our emphasis on user-defined layouts and the availability of our interface over the Web. We have also elected to use 2D displays for faster prototyping and easier access over the Web.
The work by Hüser et al. [16] is directed to the generation of documents on the fly. Although this work is intended for the visualization of a single information repository, its presentation objectives are remarkably similar to ours. An interesting difference is that they do not assume pre-defined templates while we have done so, mainly with the objective of simplifying the user's interaction. Using Delaunay, the more sophisticated user can, however, achieve similar functionality by using visual rules to shape the layout of the virtual documents [11].
In the future, we will expand our access to data repositories other than the Web and Perseus. Examples of other data wrappers and repositories include Garlic [3], QBIC [14], and DISCO [27]. QBIC will allow us to test our ideas on querying images using attributes that are not of type string.
While we have an expressive framework for the specification of spatial layout [10], we have not yet addressed the temporal layout of multimedia components within user-specified virtual documents (see, for example, [19,29]).
We also plan on conducting usability studies, which are of particular importance to applications intended for a large variety of users. Although the user interface is based on the one in [9], it must support a host of new features related to multimedia data types and distributed data. Our first users are the members of the Perseus Project with whom we have been cooperating. While they fulfill the role of the users who are digital librarians, we would also like to have an experimental site available to the casual users of the Perseus site [5]. Given the popularity of this site, we believe that it would be an ideal testbed for our ideas.
The translation was initiated by Isabel Cruz on 8/14/1997