1 Editorial process

We solicited papers from a variety of research groups active in the area of interactive multimedia. After a rigorous reviewing process, in which each paper was reviewed multiple times by multiple reviewers, we selected seven for inclusion in this special issue.

2 Selected papers on interactive multimedia

The seven papers selected discuss issues and present contributions related to managing and manipulating digital photographs, adapting multimedia content to mobile devices, and supporting interactive television.

Ryu, Chung, and Cho’s paper, “A Hierarchical Photo Visualization System Emphasizing Temporal and Color-based Coherences”, describes a novel user interface for visualizing photo collections. Their system presents photos on a grid that is organized both by the time at which the photos were taken and by color metrics, rather than using the more obvious graph presentation [5]. Groups of similar photos are represented on the grid by a single representative example. The system was evaluated with both objective metrics and user testing.
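As a rough illustration of the idea of collapsing similar photos into one representative, the following minimal sketch groups photos by capture time and then picks the photo whose mean color is closest to the group average. It is not the authors' algorithm; the `Photo` type, the `max_gap` threshold, and the use of mean RGB as the color metric are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Photo:
    # Hypothetical record; the fields are assumptions for this sketch.
    name: str
    timestamp: float                      # capture time in seconds
    mean_rgb: Tuple[float, float, float]  # average color of the image


def group_by_time(photos: List[Photo], max_gap: float) -> List[List[Photo]]:
    """Start a new group whenever two consecutive photos are more than
    max_gap seconds apart (an assumed, simple temporal criterion)."""
    groups: List[List[Photo]] = []
    for photo in sorted(photos, key=lambda p: p.timestamp):
        if groups and photo.timestamp - groups[-1][-1].timestamp <= max_gap:
            groups[-1].append(photo)
        else:
            groups.append([photo])
    return groups


def representative(group: List[Photo]) -> Photo:
    """Return the photo whose mean color is nearest the group's average color."""
    n = len(group)
    centroid = tuple(sum(p.mean_rgb[i] for p in group) / n for i in range(3))

    def squared_distance(p: Photo) -> float:
        return sum((p.mean_rgb[i] - centroid[i]) ** 2 for i in range(3))

    return min(group, key=squared_distance)
```

Each group returned by `group_by_time` could then occupy one grid cell, displayed through the photo chosen by `representative`.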

A second paper addressing the digital photo domain is “Image Matting Through a Web Browser” by Lin, Wang, and Hsieh. Their NIM 2.0 system is used to specify the separation between foreground and background in a photo, usually for the purpose of placing the foreground material in some other context. NIM is the first browser-based matting application, and the authors show that it performs high-quality matting rapidly. This work differs from research on foreground/background separation [1] because image matting is designed to extract individual foreground elements.

Scalable video coding techniques work by reducing some aspect of video quality in order to reduce the bandwidth requirements of video streams. Daronco, Roesler, Valdeni de Lima, and Balbinot present the results of experimental user studies of scalable coding in “Quality Analysis of Scalable Video Coding on Unstable Transmissions”. They compared scaling based on spatial, temporal, and quality changes and also examined whether varying the level of scaling affected user assessments. Using widely accepted metrics [3], they show that temporal scaling is considered less satisfactory and that any instability in the level of scaling also reduces user assessments.

In an invited paper titled “Communicating and Migratable Interactive Multimedia Documents”, Concolato, Dufourd, Le Feuvre, Park, and Song examine how to make interactive multimedia applications suitable for dynamic migration between devices. This research is important because of the rapid adoption of sophisticated mobile devices with multimedia capabilities. As people change location, they may well change devices, too, and their applications and application state must follow them as they move. Other proposed solutions are based on widget-to-widget communication [6] or a shared or central JavaScript context [4], but this one enables cross-device exchanges more effectively.

In other research addressing mobile devices, Deigmoeller, Takebumi, Just, and Stoll consider “Contextual Cropping and Scaling of TV Productions”. They are interested in adapting video that was produced for large screens to the small-format screens found on mobile devices. They focus on sports productions, and their system tries to automatically scale or crop the video images in order to focus on the central action in the sports event. Their system makes better use of metadata than some previous systems [8] and is less content-specific than others [2].
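To make the cropping step concrete, the sketch below computes the largest crop window with a target aspect ratio, centered as closely as possible on a point of interest while staying inside the frame. It is only an illustration under stated assumptions: the function name `crop_window`, its parameters, and the idea that the central action is already known as a single pixel coordinate are hypothetical and are not taken from the paper.

```python
from typing import Tuple


def crop_window(frame_w: int, frame_h: int,
                target_w: int, target_h: int,
                cx: int, cy: int) -> Tuple[int, int, int, int]:
    """Return (left, top, width, height) of the largest window that has the
    target aspect ratio, is centered near (cx, cy), and fits in the frame."""
    # Compare aspect ratios by cross-multiplication to stay in integers.
    if frame_w * target_h > frame_h * target_w:
        # Frame is wider than the target: keep full height, trim the width.
        crop_h = frame_h
        crop_w = frame_h * target_w // target_h
    else:
        # Frame is narrower (or equal): keep full width, trim the height.
        crop_w = frame_w
        crop_h = frame_w * target_h // target_w

    # Center on the point of interest, then clamp to the frame boundaries.
    left = min(max(cx - crop_w // 2, 0), frame_w - crop_w)
    top = min(max(cy - crop_h // 2, 0), frame_h - crop_h)
    return left, top, crop_w, crop_h
```

For a 1920x1080 frame and a 320x240 target, the window is 1440x1080; the cropped region would then be scaled down to the target size for display on the mobile device.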

In the paper “XTemplate 3.0: Spatio-Temporal Semantics and Structure Reuse for Hypermedia Compositions”, Ferreira dos Santos and Muchaluat-Saade describe a template language for hypermedia documents. XTemplate is intended to ease the creation of complex interactive applications such as might be created for interactive television systems. XTemplate has been applied to the declarative Nested Context Language 3.0 [7] in order to provide stronger semantics and facilitate specification reuse.

In the paper “Discrimination of media moments and media intervals: sticker-based watch-and-comment annotation”, Texeira, Mello, Freitas, Santos, and Pimentel describe an approach for annotating interactive broadcast video with “media stickers”, which are predefined media elements. They define a set of operators that can be used to apply the stickers. Users apply stickers to identify moments of interest in video programs for later review by themselves or others. A small user test is presented as an initial evaluation. This is another example of work building on the Nested Context Language [7].

3 Final thoughts

Two points should be clear from the work presented in this special issue. First, even in a well-studied area like digital still images, there is considerable room for innovative research. Many natural editing operations are still not well supported, and image collections become harder to manage as they grow, which they do quickly now that taking photographs is so easy. Second, new devices and technologies always create opportunities for researchers. Mobile devices and upcoming interactive TV systems are the examples seen here.