The overarching goal of the pattern recognition community consists of presenting hypotheses to describe classes of objects using mathematical models, processing the information to eliminate noise, and selecting the model that best explains the given observations; nevertheless, the community has not prioritized memory and time complexity when matching models to observations. Given that we describe, explain and manipulate these objects through the perceptual system, there is an increasing need to favor those pattern recognition techniques that can explain, process and predict large volumes of visual data in real time. Such techniques cannot be developed “in vitro” due to the physical constraints of the complex environment and the context in which these techniques are used. Further, these new methods need to achieve high detection, classification and recognition accuracies in real time, even when accuracy and speed are conflicting objectives. To make pattern recognition techniques viable for practical applications (such as surveillance, robotics and medical applications), considerations such as computational complexity reduction, hardware implementation, software optimization, and strategies for parallelizing solutions must be taken into account.

This Special Issue of the Springer Journal of Real-Time Image Processing, entitled “Real-Time Image and Video Processing for Pattern Recognition Systems and Applications”, is dedicated to Methods and Tools, Architectures, Platforms and Technologies, User Centered Case-Studies and Applications, and Theoretical Foundations that facilitate real-time image processing aided by fundamental pattern recognition methods.

This Special Issue (SI) is oriented toward both theoretical and practical research and follows the main theme of the journal, which is real-time performance. Together with the contributions of the papers, trade-offs and future steps are discussed thoroughly in this SI.

The call for papers resulted in 21 submissions. At least two reviewers, one guest editor and one editor-in-chief assessed the quality of each paper, and those meeting the top standards were sent to a second round of reviews. Finally, 11 papers were selected for publication.

The papers discussed below can be divided broadly into four thrusts. The first concerns real-time tracking and motion estimation and includes five papers. The second thrust involves real-time robot navigation and contains two papers. The third thrust has two papers and is about real-time image and video processing. The last two papers belong to the fourth thrust, involving medical applications. The papers are summarized below:

In the paper by Q. Gu et al. entitled “High frame-rate tracking of multiple color-patterned objects”, the authors present a high frame-rate vision system capable of tracking multiple color-patterned objects using color histogram-based models. Tracking is achieved by implementing an expanded cell-based labeling algorithm in hardware logic. The hardware implementation of the expanded cell-based labeling algorithm consists of building hue color histograms of the objects of interest in an image, extracting statistical features (e.g., position, area, orientation) and using this information for tracking.
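As an illustration of the statistical features mentioned above, the following minimal sketch computes area, position (centroid) and orientation of a region from its image moments. It is not the authors' hardware algorithm: the hue-range threshold, the function names `hue_mask` and `region_features`, and the synthetic test image are all assumptions made for this example, and a single region stands in for the labeled cells.

```python
import numpy as np

def hue_mask(hue, lo, hi):
    """Select pixels whose hue lies in [lo, hi) (hypothetical color model)."""
    return (hue >= lo) & (hue < hi)

def region_features(mask):
    """Return (area, (cx, cy), theta) of a binary mask, or None if it is empty.

    theta is the orientation of the principal axis in radians, derived
    from the second-order central moments of the pixel coordinates.
    """
    ys, xs = np.nonzero(mask)
    area = xs.size
    if area == 0:
        return None
    cx, cy = xs.mean(), ys.mean()
    dx, dy = xs - cx, ys - cy
    mu20 = (dx * dx).mean()
    mu02 = (dy * dy).mean()
    mu11 = (dx * dy).mean()
    theta = 0.5 * np.arctan2(2.0 * mu11, mu20 - mu02)
    return area, (cx, cy), theta

# Synthetic hue image: a horizontal bar (4 rows x 16 columns) with hue 100.
hue = np.zeros((20, 20), dtype=np.uint8)
hue[8:12, 2:18] = 100

features = region_features(hue_mask(hue, 90, 110))
```

Because these features are simple sums over pixels, they can be accumulated in a single pass over the image, which is the kind of computation that lends itself to hardware logic.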

The paper “A computationally efficient tracker with direct appearance kinematic measure and adaptive Kalman filter” by R. Ben-Ari and O. Ben-Shahar presents real-time motion tracking using low computational resources. The paper proposes a method capable of tracking in color video with great robustness and speed. The authors’ main contribution is a novel similarity measure that explicitly combines appearance with object kinematics, together with a new adaptive Kalman filter.
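For readers unfamiliar with Kalman filtering in tracking, the sketch below shows a standard, non-adaptive constant-velocity Kalman filter on one coordinate. It is only a baseline illustration, not the adaptive filter or the similarity measure of the paper; the function name `kalman_track`, the noise parameters `q` and `r`, and the synthetic measurements are assumptions chosen for this example.

```python
import numpy as np

def kalman_track(measurements, dt=1.0, q=1e-3, r=1.0):
    """Filter 1-D position measurements with a constant-velocity model.

    Returns an array of [position, velocity] estimates, one per measurement.
    q and r are assumed process and measurement noise variances.
    """
    F = np.array([[1.0, dt], [0.0, 1.0]])  # state transition
    H = np.array([[1.0, 0.0]])             # only position is observed
    Q = q * np.eye(2)
    R = np.array([[r]])
    x = np.array([measurements[0], 0.0])   # initial state: zero velocity
    P = 10.0 * np.eye(2)                   # initial uncertainty
    estimates = []
    for z in measurements:
        # Prediction step
        x = F @ x
        P = F @ P @ F.T + Q
        # Update step
        innovation = z - H @ x
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ innovation
        P = (np.eye(2) - K @ H) @ P
        estimates.append(x.copy())
    return np.array(estimates)

# Noise-free target moving 2 units per frame for 50 frames.
positions = 2.0 * np.arange(50)
estimates = kalman_track(positions)
```

An adaptive variant would adjust quantities such as `Q` or `R` online; how that is done in the paper is not reproduced here.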

The paper by L. Alvarez et al., entitled “Real-Time Camera Motion Tracking in Planar View Scenarios”, deals with the challenging problem of real-time camera calibration in real planar scenarios. Such planar scenarios may include a soccer stadium or a tennis court. The application shown in this work is intended for inserting virtual content in HD (High Definition) videos of broadcast soccer matches.

The proposed method benefits from the geometric restrictions imposed by the tripod, which strongly simplify the camera motion, and uses the CART (Classification and Regression Tree) method to extract the image primitives in real time.

In the paper “An optimized real-time hands gesture recognition based interface for individuals with upper-level spinal cord injuries” by H. Jiang and J. Wachs, an innovative real-time technique is presented for tracking hand movements in the context of a human-robot interface. Such an interface is meant to allow individuals with upper-level spinal cord injuries to conduct “hands-on” laboratory tasks.

Tracking is accomplished by a 3D particle filter framework based on color and depth information. Spatial and motion information are integrated into the particle filter framework to tackle the “false merge” and “false labeling” problems that arise during hand interaction and occlusion. Once the hands are tracked successfully, the resulting hand gesture trajectories are classified with pre-trained motion models.

In the paper by J. Ahmed et al. entitled “Stabilized active camera tracking system”, the authors propose a real-time camera tracking system based on the combination of visual tracking, pan-tilt control, and digital video stabilization algorithms. It exploits the coordinates of the target, computed by the tracker module, to estimate the amount of vibration and then filters it out of the video when the system is mounted on a vibrating platform (e.g., a vehicle or helicopter).

The paper by De Cristoforis et al. entitled “Real-time monocular image based path detection” brings pattern recognition techniques to robotic navigation. The authors developed a real-time, image-based monocular path detection method that does not require camera calibration and works on semi-structured outdoor paths. Segmentation is achieved by classifying super-pixels into regions used to infer a contour of navigable space.

Image segmentation is implemented on a low-power embedded GPU, which delivers the computing power necessary for on-board execution in mobile robots.

Within the context of robotics, the paper “Accelerating embedded image processing for real time: a case study” by S. Padre et al. proposes a methodology to achieve real-time embedded solutions using hardware acceleration in FPGA-based chips.

The methodology is applied to a novel algorithm for multiple robot localization in global vision systems, developed to work reliably 24/7 and to detect the robots’ positions and headings even in the presence of partial occlusions and varying lighting conditions. The model is expressed through object-oriented design in UML and implemented in C++ using multithreaded programming.

The paper by Fernandez et al. entitled “Performance of dynamic texture segmentation using GPU” presents CPU and GPU implementations of a motion segmentation algorithm. This algorithm relies on dynamic texture segmentation under the mixture of dynamic textures (MDT) model.

The work discusses how matrix inversion, which is part of the MDT algorithm, can be optimized by porting it to the GPU, and how real-time motion segmentation performance is affected by separating the learning part of the algorithm from the segmentation part.

The paper “Fast mode decision algorithm for H.264/SVC enhancement layer” by A. Kessentini et al. presents a fast mode decision algorithm to speed up the SVC encoder. Scalable video coding (SVC) is an extension of the H.264 advanced video coding (AVC) standard, which builds on an exhaustive mode decision process based on the base layer mode predictions.

The fast mode decision algorithm presented for intra- and inter-frame coding in SVC relies on statistical observations of the mode distribution and the correlation between the base layer (BL) and its enhancement layers (ELs). Two versions are proposed to balance the reduction in encoding time against a tolerable loss in video quality.

The last two papers present applications in the medical domain and compression technology.

In the paper “A novel medical image compression using Ripplet transform” by S. Juliet et al., a compression method is presented for medical images using the Ripplet transform. The main feature of the method discussed in the paper is that it yields high-quality compressed images by using a multi-scale, multi-orientation approach.

The innovation of the method discussed is the use of the Ripplet transform with anisotropy to represent singularities along arbitrarily shaped curves. This method was experimentally compared with conventional and state-of-the-art compression methods in terms of signal-to-noise ratio and compression ratio.

In the last paper, “Fast retinal vessel analysis” by M. Krause et al., a fast image processing system for the analysis of digital databases of retinal images is presented. In this method, retinal blood vessels are enhanced via convolution with the second derivative of the local Radon kernel.

Preprocessing steps such as smoothing, contrast enhancement, connected-component analysis and skeletonization are exploited to obtain a higher-level representation of the vessel tree. Speed-up is achieved by implementing these steps on a GPU using CUDA, leading to a very fast system for in situ ocular examination.

The guest editors would like to thank the authors, the reviewers, the editorial staff at Springer and the editors-in-chief of the Journal of Real-Time Image Processing for making this SI a reality. We hope that this SI will be of long-lasting value to the image processing and pattern recognition community.