The performance requirements of image processing applications have continuously increased, especially when they must be met under real-time constraints. We have organized this special issue on parallel computing for real-time image processing to present the current state of the art in parallel programming and the future trends in real-time image and video processing as they relate to parallel computing, as well as the real-time implementation of embedded image processing applications on parallel architectures, including multi-core platforms, GPUs, and dedicated parallel architectures.

Due to the overwhelming number and wide scope of the submissions received for this special issue, and the resulting difficulty of finding expert reviewers, we have decided to publish the special issue in two or possibly three parts so as to meet its planned publication date. The papers currently under review will appear in a second or third part immediately after their reviews and re-reviews are concluded. We are very grateful to the reviewers who provided valuable comments and suggestions to improve the quality of the accepted papers.

This first part of the special issue on parallel computing for real-time image processing presents articles addressing GPU/multi-GPU programming as well as parallel architectures based on FPGA and/or CMOS VLSI technologies for real-time image processing applications. Six papers appear in this first part; brief outlines of these papers are given below.

The first paper by Anis Rahman, Dominique Houzet, Denis Pellerin, Sophie Marat and Nathalie Guyader describes a multi-GPU parallel implementation of a spatio-temporal visual saliency model that reaches real-time throughput. The article focuses on the algorithms of this model and details several parallel optimizations.

The second paper by Fernanda Palhano, Guillermo Andrade-Barroso and Pierre Hellier presents a method for real-time denoising of ultrasound images, based on a modified version of the NL-means method that incorporates an ultrasound-dedicated noise model, together with a GPU implementation of this algorithm. Results demonstrate that the proposed method performs well in terms of both denoising quality and real-time performance.
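For readers unfamiliar with NL-means, the classical formulation (Buades et al.) estimates each denoised pixel as a weighted average of all pixels, with weights given by the similarity of the surrounding patches. The sketch below is the standard formula, not the authors' ultrasound-specific variant, which adapts the similarity measure to an ultrasound noise model:

\[
\mathrm{NL}(v)(i) = \sum_{j} w(i,j)\, v(j), \qquad
w(i,j) = \frac{1}{Z(i)} \exp\!\left(-\frac{\| v(\mathcal{N}_i) - v(\mathcal{N}_j) \|_{2,a}^{2}}{h^{2}}\right),
\]

where \(v\) is the noisy image, \(v(\mathcal{N}_i)\) denotes the patch centered at pixel \(i\), \(h\) controls the filtering strength, and \(Z(i)\) is a normalizing constant.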

The third paper by Harald Jordan, Walter van Dyck and Rene Smodic describes a new approach to a contour-tracking algorithm targeting a low-power smart camera for industrial inspection. The embedded system consists of three major components: a CMOS sensor, an FPGA and a microprocessor.

The fourth paper by Mathieu Thevenin, Michel Paindavoine, Laurent Letellier, Renaud Schmit, and Barthelemy Heyrman describes the eISP, a programmable processing architecture that provides sufficient computational power for 1080p HD video while keeping silicon area and power consumption suitable for the next generation of mobile phones (less than 1 mm² and 500 mW in TSMC 65 nm technology).

The fifth paper by Loic Sieler, Lionel Damez, Alexis Landrault and Jean-Pierre Derutin presents the parallelization and embedding of a real-time image stabilization algorithm on a SoPC platform. The overall hardware implementation method is based on meeting the algorithm's processing power requirements and communication needs through refinement of a generic parallel architecture model. The paper presents both the software and hardware implementations, with performance results on a Xilinx SoPC target.

The sixth paper by Jérôme Gorin, Matthieu Wipliez, Françoise Prêteux and Mickael Raulet presents a scalable MPEG Reconfigurable Video Coding (RVC) decoder based on the low-level virtual machine (LLVM), a new mechanism that allows the design of a decoder capable of dynamically instantiating several RVC decoder descriptions. Unlike the static decoders generated by RVC tools, this decoder preserves the defining features of an RVC description, namely portability, scalability and reconfigurability.