This issue of the Journal of Signal Processing Systems is devoted to recent innovations and developments in the field of leading-edge embedded signal processing systems. The eight articles cover various topics within this area, ranging from real-time image processing to heterogeneous computing to applications in the biosciences.

In “FPGA-Based Vision Processing System for Automatic Online Player Tracking in Indoor Sports”, Ibraheem et al. present a reconfigurable system that automatically tracks players in indoor sports without any user interaction. The system connects directly to two video streams and operates in real time. The FPGA-based implementation exhibits a 15-fold speedup over an efficient software version.

The authors of “3D Tomography Back-Projection Parallelization on Intel FPGAs Using OpenCL”, Mertelli et al., compare implementations of OpenCL programs for computed tomography on GPUs and FPGAs. They conclude that GPUs still have the advantage in this particular application, but that the rise of FPGAs as software-defined accelerators is promising, because high-level synthesis from OpenCL code allows software developers to combine the natural data or task parallelism of some algorithms with a well-understood GPU-like memory model.

With the slow demise of Moore’s law, heterogeneous hardware platforms have emerged as a potential architectural approach to improved performance and energy efficiency. However, their widespread adoption faces many technical obstacles. The paper “Heterogeneous Computing Utilizing FPGAs” by Reichenbach et al. describes an IP core library that greatly simplifies the inclusion of domain-specific FPGA acceleration into systems that comply with the standards issued by the Heterogeneous System Architecture Foundation. The capabilities of the proposed approach are illustrated by the development of a processor for low-level image algorithms.

The article by Lazcano et al., “Adaptation of an Iterative PCA to a Manycore Architecture for Hyperspectral Image Processing,” targets an architecture that supports massive parallelism. The authors study the acceleration of a non-linear iterative partial least squares algorithm on a manycore architecture with 256 cores. Their use case is drawn from the field of hyperspectral imaging, and their focus is on optimizing the communications within the algorithm to achieve a competitive solution with a good balance between performance and energy efficiency.

Multi-core systems-on-chip present important challenges to the design of safety-critical real-time systems with predictable temporal behavior. The article by Vass et al., “Application-Specific Tailoring of Multi-Core SoCs for Real-Time Systems with Diverse Predictability Demands,” contributes to this area by presenting a method to generate application-specific, deterministic multi-core architectures. The authors show how to significantly improve the temporal properties of tasks running on multi-core systems without compromising the overall performance for soft real-time scenarios.

Many embedded systems resort to fixed-point representations of numeric data, a fact that inevitably raises the problem of overflow. Kabi and Sahadevan address this type of issue in their paper “Range Analysis of Matrix Factorization Algorithms for an Overflow Free Fixed-point Design” by proposing a range estimation and scaling approach that preserves the properties of the analytical method. They successfully apply the approach to matrix factorization algorithms used for eigenvalue and singular value decomposition of matrices.

In their paper “An Architecture for Asymmetric Numeral Systems Entropy Decoder - A Comparison with a Canonical Huffman Decoder”, Najmabadi et al. present the first hardware implementation of an asymmetric numeral systems entropy decoder. The underlying compression method is used to compress data sent between computers in order to increase the effective transmission bandwidth. The architecture presented in this paper achieves a significant throughput improvement over hardware decoders for Huffman codes.

High-performance embedded systems have the potential to make a profound impact on many other fields. A striking example is provided by the work of Lieske et al., reported in the article “Embedded Fluorescence Lifetime Determination for High-Throughput, Low-Photon-Number Applications.” The authors show how to improve the implementation of time-resolved fluorescence analysis, an important method in biochemistry and biophysics. Their system, based on a low-cost FPGA, provides fluorescence lifetime estimates in real time, improving on both quality and throughput over previous approaches.

Collectively, these eight papers illustrate the diverse range of issues being addressed today in the design and development of leading-edge embedded systems for signal and image processing, while pointing out some of the challenges that still await us.