Embedded systems with multi-core designs have become increasingly important for signal processing in recent years. While multi-core design can significantly improve the efficiency of signal processing systems, several critical hardware and software issues must be addressed carefully for each application. Among the most important are hardware architecture design, software tools, programming models, and algorithm parallelization for embedded multi-core computing. Optimizing a system with respect to these issues usually depends closely on the targeted application domain. This special issue focuses on the latest developments and technical solutions in multi-core embedded computing for signal processing, from both the hardware and software perspectives.

This special issue consists of six papers on multi-core embedded computing for signal processing systems. They fall into three categories: hardware architecture design, system and programming tools, and application-specific algorithm parallelization. In the first category, the paper “EVE: A Flexible SIMD Coprocessor for Embedded Vision Applications”, by Sankaran et al., addresses hardware architecture design. The authors present a flexible SIMD architecture optimized for embedded vision applications. The proposed embedded vision co-processor is efficient in power and area, and it can accelerate many low-level and mid-level vision tasks.

Three papers in this special issue concern system and programming tools for multi-core embedded computing. In “C++ Support and Applications for Embedded Multicore DSP Systems”, Kuan et al. propose a layered design that provides code-size-aware C++ library support. This work adds C++ programming support on top of the low-level programming APIs used to exploit DSPs, SIMD instructions, and DMAs on embedded multicore systems. The authors evaluate their C++ support on image blurring and JPEG compression tasks and show significant computational speed-up.

The paper “Message-Passing Programming for Embedded Multicore Signal-Processing Platforms”, by Hung et al., presents a light-weight MPI-like message-passing library with a three-layer modular design. The library handles inter-core communication on several popular embedded multi-core signal-processing platforms, with the goal of offering a standard, portable, and efficient message-passing interface across them.

The third paper in the category of system and programming tools, “Design Issues in a Performance Monitor for Multi-core Embedded Systems”, by Lin et al., presents a multi-core performance monitor and evaluates the effects of monitoring overhead on different types of tasks, including CPU-bound and IO-bound tasks. The paper proposes an adaptive performance monitoring mechanism that reduces the impact of monitoring overhead on the application without sacrificing the accuracy or immediacy of the monitored information. The authors show experimental results with different monitoring periods for a digital recording system.

Two papers address the parallelization of signal processing algorithms on embedded multi-core systems. The first, “Automatic Facial Expression Exaggeration System with Parallelized Implementation on a Multi-Core Embedded Computing”, by Su et al., proposes parallelized algorithms on a multi-core embedded system for an automatic facial expression exaggeration system, which comprises face detection, facial expression recognition, and facial expression exaggeration components. The authors show significant speed-up from running the proposed algorithms on a multi-core embedded system. This system, which contains an ARM9 processor and two PACDSP cores, is designed for high performance and energy efficiency in multimedia processing.

The last paper in this special issue, “Parallelization of Connected-Component Labeling on TILE64 Many-Core Platform”, by Chen et al., proposes a parallel linear-time two-scan algorithm for labeling connected components in gray-level images on a TILE64 many-core computing platform. Different parallelization schemes are developed for the two scans of the connected-component labeling algorithm. The authors achieve about a tenfold computational speed-up when 32 processor units are activated in the TILE64 many-core system.
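
For readers unfamiliar with two-scan labeling, the following is a minimal sequential sketch of the general technique, not the parallel scheme of Chen et al. It assumes 4-connectivity and treats neighboring pixels as connected when their gray values are equal. The function name `label_components` and the union-find bookkeeping are illustrative choices rather than details taken from the paper.

```python
def label_components(image):
    """Two-scan labeling with 4-connectivity (illustrative, sequential).

    Neighboring pixels belong to the same component when their gray
    values are equal. Returns a grid of labels numbered from 1.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    parent = [0]  # parent[i] is the union-find parent of label i; index 0 is unused

    def find(x):
        # Follow parent links to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        # Record that two provisional labels name the same component.
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    # First scan: assign provisional labels and record equivalences.
    for r in range(rows):
        for c in range(cols):
            value = image[r][c]
            up = labels[r - 1][c] if r > 0 and image[r - 1][c] == value else 0
            left = labels[r][c - 1] if c > 0 and image[r][c - 1] == value else 0
            if up and left:
                union(up, left)
                labels[r][c] = min(up, left)
            elif up or left:
                labels[r][c] = up or left
            else:
                parent.append(len(parent))  # new label whose parent is itself
                labels[r][c] = len(parent) - 1

    # Second scan: replace provisional labels with consecutive final labels.
    final = {}
    for r in range(rows):
        for c in range(cols):
            root = find(labels[r][c])
            labels[r][c] = final.setdefault(root, len(final) + 1)
    return labels
```

In this sketch the first scan only looks at the pixels above and to the left, so a U-shaped region receives two provisional labels that the second scan merges into one. A parallel version has to split both scans across cores and reconcile labels at the boundaries between the image regions handled by different cores, which is the kind of problem the parallelization schemes in the paper address.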

Together, the six papers in this special issue cover different aspects of multi-core embedded computing for signal processing. They present state-of-the-art work on hardware architecture design, software development tools, and algorithm parallelization for embedded multi-core signal processing systems.