Advances in IC devices and technologies drive the need for innovation in design and implementation techniques for signal processing systems. Emerging signal processing algorithms place large demands on processing complexity, which must be met under strict area and power limits. To meet these challenges, more efficient algorithms and architectures are needed.

This special issue contains a selection of papers on recent advances in the design and implementation of signal processing systems in the areas of wireless communications, phase-change memory, real-time vision, and error-correcting codes.

In GPU Acceleration of a Configurable N-Way MIMO Detector for Wireless Systems, Wu, Yin, Wang, Studer, and Cavallaro present an N-way MIMO detector that addresses the computational complexity of conventional detectors. Their implementations on practical GPUs highlight opportunities for parallelism, which in turn enable scalability in the detector algorithms, making energy and performance tradeoffs readily accessible. Despite this configurability, their implementations outperform typical GPU-based MIMO detectors.

In DIFFS: A Low Power, Multi-mode, Multi-standard Flexible Digital Front-end for Sensing in Future Cognitive Radios, Chiumento, Hollevoet, Pollin, Naessens, Dejonghe, and Van der Perre present the idea of a configurable digital front-end, capable of performing efficient spectrum sensing and synchronization to selectively trigger digital baseband functions in future cognitive radios. The authors propose a design that can support multiple modes and standards (LTE, WLAN, DVB-T), and they demonstrate an implementation in a custom IC in 65 nm CMOS.

In Energy-adaptive Pulse Amplitude Modulation for IR-UWB Communications under Renewable Energy, Zhao, Chen, and Wang present an energy-adaptive Pulse Amplitude Modulation technique for self-powered Impulse Radio Ultra-wideband (IR-UWB) communication systems. By jointly exploiting the wireless channel conditions and the non-deterministic characteristics of renewable energy sources, the proposed technique can effectively improve the communication system data rate and/or time coverage.

Yang, Emre, Xu, Chen, Cao, and Chakrabarti present a cost-effective solution for improving the reliability of multi-bit per cell phase change RAM (PRAM) in A Low Cost Multi-Tiered Approach To Improving The Reliability Of Multi-Level Cell PRAM. Error models are first developed to accurately capture hard and soft errors in multi-bit per cell PRAM. Based on these models, a variety of design techniques across the circuit, architecture, and system levels are developed to improve data storage reliability at minimal speed, silicon, and energy overhead. The effectiveness of the approach is demonstrated through comprehensive evaluations.

In Accelerating Multiresolution Gabor Feature Extraction for Real Time Vision Applications, Cho, Chandramoorthy, Irick, and Narayanan propose an accelerator architecture for Gabor feature extraction that maximizes the utilization of resources on FPGAs. With the increasing utility of Gabor filter banks in embedded vision applications, their design targets increased performance and power efficiency, achieving a 6× speedup and over an order of magnitude higher throughput per Watt compared to a GPU implementation for processing 2048 × 1526 images.

In Image Blending in a High Frame Rate FPGA-based Multi-Camera System, Popovic, Seyid, Akin, Cogal, Afshari, Schmid, and Leblebici present a real-time implementation of image blending for the reconstruction of images in a multi-camera panoptic system. The authors propose resource-efficient blending algorithms suitable for FPGA implementation. The implemented hardware achieves real-time frame rates and has the lowest power consumption when compared to other multi-camera systems.

In Clockless Stochastic Decoding of Low-Density Parity-Check Codes: Architecture and Simulation Model, Onizawa, Gross, Hanyu, and Gaudet present a high-throughput, low-error-floor LDPC decoding solution based upon clockless stochastic computation. They propose a decoder architecture that eliminates all global and local clock signals in order to achieve very high decoding throughput and reduce silicon cost. A timing model is presented to accurately verify the decoding BER performance, and results show that the clockless stochastic decoder can realize a lower error floor than conventional synchronous decoders.

In Efficient Error Control Decoder Architectures for Noncoherent Random Linear Network Coding, Lin, Xie, and Yan present architectures for hardware implementation of error-correction decoders for random linear network coding. They propose hardware-friendly factorization algorithms and implement serial and unfolded architectures. Synthesis results show that these architectures are more efficient than previous implementations for Koetter–Kschischang codes, and show the feasibility of implementing decoders for Mahdavifar–Vardy codes.

Cai and Zhang present check-node architectures for non-binary LDPC decoders in Efficient Check Node Processing Architectures for Non-binary LDPC Decoding Using Power Representation. They propose using the power representation of the field elements, which results in efficient architectures for check nodes, the most complex component of non-binary LDPC decoders. The area of a check node using the simplified min-sum algorithm, for a code over GF(32) synthesized in 90 nm CMOS, is 10 % of the area of previous designs.

We would like to thank the authors of the papers in this special issue for contributing their work, and the reviewers for their diligent work in the anonymous review process. We extend thanks to Courtney Clark for her help in putting together this special issue. We hope you will find the articles informative and that you will enjoy reading this special issue.