Abstract
Digital predistortion (DPD) is a widely adopted baseband processing technique in current radio transmitters. While DPD can effectively suppress unwanted spurious spectrum emissions stemming from imperfections of analog RF and baseband electronics, it also introduces extra processing complexity and poses challenges for efficient and flexible implementation, especially in mobile cellular transmitters, which have limited computing power compared to basestations. In this paper, we present high-data-rate implementations of broadband DPD on modern embedded processors, such as mobile GPUs and multicore CPUs, that use emerging parallel computing techniques to exploit their computing resources. We further verify the suppression effect of DPD experimentally on real radio hardware platforms. Performance evaluation results of our DPD design demonstrate the high efficacy of modern general-purpose mobile processors in accelerating DPD processing for a mobile transmitter.
Acknowledgments
This work was supported by the US NSF under grants EECS-1408370, CNS-1265332, ECCS-1232274, and the Finnish Agency of Innovation, Tekes.
Cite this article
Li, K., Ghazi, A., Tarver, C. et al. Parallel Digital Predistortion Design on Mobile GPU and Embedded Multicore CPU for Mobile Transmitters. J Sign Process Syst 89, 417–430 (2017). https://doi.org/10.1007/s11265-017-1233-y
DOI: https://doi.org/10.1007/s11265-017-1233-y