RISC-V Based MPSoC Design Exploration for FPGAs: Area, Power and Performance

  • Muhammad Ali
  • Pedram Amini Rad
  • Diana Göhringer
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12083)

Abstract

Modern image processing applications, such as object detection and image segmentation, are both computationally and memory intensive. For ASIC- and FPGA-based architectures, hardware accelerators are a promising solution, but they lack flexibility and programmability. To meet the flexibility, computation, and memory demands of these applications in embedded systems, we propose a modular and flexible RISC-V based MPSoC architecture on the Xilinx Zynq UltraScale+ MPSoC. The proposed architecture can be ported to other Xilinx FPGAs. Two neural networks (LeNet-5 and the CIFAR-10 example) were used as test applications to evaluate the proposed MPSoC architectures. To increase performance and efficiency, several optimization techniques were applied on the MPSoC and their results were evaluated. 16-bit fixed-point parameters were used to reduce data size by 50%, and the algorithms were parallelized and mapped onto the proposed MPSoC to achieve higher performance. A 4x parallelization of a NN algorithm on the proposed MPSoC resulted in a 3.96x speedup and 3.61x lower energy consumption compared to a single soft-core processor setup.
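The 50% figure for 16-bit fixed-point parameters can be illustrated with a minimal sketch. It assumes the baseline is 32-bit floating point and uses a hypothetical Q8.8 split (8 integer bits, 8 fractional bits); the abstract does not state the actual format, and the helper names below are illustrative only, not taken from the paper.

```python
from array import array

# Assumption: 8 fractional bits (Q8.8). The paper's actual format is not given here.
FRAC_BITS = 8
INT16_MIN, INT16_MAX = -32768, 32767


def to_fixed16(x, frac_bits=FRAC_BITS):
    """Quantize a float to a signed 16-bit fixed-point integer, saturating on overflow."""
    scaled = round(x * (1 << frac_bits))
    return max(INT16_MIN, min(INT16_MAX, scaled))


def from_fixed16(q, frac_bits=FRAC_BITS):
    """Convert a 16-bit fixed-point integer back to a float."""
    return q / (1 << frac_bits)


# Hypothetical parameter values, chosen only to exercise the conversion.
weights = [0.5, -1.25, 3.14159, 100.0, -200.0]

as_float32 = array("f", weights)                            # 4 bytes per value
as_fixed16 = array("h", [to_fixed16(w) for w in weights])   # 2 bytes per value

bytes_float32 = as_float32.itemsize * len(as_float32)
bytes_fixed16 = as_fixed16.itemsize * len(as_fixed16)
size_ratio = bytes_fixed16 / bytes_float32

print(f"float32: {bytes_float32} bytes, fixed16: {bytes_fixed16} bytes, ratio: {size_ratio}")
```

Under these assumptions the storage halves regardless of the chosen fractional width; the split between integer and fractional bits only trades range against precision, and values outside the representable range saturate.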

Keywords

MPSoC, NoC, RISC-V, FPGA, SoC, Power estimation

Notes

Acknowledgments

This work has been funded partially by the German Federal Ministry of Education and Research BMBF as part of the PARIS project under grant agreement number 16ES0657 and partially by COllective Research NETworking (CORNET) project AITIA: Embedded AI Techniques for Industrial Applications. CORNET-AITIA is funded by the BMWi (Federal Ministry for Economic Affairs and Energy) under the IGF-project number: 249 EBG.

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Muhammad Ali (1)
  • Pedram Amini Rad (1)
  • Diana Göhringer (1)
  1. Technische Universität Dresden, Dresden, Germany
