The Implementation and Optimization of FFT Calculation Based on the MT-3000 Chip

Cheng, Dong; Li, Guilan; Song, Aochuang; Xu, Bangjian

doi:10.1007/978-981-97-0065-3_3

Dong Cheng⁸,
Guilan Li⁸,
Aochuang Song⁹ &
Bangjian Xu⁹

Part of the book series: Communications in Computer and Information Science (CCIS, volume 2036)

Included in the following conference series:

BenchCouncil International Symposium on Intelligent Computers, Algorithms, and Applications

Abstract

The FFT algorithm used in SAR imaging has been implemented and optimized on the MT-3000 accelerator chip. The optimizations comprise vectorization, MPI, DMA transfers with double buffering, and linear assembly. Experimental results show that, on this platform, the performance of the FFT function increased by more than 99.2% after optimization.
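
As a rough illustration of two ideas named in the abstract (the FFT itself and double-buffered data movement), the following minimal Python sketch may help. It is not the authors' implementation and uses no MT-3000 API: the function names `fft_radix2` and `fft_rows_double_buffered` are hypothetical, the radix-2 Cooley-Tukey variant is an assumption (the preview does not say which FFT variant the paper uses), and a plain list copy stands in for a DMA transfer that real hardware would run asynchronously.

```python
import cmath


def fft_radix2(x):
    """Iterative radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    a = list(x)

    # Reorder the input into bit-reversed index order.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]

    # Butterfly stages over blocks of size 2, 4, ..., n.
    size = 2
    while size <= n:
        half = size // 2
        w_step = cmath.exp(-2j * cmath.pi / size)
        for start in range(0, n, size):
            w = 1 + 0j
            for k in range(half):
                u = a[start + k]
                v = a[start + k + half] * w
                a[start + k] = u + v
                a[start + k + half] = u - v
                w *= w_step
        size *= 2
    return a


def fft_rows_double_buffered(rows):
    """Transform each row while alternating between two buffers.

    The list copy below is an assumption standing in for a DMA load.
    In this sketch it runs sequentially; on an accelerator the load of
    row i + 1 would overlap with the computation on row i.
    """
    if not rows:
        return []
    buffers = [None, None]
    buffers[0] = list(rows[0])  # prefetch the first row
    results = []
    for i in range(len(rows)):
        cur = i % 2
        nxt = 1 - cur
        if i + 1 < len(rows):
            buffers[nxt] = list(rows[i + 1])  # "DMA" the next row
        results.append(fft_radix2(buffers[cur]))  # compute on the current row
    return results
```

Calling `fft_rows_double_buffered` on a list of equal-length rows returns one spectrum per row. The two-buffer layout is the point of the sketch: because the next row always lands in the buffer that is not being computed on, a hardware transfer could proceed without waiting for the current computation to finish.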

Acknowledgements

We thank the National University of Defense Technology for its research support.

Author information

Authors and Affiliations

College of Electrical and Information Engineering, Hunan University, Changsha, China
Dong Cheng & Guilan Li
College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
Aochuang Song & Bangjian Xu

Corresponding author

Correspondence to Bangjian Xu.

Editor information

Editors and Affiliations

Université de Bourgogne, Dijon, France
Christophe Cruz
Victoria University, Melbourne, VIC, Australia
Yanchun Zhang
Chinese Academy of Sciences, Beijing, China
Wanling Gao

About this paper

Cite this paper

Cheng, D., Li, G., Song, A., Xu, B. (2024). The Implementation and Optimization of FFT Calculation Based on the MT-3000 Chip. In: Cruz, C., Zhang, Y., Gao, W. (eds) Intelligent Computers, Algorithms, and Applications. IC 2023. Communications in Computer and Information Science, vol 2036. Springer, Singapore. https://doi.org/10.1007/978-981-97-0065-3_3

DOI: https://doi.org/10.1007/978-981-97-0065-3_3
Published: 28 January 2024
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-0064-6
Online ISBN: 978-981-97-0065-3
eBook Packages: Computer Science, Computer Science (R0)
