Abstract
This paper proposes a real-time progressive image compression algorithm with region of interest support for the ARM processor architecture. The algorithm is used in the design of an underwater image sensor for an autonomous underwater vehicle for intervention that operates under highly constrained bandwidth, allowing a more agile data exchange between the vehicle and the human operator supervising the underwater intervention. For high compression ratios (smaller output size), execution time is dominated by the transformation algorithm, whose share of the total shrinks progressively as the compression ratio gets smaller (larger output size). A novel progressive rate distortion-optimized image compression algorithm based on the discrete wavelet transform (DWT) is presented, with special emphasis on a novel minimal-time parallel DWT algorithm that achieves full memory bandwidth saturation using only a few cores of a modern multicore embedded processor. The paper focuses on a novel, efficient, in-place, multithreaded, and cache-friendly parallel 2-D wavelet transform algorithm, based on the lifting transform, for the ARM architecture. To maximize cache utilization and consequently minimize memory bus bandwidth use, the threads compete to work on a small memory area, maximizing the chances of finding the data in the cache. Their synchronization has very low overhead: it uses no locks and relies solely on the basic compare-and-swap atomic primitive. An implementation in the C programming language, with and without vector instructions (single instruction, multiple data), is provided for both single-threaded (serial) and multithreaded (parallel) single-loop DWT implementations, as well as for serial and parallel naive implementations using linear (row order) and strided (column order) memory access patterns for comparison.
Results show a significant improvement over the single-threaded optimized implementation and a much greater improvement over both the single-threaded and multithreaded naive implementations. The running time reaches a minimum that depends on the memory access pattern, the number of processor cores, and the available memory bus bandwidth; that is, the algorithm becomes memory bound while using the minimum number of memory accesses. Because the memory bus saturates, the in-place 2-D DWT can be executed in the same time as a 1-D DWT or an in-place memory block copy.
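The lock-free work sharing described in the abstract can be illustrated with a minimal sketch. It is not the paper's single-loop 2-D algorithm: it assumes the reversible 5/3 lifting filter (the abstract does not name the filter), transforms rows only, and uses hypothetical names (`row_queue`, `row_queue_claim`, `transform_rows`, `lift53_forward`). What it shows is how threads could claim small chunks of rows with nothing but a C11 compare-and-swap.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Floor division for b > 0 (C's '/' truncates toward zero). */
static int floor_div(int a, int b)
{
    int q = a / b;
    return (a % b < 0) ? q - 1 : q;
}

/* In-place forward 5/3 lifting on n samples (n even, n >= 2).
 * Assumption: 5/3 filter with symmetric boundary extension. */
void lift53_forward(int *x, size_t n)
{
    assert(n >= 2 && n % 2 == 0);
    /* Predict step: odd samples become detail coefficients. */
    for (size_t i = 1; i < n; i += 2) {
        int right = (i + 1 < n) ? x[i + 1] : x[i - 1];
        x[i] -= floor_div(x[i - 1] + right, 2);
    }
    /* Update step: even samples become approximation coefficients. */
    for (size_t i = 0; i < n; i += 2) {
        int left = (i > 0) ? x[i - 1] : x[i + 1];
        x[i] += floor_div(left + x[i + 1] + 2, 4);
    }
}

/* Hypothetical shared work descriptor: rows [0, total) of one image. */
typedef struct {
    atomic_size_t next; /* first row that has not been claimed yet */
    size_t total;       /* number of rows in the image */
} row_queue;

void row_queue_init(row_queue *q, size_t total)
{
    atomic_init(&q->next, 0);
    q->total = total;
}

/* Claim up to `chunk` consecutive rows without taking a lock.
 * Returns the number of rows claimed (0 when none are left) and
 * stores the index of the first claimed row in *first. */
size_t row_queue_claim(row_queue *q, size_t chunk, size_t *first)
{
    assert(chunk > 0);
    size_t cur = atomic_load(&q->next);
    while (cur < q->total) {
        size_t end = (cur + chunk < q->total) ? cur + chunk : q->total;
        /* On failure, cur is refreshed with the current value; retry. */
        if (atomic_compare_exchange_weak(&q->next, &cur, end)) {
            *first = cur;
            return end - cur;
        }
    }
    return 0;
}

/* Worker body: every thread would run this on the same queue. */
void transform_rows(row_queue *q, int *img, size_t cols,
                    size_t stride, size_t chunk)
{
    size_t first, count;
    while ((count = row_queue_claim(q, chunk, &first)) > 0)
        for (size_t r = first; r < first + count; r++)
            lift53_forward(img + r * stride, cols);
}
```

Each worker thread would call `transform_rows` with the same `row_queue`. A small `chunk` keeps the competing threads working on neighbouring rows, which is the cache-locality idea the abstract describes; the right chunk size for a given processor is something to measure rather than assume.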
Notes
Assuming a write-back cache policy.
Acknowledgements
This work was partly supported by the Spanish Ministry under grants DPI2014-57746-C3 (MERBO-TS Project) and DPI2017-86372-C3-1-R (TWINBOTS), by Universitat Jaume I grants P1-1B2015-68 (MASUMIA), E-2015-24, PREDOC/2012/47 and PREDOC/2013/46, by Generalitat Valenciana grants ACIF/2014/298 and PROMETEO/2016/066, and by the Brazilian agencies CNPq and FAP/DF.
Appendix: Vertical naive algorithm—cnaive
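The appendix listing is not reproduced here. As a stand-in, the following is a minimal sketch of what a naive vertical (column order) pass could look like. It is not the authors' `cnaive` code: the 5/3 lifting filter, the row-major layout, and the name `vertical_naive_53` are all assumptions made for illustration. The point is the strided access pattern mentioned in the abstract, where consecutive samples of a column lie `stride` elements apart in memory.

```c
#include <assert.h>
#include <stddef.h>

/* Floor division for b > 0 (C's '/' truncates toward zero). */
static int floor_div(int a, int b)
{
    int q = a / b;
    return (a % b < 0) ? q - 1 : q;
}

/* Naive vertical 5/3 lifting pass over a row-major image.
 * Assumptions: rows is even and >= 2, stride >= cols, symmetric
 * boundary extension. Every column is walked top to bottom, so each
 * access jumps `stride` ints ahead (strided, cache-unfriendly). */
void vertical_naive_53(int *img, size_t rows, size_t cols, size_t stride)
{
    assert(rows >= 2 && rows % 2 == 0 && stride >= cols);
    for (size_t c = 0; c < cols; c++) {
        int *col = img + c;
        /* Predict step: odd rows become detail coefficients. */
        for (size_t r = 1; r < rows; r += 2) {
            int below = (r + 1 < rows) ? col[(r + 1) * stride]
                                       : col[(r - 1) * stride];
            col[r * stride] -= floor_div(col[(r - 1) * stride] + below, 2);
        }
        /* Update step: even rows become approximation coefficients. */
        for (size_t r = 0; r < rows; r += 2) {
            int above = (r > 0) ? col[(r - 1) * stride]
                                : col[(r + 1) * stride];
            col[r * stride] += floor_div(above + col[(r + 1) * stride] + 2, 4);
        }
    }
}
```

Swapping the loop nest so that the inner loop runs along a row would turn the same arithmetic into a linear access pattern, which is the kind of difference the abstract's comparison of naive implementations measures.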
Cite this article
Rubino, E.M., Álvares, A.J., Marín, R. et al. Real-time rate distortion-optimized image compression with region of interest on the ARM architecture for underwater robotics applications. J Real-Time Image Proc 16, 193–225 (2019). https://doi.org/10.1007/s11554-018-0833-5
DOI: https://doi.org/10.1007/s11554-018-0833-5