Abstract
Inpainting techniques are becoming increasingly important for lossy image compression. In this paper, we investigate whether successful ideas from inpainting-based image codecs can be transferred to lossy audio compression. To this end, we propose a framework that creates a sparse representation of the audio signal directly in the sample domain. We select samples with a greedy sparsification approach and store the optimised data with entropy coding. Decoding restores the missing samples with well-known 1-D interpolation techniques. Our evaluation on stereo music pieces suggests that the lossy compression of our proof-of-concept framework is quantitatively competitive with transform-based audio codecs such as mp3, AAC, and Vorbis.
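To make the idea of sample-domain sparsification concrete, the following is a minimal sketch and not the codec evaluated in the paper. It assumes linear interpolation as the 1-D inpainting operator, mean squared error as the quality measure, and a simple "remove the least harmful sample" greedy rule. The names `interpolate`, `mse`, and `greedy_sparsify` are hypothetical, and entropy coding of the retained samples is omitted.

```python
def interpolate(length, known):
    """Reconstruct `length` samples by linear interpolation.

    `known` maps sample index -> value and must contain the first and
    last index (0 and length - 1) so that every gap is bracketed.
    """
    idx = sorted(known)
    out = [0.0] * length
    for left, right in zip(idx, idx[1:]):
        for i in range(left, right):
            t = (i - left) / (right - left)
            out[i] = (1 - t) * known[left] + t * known[right]
    out[idx[-1]] = known[idx[-1]]
    return out


def mse(a, b):
    """Mean squared error between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def greedy_sparsify(signal, keep):
    """Greedily discard samples until only `keep` remain (at least 2).

    In each round, every interior sample is tentatively removed, the
    signal is reconstructed from the rest, and the sample whose removal
    yields the smallest error is dropped for good. The cost grows
    quickly with the signal length, so this is for illustration only.
    """
    n = len(signal)
    known = {i: signal[i] for i in range(n)}
    while len(known) > max(keep, 2):
        best_i, best_err = None, float("inf")
        for i in sorted(known):
            if i in (0, n - 1):
                continue  # endpoints stay so interpolation is defined
            candidate = {k: v for k, v in known.items() if k != i}
            err = mse(signal, interpolate(n, candidate))
            if err < best_err:
                best_i, best_err = i, err
        del known[best_i]
    return known


# Usage: a triangle wave is piecewise linear, so three samples suffice.
signal = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0]
mask = greedy_sparsify(signal, keep=3)
print(sorted(mask))
```

On this toy signal the sketch retains the two endpoints and the peak (indices 0, 3, and 6), from which linear interpolation recovers the remaining samples. A practical encoder would additionally quantise and entropy-code the retained positions and values.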
Acknowledgements
We thank Jan Østergaard (Aalborg University) for the valuable discussions that allowed us to improve our work. This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement no. 741215, ERC Advanced Grant INCOVID).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Peter, P., Contelly, J., Weickert, J. (2019). Compressing Audio Signals with Inpainting-Based Sparsification. In: Lellmann, J., Burger, M., Modersitzki, J. (eds) Scale Space and Variational Methods in Computer Vision. SSVM 2019. Lecture Notes in Computer Science, vol 11603. Springer, Cham. https://doi.org/10.1007/978-3-030-22368-7_8
DOI: https://doi.org/10.1007/978-3-030-22368-7_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-22367-0
Online ISBN: 978-3-030-22368-7
eBook Packages: Computer Science (R0)