Compressing Audio Signals with Inpainting-Based Sparsification

Peter, Pascal; Contelly, Jan; Weickert, Joachim

doi:10.1007/978-3-030-22368-7_8


Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11603)

Included in the following conference series:

International Conference on Scale Space and Variational Methods in Computer Vision

Abstract

Inpainting techniques are becoming increasingly important for lossy image compression. In this paper, we investigate whether successful ideas from inpainting-based image codecs can be transferred to lossy audio compression. To this end, we propose a framework that creates a sparse representation of the audio signal directly in the sample domain. We select samples with a greedy sparsification approach and store this optimised data with entropy coding. Decoding restores the missing samples with well-known 1-D interpolation techniques. Our evaluation on stereo music pieces suggests that the lossy compression of our proof-of-concept framework is quantitatively competitive with transform-based audio codecs such as MP3, AAC, and Vorbis.
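
The pipeline outlined in the abstract (greedy sample selection, then reconstruction by 1-D interpolation) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the one-sample-at-a-time removal rule, the mean squared error criterion, and the choice of linear interpolation are all assumptions made for illustration, and the entropy coding stage is omitted entirely.

```python
def interpolate(kept_idx, kept_val, n):
    """Rebuild n samples from the kept ones by linear interpolation.

    Assumes kept_idx is sorted and contains both endpoints 0 and n - 1.
    """
    out = [0.0] * n
    for k in range(len(kept_idx) - 1):
        i0, i1 = kept_idx[k], kept_idx[k + 1]
        v0, v1 = kept_val[k], kept_val[k + 1]
        for i in range(i0, i1 + 1):
            t = (i - i0) / (i1 - i0)
            out[i] = v0 + t * (v1 - v0)
    return out


def mse(a, b):
    """Mean squared error between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def greedy_sparsify(signal, keep):
    """Hypothetical greedy variant: repeatedly drop the sample whose
    removal increases the reconstruction error the least, until only
    `keep` samples remain. Endpoints are never removed."""
    n = len(signal)
    kept = list(range(n))
    while len(kept) > max(keep, 2):
        best_err, best_pos = None, None
        for pos in range(1, len(kept) - 1):
            trial = kept[:pos] + kept[pos + 1:]
            rec = interpolate(trial, [signal[i] for i in trial], n)
            err = mse(signal, rec)
            if best_err is None or err < best_err:
                best_err, best_pos = err, pos
        del kept[best_pos]
    return kept


# Toy "signal": a triangle wave that three samples describe exactly.
signal = [0, 1, 2, 3, 2, 1, 0]
kept = greedy_sparsify(signal, keep=3)
reconstruction = interpolate(kept, [signal[i] for i in kept], len(signal))
print(kept)  # → [0, 3, 6]
```

The exhaustive search in each step is far too slow for real audio and only serves to make the selection criterion explicit; a practical codec would need a much cheaper selection strategy and would also have to quantise and entropy-code the kept positions and values.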

Notes

1. https://www.mia.uni-saarland.de/Publications/peter-ssvm19-supplement.zip

Acknowledgements

We thank Jan Østergaard (Aalborg University) for the valuable discussions that allowed us to improve our work. This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement no. 741215, ERC Advanced Grant INCOVID).

Author information

Authors and Affiliations

Mathematical Image Analysis Group, Faculty of Mathematics and Computer Science, Saarland University, Campus E1.7, 66041, Saarbrücken, Germany
Pascal Peter, Jan Contelly & Joachim Weickert

Corresponding author

Correspondence to Pascal Peter.

Editor information

Editors and Affiliations

University of Lübeck, Lübeck, Germany
Jan Lellmann
University of Erlangen-Nuremberg (FAU), Erlangen, Germany
Martin Burger
University of Lübeck, Lübeck, Germany
Jan Modersitzki

About this paper

Cite this paper

Peter, P., Contelly, J., Weickert, J. (2019). Compressing Audio Signals with Inpainting-Based Sparsification. In: Lellmann, J., Burger, M., Modersitzki, J. (eds) Scale Space and Variational Methods in Computer Vision. SSVM 2019. Lecture Notes in Computer Science, vol 11603. Springer, Cham. https://doi.org/10.1007/978-3-030-22368-7_8

DOI: https://doi.org/10.1007/978-3-030-22368-7_8
Published: 05 June 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-22367-0
Online ISBN: 978-3-030-22368-7
eBook Packages: Computer Science (R0)
