Compressing Audio Signals with Inpainting-Based Sparsification

  • Conference paper
Scale Space and Variational Methods in Computer Vision (SSVM 2019)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11603)

Abstract

Inpainting techniques are becoming increasingly important for lossy image compression. In this paper, we investigate whether successful ideas from inpainting-based image codecs can be transferred to lossy audio compression. To this end, we propose a framework that creates a sparse representation of the audio signal directly in the sample domain. We select samples with a greedy sparsification approach and store the optimised data with entropy coding. Decoding restores the missing samples with well-known 1-D interpolation techniques. Our evaluation on stereo music pieces suggests that the lossy compression of our proof-of-concept framework is quantitatively competitive with transform-based audio codecs such as MP3, AAC, and Vorbis.
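
The encode/decode idea in the abstract (keep a sparse set of samples, then interpolate the rest) can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: linear interpolation stands in for the 1-D inpainting step, the greedy rule exhaustively removes the sample whose loss raises the mean squared error the least, and the entropy coding stage is omitted. The function names `reconstruct` and `greedy_sparsify` are hypothetical.

```python
import numpy as np


def reconstruct(positions, values, length):
    """Fill in missing samples by 1-D linear interpolation between kept samples."""
    return np.interp(np.arange(length), positions, values)


def greedy_sparsify(signal, keep):
    """Drop samples one at a time, always removing the one that hurts least.

    Hypothetical helper: returns the sorted indices of the `keep` samples that
    remain. The two endpoints are never removed, so every position stays
    bracketed by known samples.
    """
    signal = np.asarray(signal, dtype=float)
    n = len(signal)
    if not 2 <= keep <= n:
        raise ValueError("keep must be between 2 and len(signal)")
    kept = list(range(n))
    while len(kept) > keep:
        best_err, best_i = None, None
        for i in range(1, len(kept) - 1):  # skip both endpoints
            trial = kept[:i] + kept[i + 1:]
            rec = reconstruct(trial, signal[trial], n)
            err = float(np.mean((rec - signal) ** 2))
            if best_err is None or err < best_err:
                best_err, best_i = err, i
        del kept[best_i]
    return np.array(kept)


# Toy example: two periods of a sine wave, reduced from 64 to 8 samples.
t = np.linspace(0.0, 1.0, 64)
signal = np.sin(2 * np.pi * 2 * t)
mask = greedy_sparsify(signal, keep=8)
restored = reconstruct(mask, signal[mask], len(signal))
mse = float(np.mean((restored - signal) ** 2))
```

The exhaustive search costs a full reconstruction per candidate and is only practical for short toy signals; it is meant to show the structure of the idea, not a usable codec.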

Notes

  1. https://www.mia.uni-saarland.de/Publications/peter-ssvm19-supplement.zip

Acknowledgements

We thank Jan Østergaard (Aalborg University) for the valuable discussions that allowed us to improve our work. This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement no. 741215, ERC Advanced Grant INCOVID).

Author information

Corresponding author

Correspondence to Pascal Peter.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Peter, P., Contelly, J., Weickert, J. (2019). Compressing Audio Signals with Inpainting-Based Sparsification. In: Lellmann, J., Burger, M., Modersitzki, J. (eds) Scale Space and Variational Methods in Computer Vision. SSVM 2019. Lecture Notes in Computer Science, vol 11603. Springer, Cham. https://doi.org/10.1007/978-3-030-22368-7_8

  • DOI: https://doi.org/10.1007/978-3-030-22368-7_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-22367-0

  • Online ISBN: 978-3-030-22368-7

  • eBook Packages: Computer Science, Computer Science (R0)
