DSD: document sparse-based denoising algorithm

  • T. H. Do
  • O. Ramos Terrades
  • S. Tabbone
Short Paper


In this paper, we present a sparse-based denoising algorithm for scanned documents. The method can be applied to any kind of scanned document with satisfactory results. Unlike other approaches, it encodes noisy documents through sparse representation and visual dictionary learning techniques, without any prior noise model. Moreover, we propose an estimator for the precision parameter. Experiments on several datasets demonstrate the robustness of the proposed approach compared with state-of-the-art document denoising methods.
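The sparse-coding idea mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the text here does not specify the solver, the dictionary, or the precision estimator, so the sketch assumes a generic Orthogonal Matching Pursuit with a residual tolerance. The function name `omp`, the tolerance `eps` (standing in for a precision parameter), and the toy dictionary `D` are all hypothetical.

```python
import numpy as np

def omp(D, y, eps, max_atoms=None):
    """Greedy Orthogonal Matching Pursuit (illustrative, assumed solver).

    Adds dictionary atoms (columns of D) one at a time until the
    residual norm ||y - D x||_2 drops to eps or below.
    """
    if max_atoms is None:
        max_atoms = D.shape[0]
    y = np.asarray(y, dtype=float)
    x = np.zeros(D.shape[1])
    support = []
    coef = np.zeros(0)
    residual = y.copy()
    while np.linalg.norm(residual) > eps and len(support) < max_atoms:
        # Pick the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break  # no new atom improves the fit
        support.append(k)
        # Re-fit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    if support:
        x[support] = coef
    return x

# Toy example: an identity dictionary and a hypothetical 4-pixel "patch".
D = np.eye(4)
patch = np.array([0.0, 3.0, 0.0, -2.0])
exact_code = omp(D, patch, eps=1e-9)   # tight tolerance: keeps both components
coarse_code = omp(D, patch, eps=2.5)   # loose tolerance: keeps only the largest
```

In this toy setting a larger `eps` leaves more of the signal in the residual, which is why the choice of tolerance matters for denoising: whatever the sparse code does not explain is treated as noise.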


Document denoising, Sparse representations, Sparse dictionary learning, Document degradation models



This work was partially supported by the European project SCANPLAN (A0806017L), the Spanish ConCORDIA Project (TIN2015-70924-C2-2-R) and the Vietnam National University, Hanoi (VNU) under project number QG.18.04.



Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2018

Authors and Affiliations

  1. Department Informatics, Faculty of Mathematics Mechanics Informatics, VNU University of Science, Hanoi, Vietnam
  2. Computer Vision Center, Computer Science Department, Engineering School, Universitat Autònoma de Barcelona, Bellaterra, Spain
  3. LORIA - UMR 7503, Université de Lorraine, Vandoeuvre-lès-Nancy, France
